12 R vs Python in txt and csv

A simple comparison, using R and Python, of a txt file and a csv file with the same content

Published

January 1, 2026

On 2025-12-26 we used a QuPath Groovy script to export the detections as a csv (comma-separated) file and a txt (tab-separated) file (see "QuPath: exporting detections as csv and txt files").

To be rigorous, we should still run a (simple) check that the csv and txt files really contain the same data.

We do this simple comparison in both R and Python, and along the way we also:

Compare two styles of R code.
Compare R code with Python code.

1. Simple comparison with R

# R v4.5.2

# Load packages
readr |> library()
# or
library(readr)

# ===== Read csv file =====
csv_file_path <- "raw_data/AsPC LZ #1 GEM  Ker488 FN 568 pN 647 _01.vsi - 20x_detections_trimmed.csv"

csv_file <- csv_file_path |> 
    read_csv(show_col_types = FALSE)
# or
csv_file <- read_csv(csv_file_path, show_col_types = FALSE)

# ===== Read txt file =====
txt_file_path <- "raw_data/AsPC LZ #1 GEM  Ker488 FN 568 pN 647 _01.vsi - 20x_detections_trimmed.txt"

txt_file <- txt_file_path |> 
    read_tsv(show_col_types = FALSE)
# or
txt_file <- read_tsv(txt_file_path, show_col_types = FALSE)

# ===== Some comparisons =====
# Compare dimensions
csv_file |> dim() # rows columns

[1]  10 102

# or
dim(csv_file)

[1]  10 102

txt_file |> dim()

[1]  10 102

# or
dim(txt_file)

[1]  10 102

(csv_file |> dim()) == (txt_file |> dim()) # check if they are the same

[1] TRUE TRUE

# or
dim(csv_file) == dim(txt_file)

[1] TRUE TRUE

# Compare column names
(csv_file |> names()) == (txt_file |> names()) # check if the column names are the same

  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

# or
names(csv_file) == names(txt_file)

  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

# Compare two specific positions/cells
csv_file[5, 10] == txt_file[5, 10] # check if the content at row 5 and column 10 is the same

     Nucleus: Area µm^2
[1,]               TRUE

csv_file[10, 30] == txt_file[10, 30]

     Nucleus: cytokeratin: Min
[1,]                      TRUE

2. Simple comparison with Python

# Python 3.13.9

# Load packages
import pandas as pd

# ===== Read csv file =====
csv_file_path = "raw_data/AsPC LZ #1 GEM  Ker488 FN 568 pN 647 _01.vsi - 20x_detections_trimmed.csv"

csv_file = pd.read_csv(csv_file_path)

# ===== Read txt file =====
txt_file_path = "raw_data/AsPC LZ #1 GEM  Ker488 FN 568 pN 647 _01.vsi - 20x_detections_trimmed.txt"

txt_file = pd.read_table(txt_file_path)

# ===== Some comparisons =====
# Compare dimensions
print(csv_file.shape)  # (rows, columns)

(10, 102)

print(txt_file.shape)

(10, 102)

csv_file.shape == txt_file.shape # check if they are the same

True

# Compare column names
csv_file.columns == txt_file.columns

array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True])

# Compare two specific positions/cells
csv_file.iloc[4, 9] == txt_file.iloc[4, 9] # check if the content at row 5 and column 10 is the same (.iloc is 0-based)

np.True_

csv_file.iloc[9, 29] == txt_file.iloc[9, 29] # row 10 and column 30, the same cell as in the R example

np.True_
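
The column-name comparison above prints one boolean per column, and cell positions are easy to get wrong because R counts from 1 while pandas `.iloc` counts from 0 (R's `[5, 10]` is `.iloc[4, 9]`). The sketch below shows one possible way to reduce the column check to a single boolean and to address the same cell in both frames. The tables, column names, and the `csv_demo`/`txt_demo` variables are made up for illustration and are not the QuPath data.

```python
import io

import pandas as pd

# Illustrative data only: the same tiny table written with two delimiters.
csv_text = "Name,Area,Min\na,1.5,0\nb,2.5,1\n"
txt_text = "Name\tArea\tMin\na\t1.5\t0\nb\t2.5\t1\n"

csv_demo = pd.read_csv(io.StringIO(csv_text))
txt_demo = pd.read_table(io.StringIO(txt_text))

# One boolean for all column names instead of an element-wise array
same_columns = csv_demo.columns.equals(txt_demo.columns)
print(same_columns)

# R's demo[2, 2] (1-based) is .iloc[1, 1] here (0-based):
# second row, second column ("Area")
same_cell = csv_demo.iloc[1, 1] == txt_demo.iloc[1, 1]
print(same_cell)
```

Using the same pair of indices on both sides of `==` is what makes the check meaningful; a mismatch in either index compares two different cells.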

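In both R and Python, the dimensions, the column names, and the spot-checked cells all match, which indicates that the csv and txt files have the same content.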
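
The checks above cover the dimensions, the column names, and two cells. If a check of every cell is wanted, pandas offers `DataFrame.equals` and `pandas.testing.assert_frame_equal`. The following is a minimal sketch on made-up data (the `csv_demo`/`txt_demo` tables and their columns are illustrative assumptions, not the QuPath export); the same two calls could be applied to the real `csv_file` and `txt_file`.

```python
import io

import pandas as pd
from pandas.testing import assert_frame_equal

# Illustrative data only: the same tiny table written with two delimiters,
# including one empty field that is read as a missing value (NaN).
csv_text = "Name,Area,Min\na,1.5,0\nb,,1\n"
txt_text = "Name\tArea\tMin\na\t1.5\t0\nb\t\t1\n"

csv_demo = pd.read_csv(io.StringIO(csv_text))
txt_demo = pd.read_table(io.StringIO(txt_text))

# True only if shape, dtypes and all values match;
# missing values in the same position count as equal
identical = csv_demo.equals(txt_demo)
print(identical)

# Raises an AssertionError describing the difference if the frames differ
assert_frame_equal(csv_demo, txt_demo)
```

Unlike a cell-by-cell `==`, both calls treat missing values in matching positions as equal, which matters for tables that contain empty fields.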
