R: Reading txt files quickly

Date: 2022-10-29 17:22:16

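I needed to read a txt file of more than 600 MB in R, and the usual read.table took too long. I remembered that the readr package had been fast for CSV files, and that its tidyverse siblings haven (SPSS, SAS) and readxl (Excel) had been similarly quick, so I gave readr a try.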

Straight to the code:


> library(readr)
> a1=Sys.time()
> BRCA_RNAseqGene<-read_delim("20151101-BRCA-RNAseqGene.txt", "\t", escape_double = FALSE, trim_ws = TRUE)
Parsed with column specification:
cols(
  .default = col_character()
)
See spec(...) for full column specifications.
|================================================================================| 100% 611 MB
Warning message:
Duplicated column names deduplicated: 'TCGA-A1-A0SB-01A-11R-A144-07' => 'TCGA-A1-A0SB-01A-11R-A144-07_1' [3], 'TCGA-A1-A0SB-01A-11R-A144-07' => 'TCGA-A1-A0SB-01A-11R-A144-07_2' [4], 'TCGA-A1-A0SD-01A-11R-A115-07' => 'TCGA-A1-A0SD-01A-11R-A115-07_1' [6], 'TCGA-A1-A0SD-01A-11R-A115-07' => 'TCGA-A1-A0SD-01A-11R-A115-07_2' [7], 'TCGA-A1-A0SE-01A-11R-A084-07' => 'TCGA-A1-A0SE-01A-11R-A084-07_1' [9], 'TCGA-A1-A0SE-01A-11R-A084-07' => 'TCGA-A1-A0SE-01A-11R-A084-07_2' [10], 'TCGA-A1-A0SF-01A-11R-A144-07' => 'TCGA-A1-A0SF-01A-11R-A144-07_1' [12], 'TCGA-A1-A0SF-01A-11R-A144-07' => 'TCGA-A1-A0SF-01A-11R-A144-07_2' [13], 'TCGA-A1-A0SG-01A-11R-A144-07' => 'TCGA-A1-A0SG-01A-11R-A144-07_1' [15], 'TCGA-A1-A0SG-01A-11R-A144-07' => 'TCGA-A1-A0SG-01A-11R-A144-07_2' [16], 'TCGA-A1-A0SH-01A-11R-A084-07' => 'TCGA-A1-A0SH-01A-11R-A084-07_1' [18], 'TCGA-A1-A0SH-01A-11R-A084-07' => 'TCGA-A1-A0SH-01A-11R-A084-07_2' [19], 'TCGA-A1-A0SI-01A-11R-A144-07' => 'TCGA-A1-A0SI-01A-11R-A144-07_1' [21], 'TCGA-A1-A0SI-01A-... <truncated>
> a2 =Sys.time()
> a2 -a1
Time difference of 43.15733 secs
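
The warning in the transcript shows readr renaming repeated column names by appending _1, _2, and so on. If you would rather control that renaming yourself, one option is to read the header line separately and pass the names in explicitly. The following is only a minimal sketch on a tiny made-up file: the sample IDs S1 and S2, the gene names, and the assumption that every column after the first is numeric are all hypothetical. The real file was parsed entirely as character, which suggests it contains non-numeric values that this sketch does not handle.

```r
library(readr)

# Hypothetical miniature file with repeated sample IDs in the header
# (a stand-in for illustration, not the real TCGA file).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "gene\tS1\tS1\tS1\tS2\tS2\tS2",
  "A1BG\t10\t0.1\t5\t20\t0.2\t6",
  "A2M\t30\t0.3\t7\t40\t0.4\t8"
), tmp)

# Read only the header line and make the names unique ourselves.
header <- strsplit(readLines(tmp, n = 1), "\t", fixed = TRUE)[[1]]
unique_names <- make.unique(header, sep = "_")

# Skip the original header row and supply names and column types explicitly.
# Assumption: every column except `gene` holds numbers.
dat <- read_delim(
  tmp,
  delim = "\t",
  skip = 1,
  col_names = unique_names,
  col_types = cols(gene = col_character(), .default = col_double())
)

print(names(dat))
```

Declaring col_types up front also avoids the type-guessing pass and the "Parsed with column specification" message.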

Reading the whole 611 MB file took under 44 seconds, which is very fast.
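
To check how read.table and read_delim compare on your own machine, a small benchmark along these lines may help. It is a rough sketch on a synthetic temporary file, not the 611 MB file from this post: the row and column counts, the column names, and the variable names are all made up, and the timings will vary with hardware, file size, and readr version.

```r
library(readr)

# Build a small synthetic tab-delimited file (hypothetical data).
set.seed(1)
n_rows <- 2000
n_cols <- 50
mat <- matrix(round(runif(n_rows * n_cols), 4), nrow = n_rows)
df <- data.frame(gene = paste0("gene", seq_len(n_rows)), mat)
tmp <- tempfile(fileext = ".txt")
write.table(df, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# Time base R's read.table.
t_base <- system.time(
  x_base <- read.table(tmp, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
)

# Time readr's read_delim with explicit column types.
t_readr <- system.time(
  x_readr <- read_delim(
    tmp,
    delim = "\t",
    col_types = cols(gene = col_character(), .default = col_double())
  )
)

# Elapsed wall-clock seconds for each reader.
print(t_base["elapsed"])
print(t_readr["elapsed"])
```

system.time wraps the call being measured, so it replaces the manual pair of Sys.time() calls used in the transcript. On a file this small the difference may be negligible; increase n_rows and n_cols to approximate a larger input.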