Using R, how can I find the line number of a specific phrase in a file without opening the file?

Time: 2023-01-14 19:26:55

I am new to R. I have a large (3.2 GB) text file containing two columns. The first column holds positions in the human genome sequence, and the second column holds a value for each position. I want to find the line numbers of specific positions in the first column and then read those lines into a table in R. I cannot import the whole file because of memory limits. Here is the R code I tried in order to get the line number of one specific position from the first column. The data file is called my.data.

con <- file("my.data",open="r");
grep("13108", con)

grep does not work.

I would appreciate it if someone could tell me the correct code.

1 solution

#1


4  

Try something along the lines of:

read.csv(pipe("grep 13108 my.data"), ...)

(Fill in the remaining parameters appropriately for your data.)
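As a rough sketch of how this could be extended to also return line numbers: the plain pattern `13108` matches any line containing those digits (for example `131080`), and `read.csv` alone does not report where the hits were. The version below is only one possible approach. It assumes a Unix-like system with `grep` on the PATH, a whitespace-separated two-column file with no header, and it uses a hypothetical helper name `find_position` that is not part of R.

```r
# Hypothetical helper (not part of base R). Assumes `grep` is available on
# the PATH and that the file has two whitespace-separated columns, no header.
find_position <- function(path, position) {
  # Avoid scientific notation for large genome positions (e.g. 1e+08).
  pos <- format(position, scientific = FALSE, trim = TRUE)

  # Anchor the pattern to the start of the line and require whitespace after
  # it, so 13108 does not match 131080 or 513108.
  pattern <- paste0("^", pos, "[[:space:]]")

  # grep -n prefixes every matching line with "<line number>:".
  cmd <- paste("grep -n -E", shQuote(pattern), shQuote(path))
  con <- pipe(cmd)
  on.exit(close(con))
  hits <- readLines(con)

  # No match: return an empty table with the same columns.
  if (length(hits) == 0) {
    return(data.frame(line = integer(0),
                      position = numeric(0),
                      value = numeric(0)))
  }

  # Split "<line number>:<original line>" into its two parts.
  line_no <- as.integer(sub(":.*$", "", hits))
  content <- sub("^[0-9]+:", "", hits)

  # Parse only the matching lines; the large file is never loaded into R.
  rows <- read.table(text = content, col.names = c("position", "value"))
  data.frame(line = line_no, rows)
}
```

With the file from the question, a call might look like `find_position("my.data", 13108)`. The scan itself is done by `grep` outside R, so only the matching lines reach R's memory. If the file uses a different delimiter, such as commas, both the pattern and the `read.table` call would need adjusting.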
