Using R, how do I find the line number of a specific phrase in a file without opening the file?

I am new to R. I have a large (3.2 GB) txt file containing two columns. The first column holds human genome sequence positions and the second column holds a value corresponding to each position. I want to find the line numbers for specific positions in the first column and then read those lines into a table in R. I cannot import the whole file because of memory limits. Here is the R code I tried in order to get the line number of one specific position from the first column of my data file, which is called my.data.

con <- file("my.data",open="r");
grep("13108", con)

grep does not work.

I would appreciate it if someone could tell me the correct code.

1 solution

#1

Try something along the lines of:

read.csv(pipe("grep 13108 my.data"), ...)

(fill in the parameters appropriately for your data)
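
If the line numbers themselves are needed, or a system grep is not available, one option is to scan the file in chunks from within R. R's grep() searches a character vector rather than a file connection, so the sketch below reads a block of lines at a time with readLines() and keeps a running line offset. The helper name find_position_lines, the tab separator, the chunk size and the column names are all assumptions made for illustration, not something taken from the question. (Where a shell is available, `grep -n` prefixes each matching line with its line number, which is another way to get the same information.)

```r
# Hypothetical helper: scan a large file chunk by chunk and return the
# matching lines with their line numbers, without loading the whole file.
# Assumes one record per line and a tab between the two columns.
find_position_lines <- function(path, positions, chunk_size = 100000L, sep = "\t") {
  positions <- as.character(positions)
  con <- file(path, open = "r")
  on.exit(close(con))
  hits <- list()
  offset <- 0  # number of lines already consumed (kept as double)
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break
    # Extract the first column and compare it exactly, so that
    # "13108" does not also match "131080".
    cut <- regexpr(sep, chunk, fixed = TRUE)
    first_col <- ifelse(cut > 0, substr(chunk, 1, cut - 1), chunk)
    idx <- which(first_col %in% positions)
    if (length(idx) > 0L) {
      hits[[length(hits) + 1L]] <- data.frame(
        line = offset + idx,
        text = chunk[idx],
        stringsAsFactors = FALSE
      )
    }
    offset <- offset + length(chunk)
  }
  if (length(hits) == 0L) {
    return(data.frame(line = numeric(0), text = character(0),
                      stringsAsFactors = FALSE))
  }
  do.call(rbind, hits)
}

# Small demonstration on a temporary file standing in for my.data.
tmp <- tempfile(fileext = ".txt")
writeLines(c("13107\t0.10", "13108\t0.25", "131080\t0.40", "20000\t0.55"), tmp)

res <- find_position_lines(tmp, c("13108", "20000"), chunk_size = 2L)
print(res$line)

# Turn the matching lines into a table (column names are assumptions).
tab <- read.table(text = res$text, sep = "\t",
                  col.names = c("position", "value"))
print(tab)
```

Matching on the whole first column rather than on a substring avoids the false hits that a plain `grep 13108` would give for positions such as 131080 or for values that happen to contain the same digits.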
