I am new to R. I have a large (3.2 GB) text file containing two columns. The first column holds human genome sequence positions and the second column holds a value for each position. I want to find the line numbers for specific positions in the first column and then read those lines into a table in R. I cannot import the whole file because of memory limits. Here is the R code I tried in order to get the line number of one specific position from the first column of my data file. The data file is called my.data.
con <- file("my.data",open="r");
grep("13108", con)
grep does not work.
I would appreciate it if someone could tell me the correct code.
1 solution
#1
4
Try something along the lines of:
read.csv(pipe("grep 13108 my.data"), ...)
(fill in the parameters appropriately for your data)
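As a rough illustration of how the parameters might be filled in, here is a minimal sketch. It assumes a Unix-like system with grep on the PATH, a whitespace-delimited file with no header row, and numeric positions in the first column. The function name read_matching_lines and the column names line, position and value are hypothetical, chosen only for this example. It uses grep -n to recover the line numbers asked for in the question, and grep -w so that 13108 does not also match 113108.

```r
# Hypothetical helper, not part of any package.
# Assumes: Unix-like system with `grep` available, whitespace-delimited file,
# no header row, numeric positions in column 1.
read_matching_lines <- function(path, position) {
  # format(..., scientific = FALSE) avoids patterns such as "1e+05"
  pattern <- format(position, scientific = FALSE, trim = TRUE)

  # -n prefixes each hit with "<line number>:", -w matches whole words only
  cmd <- sprintf("grep -n -w %s %s", shQuote(pattern), shQuote(path))
  con <- pipe(cmd)
  on.exit(close(con))
  hits <- readLines(con)  # only the matching lines ever reach R

  if (length(hits) == 0) {
    return(data.frame(line = integer(0), position = numeric(0),
                      value = numeric(0)))
  }

  # Split "<line number>:<original line>" into its two parts
  line_no <- as.integer(sub(":.*$", "", hits))
  body <- sub("^[0-9]+:", "", hits)

  rows <- read.table(text = body, header = FALSE,
                     col.names = c("position", "value"))
  out <- data.frame(line = line_no, rows)

  # grep looks at the whole line, so drop hits that came from the value column
  out[out$position == position, , drop = FALSE]
}
```

It could then be called as read_matching_lines("my.data", 13108). Because grep does the scanning outside R, the 3.2 GB file is never loaded into memory; only the matching lines are parsed.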