I have a data frame like X, where the column genes is a factor. I want to remove all the rows where the genes column is empty, so in table X I want to remove row 4. Is there a way to do this for a large data frame?
X
  names     values   genes
1     A  0.2876113  EEF1A1
2     B  0.6681894   GAPDH
3     C  0.1375420 SLC35E2
4     D -1.9063386
5     E -0.4949905   RPS28
Desired result:
X
  names     values   genes
1     A  0.2876113  EEF1A1
2     B  0.6681894   GAPDH
3     C  0.1375420 SLC35E2
5     E -0.4949905   RPS28
Thank you all!
2 solutions
#1
22
It's not completely obvious from your question what the empty values are, but you should be able to adapt the solution below (here I assume the 'empty' values are empty strings):
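# Indices of the rows whose genes entry is an empty string
toBeRemoved <- which(X$genes == "")
# Drop those rows by negative indexing
X <- X[-toBeRemoved, ]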
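For anyone who wants to try this directly, here is a minimal reproducible sketch. It rebuilds X from the values shown in the question and assumes, as above, that the missing entry in row 4 is an empty string (the question does not confirm this).

```r
# Rebuild the example data frame from the question.
# Assumption: the blank genes entry in row 4 is the empty string "".
X <- data.frame(
  names  = c("A", "B", "C", "D", "E"),
  values = c(0.2876113, 0.6681894, 0.1375420, -1.9063386, -0.4949905),
  genes  = factor(c("EEF1A1", "GAPDH", "SLC35E2", "", "RPS28"))
)

# Row indices whose genes entry is the empty string (row 4 here).
toBeRemoved <- which(X$genes == "")

# Drop those rows; the original row names 1, 2, 3, 5 are kept.
X <- X[-toBeRemoved, ]
print(X)
```

Comparing a factor with `""` works because R compares against the factor's labels, so no conversion to character is needed.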
#2
10
@Nick Sabbe provided a great answer, but it has one caveat:
Using -which(...) is a neat trick to (sometimes) speed up the subsetting operation when there are only a few elements to remove.
...But if there are no elements to remove, it fails!
So, if X$genes does not contain any empty strings, which() will return an empty integer vector. Negating that is still an empty vector. And X[integer(0), ] returns a data.frame with no rows!
toBeRemoved <- which(X$genes == "")
if (length(toBeRemoved) > 0) { # MUST check for 0-length
  X <- X[-toBeRemoved, ]
}
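To see the failure concretely, here is a small sketch on a made-up data frame Y (hypothetical, not from the question) that contains no empty gene names. It compares the unguarded and guarded versions.

```r
# Hypothetical data frame with no empty gene names at all.
Y <- data.frame(
  names = c("A", "B"),
  genes = factor(c("EEF1A1", "GAPDH"))
)

idx <- which(Y$genes == "")   # integer(0): nothing matched

# -integer(0) is still integer(0), so this selects no rows at all.
unguarded <- Y[-idx, ]

# Guarded version: only subset when there is something to remove.
guarded <- if (length(idx) > 0) Y[-idx, ] else Y

nrow(unguarded)  # 0
nrow(guarded)    # 2
```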
Or, if the speed gain isn't important, simply:
X <- X[X$genes != "", ]
Or, as @nullglob pointed out,
subset(X, genes != "")
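Two details may matter on real data, so here is a hedged sketch on a made-up data frame. First, if genes can also be NA, logical indexing with X$genes != "" yields NA for those rows and returns all-NA rows, whereas subset() silently drops them. Second, because genes is a factor, the empty level "" stays registered after the rows are gone until you drop it. Whether either applies to your data is an assumption.

```r
# Hypothetical data: one empty string and one NA in a factor column.
X <- data.frame(
  names = c("A", "B", "C", "D"),
  genes = factor(c("EEF1A1", "", NA, "RPS28"))
)

# Keep rows whose genes value is neither NA nor the empty string.
keep <- !is.na(X$genes) & X$genes != ""
X <- X[keep, ]

# The empty level "" is still listed on the factor at this point.
levels(X$genes)

# Remove unused levels so later tables and plots do not show "".
X$genes <- droplevels(X$genes)
levels(X$genes)
```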