I would like to select rows based on substrings of their names. For example, if I have the following data:
data <- structure(c(91, 92, 108, 104, 87, 91, 91, 97, 81, 98),
.Names = c("fee-", "fi", "fo-", "fum-", "foo-", "foo1234-", "123foo-",
"fum-", "fum-", "fum-"))
How do I select the rows matching 'foo'?
Using grep() doesn't work:
grep('foo', data)
It returns:
integer(0)
What am I doing wrong? Or is there a better way?
Thanks!
3 Answers
#1
27
You need to grep the names attribute of data, not its values.
For your example, use
> grep("foo",names(data))
[1] 5 6 7
> data[grep("foo",names(data))]
foo- foo1234- 123foo-
87 91 91
Another clean way to do this is to use a data frame.
> data <- data.frame(values=c(91, 92, 108, 104, 87, 91, 91, 97, 81, 98),
names = c("fee-", "fi", "fo-", "fum-", "foo-", "foo1234-", "123foo-",
"fum-", "fum-", "fum-"))
> data$values[grep("foo",data$names)]
[1] 87 91 91
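To see why the original call returns integer(0): grep() searches the values of its argument, coerced to character, and never looks at the names. The following is a minimal sketch, assuming the named vector from the question (called `x` here only so it does not clash with the data frame above). It also shows grepl(), which returns a logical vector, as an alternative way to index.

```r
# Named numeric vector from the question (renamed x for illustration)
x <- c("fee-" = 91, "fi" = 92, "fo-" = 108, "fum-" = 104, "foo-" = 87,
       "foo1234-" = 91, "123foo-" = 91, "fum-" = 97, "fum-" = 81, "fum-" = 98)

# grep() matches against the values coerced to character ("91", "92", ...),
# so the pattern "foo" finds nothing
no_match <- grep("foo", x)

# Matching against the names gives the positions of interest
idx <- grep("foo", names(x))

# grepl() returns a logical vector that can be used for indexing directly
keep <- grepl("foo", names(x))
x[keep]
```

Both `x[idx]` and `x[keep]` select the same elements here. The logical form can be convenient when the condition needs to be combined with others using `&` or `|`.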
#2
6
Use subset in combination with regular expressions:
subset(your_data, regexpr("foo", your_data$your_column_to_match) > 0)
If you only care about a dataset with a single column, I guess you do not need to specify a column name...
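As a concrete illustration of the subset() approach, here is a small sketch on a data frame built from the question's data. The data frame `df` and its column names `name` and `value` are assumptions made for this example, not part of the original answer. It also shows grepl() as an equivalent way to write the condition.

```r
# Hypothetical data frame mirroring the question's data; the column names
# `name` and `value` are illustrative assumptions
df <- data.frame(
  name  = c("fee-", "fi", "fo-", "fum-", "foo-", "foo1234-", "123foo-",
            "fum-", "fum-", "fum-"),
  value = c(91, 92, 108, 104, 87, 91, 91, 97, 81, 98),
  stringsAsFactors = FALSE
)

# regexpr() returns -1 where there is no match, so "> 0" keeps matching rows
by_regexpr <- subset(df, regexpr("foo", name) > 0)

# grepl() expresses the same condition directly as a logical vector
by_grepl <- subset(df, grepl("foo", name))
```

Inside subset(), column names can be used directly, so `name` refers to `df$name` without the `df$` prefix.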
Philip
#3
2
> grep("foo",names(data), value=T)
[1] "foo-" "foo1234-" "123foo-"
If value is TRUE, grep() returns the matching elements themselves instead of their indices.
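The relationship between the two return modes can be sketched as follows. The shortened vector `x` is an assumption made to keep the example small, and the anchored pattern at the end is an extra illustration, not something from the original answer.

```r
# Shortened, illustrative version of the question's named vector
x <- c("fee-" = 91, "fi" = 92, "foo-" = 87, "foo1234-" = 91, "123foo-" = 91)

idx   <- grep("foo", names(x))                 # positions of the matching names
found <- grep("foo", names(x), value = TRUE)   # the matching names themselves

# Indexing the names by position yields the same strings as value = TRUE
same <- identical(found, names(x)[idx])

# Anchoring the pattern with ^ keeps only names that start with "foo"
starts_with_foo <- grep("^foo", names(x), value = TRUE)
```

Indices are what you need for subsetting the data itself (`x[idx]`), while value = TRUE is handy when you only want to inspect which names matched.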