I have a data frame that may or may not have some particular columns present. I want to select columns using dplyr
if they do exist and, if not, just ignore that I tried to select them. Here's an example:
# Load libraries
library(dplyr)
# Create data frame
df <- data.frame(year = 2000:2010, foo = 0:10, bar = 10:20)
# Pull out some columns
df %>% select(year, contains("bar"))
# Result
# year bar
# 1 2000 10
# 2 2001 11
# 3 2002 12
# 4 2003 13
# 5 2004 14
# 6 2005 15
# 7 2006 16
# 8 2007 17
# 9 2008 18
# 10 2009 19
# 11 2010 20
# Try again for non-existent column
df %>% select(year, contains("boo"))
# Result
#data frame with 0 columns and 11 rows
In the latter case, I just want to return a data frame with the column year, since the column boo doesn't exist. Why do I get an empty data frame in the latter case, and what is a good way of avoiding this and achieving the desired result?
EDIT: Session info
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_0.5.0
loaded via a namespace (and not attached):
[1] lazyeval_0.2.0 magrittr_1.5 R6_2.2.0 assertthat_0.2.0 DBI_0.6-1 tools_3.3.3
[7] tibble_1.3.0 Rcpp_0.12.10
1 Solution
#1
In the devel version of dplyr
df %>%
select(year, contains("boo"))
# year
#1 2000
#2 2001
#3 2002
#4 2003
#5 2004
#6 2005
#7 2006
#8 2007
#9 2008
#10 2009
#11 2010
gives the expected output
Otherwise, one option is to use one_of:
df %>%
select(one_of("year", "boo"))
This gives a warning message if a requested column is not available.
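If the warning is unwanted, one possible workaround is to drop the missing names before selecting. The sketch below is only an illustration, not part of the original answer: the `wanted` and `present` variables are hypothetical names, and it assumes the requested columns are known as a character vector.

```r
library(dplyr)

df <- data.frame(year = 2000:2010, foo = 0:10, bar = 10:20)

# Columns we would like to keep; "boo" does not exist in df
wanted <- c("year", "boo")

# Keep only the requested names that are actually present
present <- intersect(wanted, names(df))

# one_of() now sees existing columns only, so there is nothing to warn about
result <- df %>% select(one_of(present))
names(result)
```

In newer releases (dplyr 1.0.0 and later, as far as I know), `select(any_of(wanted))` should give the same result without the intermediate step, since `any_of()` ignores names that are missing.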
Another option is matches:
df %>%
select(matches("year|boo"))
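Note that matches() takes a regular expression, so an unanchored pattern also picks up any column whose name merely contains "year" or "boo". The following sketch shows the difference using a hypothetical data frame `df2` with an extra `yearly_total` column (not part of the original question); anchoring the pattern with `^` and `$` is one way to restrict it to exact names.

```r
library(dplyr)

# Hypothetical data: "yearly_total" contains "year" but is not the column we want
df2 <- data.frame(year = 2000:2002, yearly_total = 1:3, bar = 10:12)

# Unanchored pattern: matches every name containing "year" or "boo"
loose <- df2 %>% select(matches("year|boo"))

# Anchored pattern: matches only names that are exactly "year" or "boo"
strict <- df2 %>% select(matches("^(year|boo)$"))

names(loose)
names(strict)
```

Here `loose` keeps both `year` and `yearly_total`, while `strict` keeps only `year`. Keep in mind that matches() ignores case by default.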