Even at the risk of this question being labeled a duplicate, I am going to ask, since none of the related questions I have checked solve my problem...
I have a labs vector and I want to find the elements that are exact matches to the 3 groups stored in a groups variable.
set.seed(1)
labs <- sample(c(rep('BC-89HX',3), rep('BC-89HX with 2% Puricare + 5% Merquat',3), rep('Own SH',4)), 10)
labs
groups <- c('BC-89HX','BC-89HX with 2% Puricare + 5% Merquat','Own SH')
I want to identify the "BC-89HX" group elements (not the "BC-89HX with 2% Puricare + 5% Merquat" ones)
grep(groups[1], labs, value=TRUE, fixed=TRUE) #finds more elements than the ones I need
grep(paste(groups[1],"$",sep=""), labs, value=TRUE, fixed=TRUE) #does not work
grep(paste("\\b",groups[1],"\\b",sep=""), labs, value=TRUE, fixed=TRUE) #does not work
Any help?
1 answer
The solution is to make sure that "BC-89HX" is the only text in the string. By pasting ^ and $ around the pattern, we anchor it to the start and end of the string.
grep(paste0("^", groups[1], "$"), labs, value=TRUE)
#[1] "BC-89HX" "BC-89HX" "BC-89HX"
In this case we cannot use fixed = TRUE, because ^ and $ are metacharacters that mark the start and end of the string. With fixed = TRUE they are treated as literal characters, which the elements of labs do not contain.
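One caveat with the anchored regex: groups[2] contains a +, which is itself a metacharacter, so the same paste0 call does not match that label as written. A minimal sketch of one way around it, assuming it is acceptable to switch to perl = TRUE so that \Q...\E can quote the label literally (labs is rebuilt here without sample() so the snippet is self-contained):

```r
groups <- c("BC-89HX", "BC-89HX with 2% Puricare + 5% Merquat", "Own SH")
labs <- rep(groups, times = c(3, 3, 4))

# Unquoted: " +" is read as "one or more spaces", so the label never matches
plain <- grep(paste0("^", groups[2], "$"), labs, value = TRUE)

# \Q...\E is PCRE syntax (hence perl = TRUE); everything between is literal,
# while ^ and $ outside the quoted part still act as anchors
quoted <- grep(paste0("^\\Q", groups[2], "\\E$"), labs, value = TRUE, perl = TRUE)

length(plain)
length(quoted)
```

This only holds as long as a label never contains the sequence \E itself.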
Another option is to use == or %in%, since we are comparing whole fixed strings rather than matching a substring within a string.
labs[labs == groups[1]]
#[1] "BC-89HX" "BC-89HX" "BC-89HX"
labs[labs == groups[2]]
#[1] "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat"
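The %in% variant is not shown above, so here is a small sketch of it, together with which() and match() for when positions or a group index are wanted instead of the values. The variable names in_first_two, idx and group_id are only illustrative, and labs is rebuilt without sample() to keep the snippet self-contained:

```r
groups <- c("BC-89HX", "BC-89HX with 2% Puricare + 5% Merquat", "Own SH")
labs <- rep(groups, times = c(3, 3, 4))

# Elements that exactly equal any of several groups
in_first_two <- labs[labs %in% groups[1:2]]

# Positions rather than values (what grep() returns without value = TRUE)
idx <- which(labs == groups[1])

# Index of the exactly matching group for every element of labs
group_id <- match(labs, groups)
table(group_id)
```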
Update
If we really want to use grep with fixed = TRUE, one way is to paste the same characters onto both the pattern and the strings, so that the literal ^ and $ act as delimiters, i.e.
labs[grep(paste0("^", groups[2], "$"), paste0("^", labs, "$"), fixed = TRUE) ]
#[1] "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat" "BC-89HX with 2% Puricare + 5% Merquat"
labs[grep(paste0("^", groups[1], "$"), paste0("^", labs, "$"), fixed = TRUE) ]
#[1] "BC-89HX" "BC-89HX" "BC-89HX"