I want to replace values in y with the corresponding values in x using gsub(), but not for all observations. I only want to replace y in rows where x is keyword1 or keyword2.
My columns do not contain NA or missing values.
What I have:
x =c('this', 'is', 'some', 'keyword1', 'or', 'terms', 'keyword2')
y =c('SFP', 'VERB', 'ADP', 'NOUN', 'ADP', 'VERB', 'SFP')
toString(y)
toString(x)
df = cbind(x,y)
df = data.frame(df)
df
x y
1 this SFP
2 is VERB
3 some ADP
4 keyword1 NOUN
5 or ADP
6 terms VERB
7 keyword2 SFP
What I need:
x y
1 this SFP
2 is VERB
3 some ADP
4 keyword1 keyword1
5 or ADP
6 terms VERB
7 keyword2 keyword2
2 Answers
#1
You don't need gsub(), as you don't want to replace the matched characters themselves. The following code replaces the elements of y with the keyword wherever grepl() finds a match in column x.
keywords <- c("keyword1", "keyword2")
for (kw in keywords)
df$y[grepl(kw, df$x)] <- kw
If you know that the matches will be exact, it is more natural to use:
for (kw in keywords)
df$y[df$x == kw] <- kw
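The difference between the two loops only shows up when a keyword is a substring of another value. Here is a minimal sketch of that case; the value 'keyword10' is made up for illustration and is not part of the question's data.

```r
# Hypothetical data: 'keyword10' is added only to illustrate partial matching
x <- c("this", "keyword1", "keyword10", "keyword2")
keywords <- c("keyword1", "keyword2")

# grepl() treats each keyword as a regular expression and matches substrings,
# so 'keyword10' is also overwritten with 'keyword1'
y_pattern <- c("SFP", "NOUN", "NOUN", "SFP")
for (kw in keywords) {
  y_pattern[grepl(kw, x)] <- kw
}

# == compares whole strings, so 'keyword10' is left alone
y_exact <- c("SFP", "NOUN", "NOUN", "SFP")
for (kw in keywords) {
  y_exact[x == kw] <- kw
}

y_pattern  # "SFP" "keyword1" "keyword1" "keyword2"
y_exact    # "SFP" "keyword1" "NOUN"     "keyword2"
```

If substring matches are not wanted, the == version (or %in%) is the safer choice.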
FYI, you can create the dataframe more easily:
x = c('this', 'is', 'some', 'keyword1', 'or', 'terms', 'keyword2')
y = c('SFP', 'VERB', 'ADP', 'NOUN', 'ADP', 'VERB', 'SFP')
df = data.frame(x, y, stringsAsFactors = FALSE)
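The stringsAsFactors = FALSE argument matters for the replacement itself, not just for convenience. The sketch below uses a shortened, two-row version of the data and forces factor columns explicitly to show what goes wrong; before R 4.0.0 factors were the default for character data in data.frame(), which is presumably why the argument is spelled out here.

```r
x <- c("this", "keyword1")
y <- c("SFP", "NOUN")

# Factor columns: "keyword1" is not an existing level of y, so the assignment
# stores NA. Without suppressWarnings(), R warns "invalid factor level, NA generated".
df_factor <- data.frame(x, y, stringsAsFactors = TRUE)
suppressWarnings(df_factor$y[df_factor$x == "keyword1"] <- "keyword1")
is.na(df_factor$y[2])  # TRUE

# Character columns: the assignment works as intended
df_char <- data.frame(x, y, stringsAsFactors = FALSE)
df_char$y[df_char$x == "keyword1"] <- "keyword1"
df_char$y  # "SFP" "keyword1"
```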
#2
As @Rich Scriven suggested, let's first have character columns:
df <- data.frame(x, y, stringsAsFactors = FALSE)
Then a couple of nice options would be:
z <- c("keyword1", "keyword2")
df$y[df$x %in% z] <- df$x[df$x %in% z]
# and
df$y <- ifelse(df$x %in% z, df$x, df$y)
gsub() is not necessary here, as your matches seem to be exact. That is, you are not looking for your keywords somewhere inside a certain element of df$y.
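As a quick check, both options can be run on the question's data and compared with the requested output. This is only a sketch; the names df1, df2 and hit are introduced here for illustration.

```r
x <- c('this', 'is', 'some', 'keyword1', 'or', 'terms', 'keyword2')
y <- c('SFP', 'VERB', 'ADP', 'NOUN', 'ADP', 'VERB', 'SFP')
z <- c("keyword1", "keyword2")

# Option 1: subset assignment, computing the logical index once
df1 <- data.frame(x, y, stringsAsFactors = FALSE)
hit <- df1$x %in% z
df1$y[hit] <- df1$x[hit]

# Option 2: vectorised ifelse()
df2 <- data.frame(x, y, stringsAsFactors = FALSE)
df2$y <- ifelse(df2$x %in% z, df2$x, df2$y)

identical(df1$y, df2$y)  # TRUE
df1$y  # "SFP" "VERB" "ADP" "keyword1" "ADP" "VERB" "keyword2"
```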