Replace x with y using gsub() in R

I want to replace values in y with the corresponding values in x using gsub(), but not for every observation: y should be replaced only in the rows where x is keyword1 or keyword2.

My columns do not contain NA or missing values.

What I have:

x =c('this', 'is', 'some', 'keyword1', 'or', 'terms', 'keyword2')
y =c('SFP', 'VERB', 'ADP', 'NOUN', 'ADP', 'VERB', 'SFP')
toString(y)
toString(x)
df = cbind(x,y)
df = data.frame(df)
df
         x    y
1     this  SFP
2       is VERB
3     some  ADP
4 keyword1 NOUN
5       or  ADP
6    terms VERB
7 keyword2  SFP

What I need:

         x        y
1     this      SFP
2       is     VERB
3     some      ADP
4 keyword1 keyword1
5       or      ADP
6    terms     VERB
7 keyword2 keyword2

2 solutions

#1

You don't need gsub, because you don't want to replace the matched characters themselves. The following code replaces the elements of y with the keyword wherever grepl finds a match in column x.

keywords <- c("keyword1", "keyword2")
for (kw in keywords)
  df$y[grepl(kw, df$x)] <- kw

If you know that the matches will be exact, it is more natural to use:

for (kw in keywords)
  df$y[df$x == kw] <- kw
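
The difference between the two loops only shows up when a keyword is a substring of another value. A minimal sketch, assuming a made-up vector `x2` (the value `keyword10` is hypothetical and not part of the question's data):

```r
# Hypothetical vector: 'keyword10' contains 'keyword1' as a substring
x2 <- c("keyword1", "keyword10", "other")

# grepl() does pattern (substring) matching, so it also flags 'keyword10'
partial <- grepl("keyword1", x2)

# == compares whole strings, so only the exact keyword matches
exact <- x2 == "keyword1"

print(partial)
print(exact)
```

With data like the question's, where each keyword appears only as a complete value, both loops should give the same result.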

FYI, you can create the dataframe more easily:

x = c('this', 'is', 'some', 'keyword1', 'or', 'terms', 'keyword2')
y = c('SFP', 'VERB', 'ADP', 'NOUN', 'ADP', 'VERB', 'SFP')
df = data.frame(x, y, stringsAsFactors = FALSE)
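
Why stringsAsFactors = FALSE matters here: before R 4.0.0, data.frame() converted strings to factors by default, and a factor rejects values that are not among its levels. The sketch below builds a factor explicitly so that it behaves the same on any R version; `y_factor` and `y_char` are illustrative names, not from the question.

```r
# A factor and a plain character vector with the same starting values
y_factor <- factor(c("SFP", "NOUN", "ADP"))
y_char   <- c("SFP", "NOUN", "ADP")

# 'keyword1' is not an existing level: R stores NA and would normally
# emit an "invalid factor level" warning (suppressed here to keep output clean)
suppressWarnings(y_factor[2] <- "keyword1")

# A character vector accepts any string
y_char[2] <- "keyword1"

print(y_factor)
print(y_char)
```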

#2

As @Rich Scriven suggested, let's first have character columns:

df <- data.frame(x, y, stringsAsFactors = FALSE)

Then a couple of nice options would be:

z <- c("keyword1", "keyword2")
df$y[df$x %in% z] <- df$x[df$x %in% z]
# and
df$y <- ifelse(df$x %in% z, df$x, df$y)

gsub is not necessary here, as your matches seem to be exact. That is, you are not looking for your keywords somewhere inside a certain element of df$y.
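
For a quick check, here is a self-contained sketch that rebuilds the question's data and applies both options to separate copies. The names `df1`, `df2`, and `hit` are introduced only for this illustration.

```r
x <- c('this', 'is', 'some', 'keyword1', 'or', 'terms', 'keyword2')
y <- c('SFP', 'VERB', 'ADP', 'NOUN', 'ADP', 'VERB', 'SFP')
df <- data.frame(x, y, stringsAsFactors = FALSE)
z <- c("keyword1", "keyword2")

# Option 1: assign through a logical index
df1 <- df
hit <- df1$x %in% z          # TRUE for rows whose x is one of the keywords
df1$y[hit] <- df1$x[hit]

# Option 2: vectorised ifelse()
df2 <- df
df2$y <- ifelse(df2$x %in% z, df2$x, df2$y)

print(df1)
print(identical(df1$y, df2$y))
```

Computing the `%in%` test once and reusing it, as in option 1, avoids evaluating the same condition twice.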
