Using gsub to replace every string except certain ones

Time: 2021-08-18 16:53:32

I'm trying to replace the values in a column that do not match the pattern passed to gsub.

Data column:

library(tidyverse)

df <- structure(list(partij_kort = c("COMBGB", "VVD", "GL", "NIEUWEL", 
"CDA")), .Names = "partij_kort", row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

  partij_kort
  <chr>      
1 COMBGB     
2 VVD        
3 GL         
4 NIEUWEL    
5 CDA 

This code does the opposite of what I want:

df %>% mutate(new = gsub("VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL",
                         "something",
                         partij_kort))

  partij_kort new      
  <chr>       <chr>    
1 COMBGB      COMBGB   
2 VVD         something
3 GL          something
4 NIEUWEL     NIEUWEL  
5 CDA         something

I want every string that is not in that pattern (COMBGB and NIEUWEL) to be changed to "something".

But negating with the exclamation mark ! doesn't work with gsub (I use it a lot with grepl).

Desired outcome:

  partij_kort new      
  <chr>       <chr>    
1 COMBGB      something
2 VVD         VVD      
3 GL          GL       
4 NIEUWEL     something
5 CDA         CDA 

What's the best way to do this?

3 solutions

#1


1  

Actually, no regex is needed, imo:

library(dplyr)

exceptions <- c("VVD","GL","CDA","CU","D66","PVDA","CUSGP","SGP","PVDAGL")

df %>%
  mutate(new = if_else(!(partij_kort %in% exceptions), 
                       "something", 
                       partij_kort))

This yields

# A tibble: 5 x 2
  partij_kort new      
  <chr>       <chr>    
1 COMBGB      something
2 VVD         VVD      
3 GL          GL       
4 NIEUWEL     something
5 CDA         CDA      
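
If dplyr isn't available, the same idea can be sketched in base R. This is only a sketch: the vector `x` stands in for `df$partij_kort`, and it assumes exact, case-sensitive matching, which is what `%in%` does.

```r
# Stand-in for df$partij_kort (assumption: a plain character vector)
x <- c("COMBGB", "VVD", "GL", "NIEUWEL", "CDA")
exceptions <- c("VVD", "GL", "CDA", "CU", "D66", "PVDA", "CUSGP", "SGP", "PVDAGL")

# Keep values that appear in `exceptions`, replace everything else
new <- ifelse(x %in% exceptions, x, "something")
print(new)
```

Note that `%in%` treats NA as not matching, so a missing value would also become "something" here.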

#2


1  

You need to set perl = TRUE in gsub and use a regex with a negative lookahead to negate your selection.

library(tidyverse)

df <- structure(list(partij_kort = c("COMBGB", "VVD", "GL", "NIEUWEL", "CDA", "anything", "good" ,"bad","whtever")), 
                .Names = "partij_kort", 
                row.names = c(NA, -9L), 
                class = c("tbl_df", "tbl", "data.frame"))

df %>% mutate(new = gsub("^((?!(VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL)).)*$",
                         "something", partij_kort, perl = TRUE))


# A tibble: 9 x 2
  partij_kort new      
  <chr>       <chr>    
1 COMBGB      something
2 VVD         VVD      
3 GL          GL       
4 NIEUWEL     something
5 CDA         CDA      
6 anything    something
7 good        something
8 bad         something
9 whtever     something
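
To unpack the pattern: `(?!...)` is a negative lookahead, and `((?!...).)*` consumes a character only while none of the alternatives starts at that position. The regex therefore matches a whole string only if none of the party names occurs anywhere inside it, which makes it a substring test. If exact matches are what you want, one possible variant anchors the lookahead instead. The sketch below compares the two on a made-up vector `x`; the variable names are only illustrative.

```r
# Made-up test vector: "COMBGL" and "GLX" merely contain a party name
x <- c("COMBGB", "VVD", "COMBGL", "GLX")
parties <- "VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL"

# Substring version (same regex as above): strings containing any party
# name are left alone. sub() is enough because the pattern spans the
# whole string, so there is at most one match per element.
contains_none <- paste0("^((?!(", parties, ")).)*$")
res_substring <- sub(contains_none, "something", x, perl = TRUE)

# Exact version: only strings that are exactly a party name are left alone
not_exactly <- paste0("^(?!(?:", parties, ")$).*$")
res_exact <- sub(not_exactly, "something", x, perl = TRUE)

print(res_substring)
print(res_exact)
```

With the substring version "COMBGL" and "GLX" stay unchanged because they contain "GL"; with the exact version they are replaced.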

Thank You

#3


0  

You can also use replace with grepl like below:

library(tidyverse)
df %>% mutate(new = replace(partij_kort , !grepl("VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL",
                         partij_kort),"something"))


# A tibble: 5 x 2
#  partij_kort       new
#        <chr>     <chr>
#1      COMBGB something
#2         VVD       VVD
#3          GL        GL
#4     NIEUWEL something
#5         CDA       CDA
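
One caveat: with an unanchored pattern, grepl matches substrings, so a value such as "GLX" would be kept because it contains "GL". A minimal sketch of a stricter variant, assuming whole-string matches are what you want; the `exceptions` vector and the test values are only illustrative:

```r
exceptions <- c("VVD", "GL", "CDA", "CU", "D66", "PVDA", "CUSGP", "SGP", "PVDAGL")

# Build "^(VVD|GL|...)$" so that only whole-string matches count
pattern <- paste0("^(", paste(exceptions, collapse = "|"), ")$")

x <- c("COMBGB", "VVD", "GLX", "CDA")  # illustrative values
new <- replace(x, !grepl(pattern, x), "something")
print(new)
```

Pasting the names together like this works only because none of them contains a regex metacharacter; otherwise they would need escaping first.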
