R: gsub, pattern = vector, replace = vector

Time: 2021-10-25 03:00:40

As the title states, I am trying to use gsub with a vector for both the "pattern" and the "replacement". Currently, my code looks like this:

  names(x1) <- gsub("2110027599", "Inv1", names(x1)) #x1 is a data frame
  names(x1) <- gsub("2110025622", "Inv2", names(x1))
  names(x1) <- gsub("2110028045", "Inv3", names(x1))
  names(x1) <- gsub("2110034716", "Inv4", names(x1))
  names(x1) <- gsub("2110069349", "Inv5", names(x1))
  names(x1) <- gsub("2110023264", "Inv6", names(x1))

What I hope to do is something like this:

  a <- c("2110027599","2110025622","2110028045","2110034716", "2110069349", "2110023264")
  b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")
  names(x1) <- gsub(a,b,names(x1))

I'm guessing there is an apply function somewhere that can do this, but I am not very sure which one to use!

EDIT: names(x1) looks like this (There are many more columns, but I'm leaving them out):

> names(x1)
  [1] "2110023264A.Ms.Amp"        "2110023264A.Ms.Vol"        "2110023264A.Ms.Watt"       "2110023264A1.Ms.Amp"      
  [5] "2110023264A2.Ms.Amp"       "2110023264A3.Ms.Amp"       "2110023264A4.Ms.Amp"       "2110023264A5.Ms.Amp"      
  [9] "2110023264B.Ms.Amp"        "2110023264B.Ms.Vol"        "2110023264B.Ms.Watt"       "2110023264B1.Ms.Amp"      
 [13] "2110023264Error"           "2110023264E-Total"         "2110023264GridMs.Hz"       "2110023264GridMs.PhV.phsA"
 [17] "2110023264GridMs.PhV.phsB" "2110023264GridMs.PhV.phsC" "2110023264GridMs.TotPFPrc" "2110023264Inv.TmpLimStt"  
 [21] "2110023264InvCtl.Stt"      "2110023264Mode"            "2110023264Mt.TotOpTmh"     "2110023264Mt.TotTmh"      
 [25] "2110023264Op.EvtCntUsr"    "2110023264Op.EvtNo"        "2110023264Op.GriSwStt"     "2110023264Op.TmsRmg"      
 [29] "2110023264Pac"             "2110023264PlntCtl.Stt"     "2110023264Serial Number"   "2110025622A.Ms.Amp"       
 [33] "2110025622A.Ms.Vol"        "2110025622A.Ms.Watt"       "2110025622A1.Ms.Amp"       "2110025622A2.Ms.Amp"      
 [37] "2110025622A3.Ms.Amp"       "2110025622A4.Ms.Amp"       "2110025622A5.Ms.Amp"       "2110025622B.Ms.Amp"       
 [41] "2110025622B.Ms.Vol"        "2110025622B.Ms.Watt"       "2110025622B1.Ms.Amp"       "2110025622Error"          
 [45] "2110025622E-Total"         "2110025622GridMs.Hz"       "2110025622GridMs.PhV.phsA" "2110025622GridMs.PhV.phsB"

What I hope to get is this:

> names(x1)
  [1] "Inv6A.Ms.Amp"        "Inv6A.Ms.Vol"        "Inv6A.Ms.Watt"       "Inv6A1.Ms.Amp"       "Inv6A2.Ms.Amp"      
  [6] "Inv6A3.Ms.Amp"       "Inv6A4.Ms.Amp"       "Inv6A5.Ms.Amp"       "Inv6B.Ms.Amp"        "Inv6B.Ms.Vol"       
 [11] "Inv6B.Ms.Watt"       "Inv6B1.Ms.Amp"       "Inv6Error"           "Inv6E-Total"         "Inv6GridMs.Hz"      
 [16] "Inv6GridMs.PhV.phsA" "Inv6GridMs.PhV.phsB" "Inv6GridMs.PhV.phsC" "Inv6GridMs.TotPFPrc" "Inv6Inv.TmpLimStt"  
 [21] "Inv6InvCtl.Stt"      "Inv6Mode"            "Inv6Mt.TotOpTmh"     "Inv6Mt.TotTmh"       "Inv6Op.EvtCntUsr"   
 [26] "Inv6Op.EvtNo"        "Inv6Op.GriSwStt"     "Inv6Op.TmsRmg"       "Inv6Pac"             "Inv6PlntCtl.Stt"    
 [31] "Inv6Serial Number"   "Inv2A.Ms.Amp"        "Inv2A.Ms.Vol"        "Inv2A.Ms.Watt"       "Inv2A1.Ms.Amp"      
 [36] "Inv2A2.Ms.Amp"       "Inv2A3.Ms.Amp"       "Inv2A4.Ms.Amp"       "Inv2A5.Ms.Amp"       "Inv2B.Ms.Amp"       
 [41] "Inv2B.Ms.Vol"        "Inv2B.Ms.Watt"       "Inv2B1.Ms.Amp"       "Inv2Error"           "Inv2E-Total"        
 [46] "Inv2GridMs.Hz"       "Inv2GridMs.PhV.phsA" "Inv2GridMs.PhV.phsB" 

5 solutions

#1


21  

Lots of solutions already; here is one more:

The qdap package:

library(qdap)
names(x1) <- mgsub(a,b,names(x1))
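
If installing qdap is not an option, the same idea can be sketched in base R by applying each pattern/replacement pair in turn. The helper name multi_gsub and the three sample names are assumptions made for illustration and are not part of qdap; fixed = TRUE treats each pattern as a literal string.

```r
# Hypothetical helper (not from qdap): apply each pattern/replacement pair in sequence.
multi_gsub <- function(pattern, replacement, x) {
  stopifnot(length(pattern) == length(replacement))
  for (i in seq_along(pattern)) {
    # fixed = TRUE: match the pattern literally rather than as a regular expression
    x <- gsub(pattern[i], replacement[i], x, fixed = TRUE)
  }
  x
}

a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1", "Inv2", "Inv3", "Inv4", "Inv5", "Inv6")

# A small sample of the column names from the question
nm <- c("2110023264A.Ms.Amp", "2110023264Error", "2110025622A.Ms.Vol")
result <- multi_gsub(a, b, nm)
print(result)
```

Because the pairs are applied one after another, a later pattern could in principle match text produced by an earlier replacement. That cannot happen here, since none of the "InvN" replacements contains a 10-digit serial number.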

#2


9  

From the stringr documentation for str_replace_all: "If you want to apply multiple patterns and replacements to the same string, pass a named version to pattern."

Thus, using a, b, and names(x1) from above:

library(stringr)
names(b) <- a   # names are the patterns, values are the replacements
names(x1) <- str_replace_all(names(x1), b)

#3


8  

New Answer

If we can make another assumption, the following should work. The assumption this time is that you really want to replace the first 10 characters of each value in names(x1).

Here, I've stored names(x1) as a character vector named "X1". The solution uses substr to split each value in X1 into two parts, match to look up the correct replacement, and paste0 to put everything back together.

a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")

X1 <- names(x1)               # the original column names
X1pre <- substr(X1, 1, 10)    # 10-digit serial number prefix
X1post <- substring(X1, 11)   # everything after the prefix

paste0(b[match(X1pre, a)], X1post)
#  [1] "Inv6A.Ms.Amp"        "Inv6A.Ms.Vol"        "Inv6A.Ms.Watt"      
#  [4] "Inv6A1.Ms.Amp"       "Inv6A2.Ms.Amp"       "Inv6A3.Ms.Amp"      
#  [7] "Inv6A4.Ms.Amp"       "Inv6A5.Ms.Amp"       "Inv6B.Ms.Amp"       
# [10] "Inv6B.Ms.Vol"        "Inv6B.Ms.Watt"       "Inv6B1.Ms.Amp"      
# [13] "Inv6Error"           "Inv6E-Total"         "Inv6GridMs.Hz"      
# [16] "Inv6GridMs.PhV.phsA" "Inv6GridMs.PhV.phsB" "Inv6GridMs.PhV.phsC"
# [19] "Inv6GridMs.TotPFPrc" "Inv6Inv.TmpLimStt"   "Inv6InvCtl.Stt"     
# [22] "Inv6Mode"            "Inv6Mt.TotOpTmh"     "Inv6Mt.TotTmh"      
# [25] "Inv6Op.EvtCntUsr"    "Inv6Op.EvtNo"        "Inv6Op.GriSwStt"    
# [28] "Inv6Op.TmsRmg"       "Inv6Pac"             "Inv6PlntCtl.Stt"    
# [31] "Inv6Serial Number"   "Inv2A.Ms.Amp"        "Inv2A.Ms.Vol"       
# [34] "Inv2A.Ms.Watt"       "Inv2A1.Ms.Amp"       "Inv2A2.Ms.Amp"      
# [37] "Inv2A3.Ms.Amp"       "Inv2A4.Ms.Amp"       "Inv2A5.Ms.Amp"      
# [40] "Inv2B.Ms.Amp"        "Inv2B.Ms.Vol"        "Inv2B.Ms.Watt"      
# [43] "Inv2B1.Ms.Amp"       "Inv2Error"           "Inv2E-Total"        
# [46] "Inv2GridMs.Hz"       "Inv2GridMs.PhV.phsA" "Inv2GridMs.PhV.phsB"
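
One caveat with the match approach: a prefix that is not listed in a makes match return NA, and paste0 then produces a name that starts with the literal text "NA". A minimal sketch of a guarded version follows; the function name rename_prefix and the sample names are assumptions made for illustration.

```r
# Hypothetical helper: swap a fixed-width prefix via a lookup,
# leaving names with an unknown prefix unchanged.
rename_prefix <- function(x, from, to, width = 10) {
  pre  <- substr(x, 1, width)        # leading serial number
  post <- substring(x, width + 1)    # remainder of the name
  idx  <- match(pre, from)           # NA where the prefix is not in `from`
  ifelse(is.na(idx), x, paste0(to[idx], post))
}

a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1", "Inv2", "Inv3", "Inv4", "Inv5", "Inv6")

# Two names from the question plus two made-up names without a known prefix
nm <- c("2110023264Pac", "2110025622Mode", "9999999999Mode", "Timestamp")
result <- rename_prefix(nm, a, b)
print(result)
```

Names without a known serial number, such as a timestamp column, pass through untouched instead of being renamed to something like "NAMode".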

Old Answer

If we can assume that names(x1) is in the same order as the pattern and replacement vectors, and that it is basically a one-for-one replacement, you might be able to get away with just sapply.

Here's an example of that particular situation. First, our pattern and replacement vectors:

a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")

Now imagine names(x1) looks something like this:

X1 <- paste0("A2", a, seq_along(a))
X1
# [1] "A221100275991" "A221100256222" "A221100280453"
# [4] "A221100347164" "A221100693495" "A221100232646"

This is how we might use sapply if these assumptions are valid:

sapply(seq_along(a), function(x) gsub(a[x], b[x], X1[x]))
# [1] "A2Inv11" "A2Inv22" "A2Inv33" "A2Inv44" "A2Inv55" "A2Inv66"

#4


2  

Somehow names<- and match seems much more appropriate here...

names( x1 ) <- b[ match( names( x1 ) , a ) ]

But I am making the assumption that the elements of vector a are the actual names of your data.frame.

If a really is a pattern found within each of the names of x1, then this grepl approach with names<- could be useful. Note that it sets each name to the matching element of b alone, dropping the rest of the original name:

new <- sapply( a , grepl , x = names( x1 ) )
names( x1 ) <- b[ apply( new , 1 , which.max ) ]
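
The grepl idea can be extended to keep the rest of each name and to leave names with no match untouched (which.max returns 1 for a row of all FALSE, so an unmatched name would otherwise receive b[1]). The following is a sketch under those goals; replace_first_hit and the sample names are hypothetical.

```r
# Hypothetical helper: for each name, find the first pattern that occurs in it
# and substitute only that part, keeping the rest of the name.
replace_first_hit <- function(x, pattern, replacement) {
  if (length(x) == 0) return(x)
  hits <- sapply(pattern, grepl, x = x, fixed = TRUE)  # rows: names, columns: patterns
  hits <- matrix(hits, nrow = length(x))               # keep matrix shape for a single name
  for (i in seq_along(x)) {
    j <- which(hits[i, ])
    if (length(j) > 0) {
      x[i] <- sub(pattern[j[1]], replacement[j[1]], x[i], fixed = TRUE)
    }
  }
  x
}

a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1", "Inv2", "Inv3", "Inv4", "Inv5", "Inv6")

# Two names from the question plus one made-up name with no serial number
nm <- c("2110023264GridMs.Hz", "2110025622E-Total", "Timestamp")
result <- replace_first_hit(nm, a, b)
print(result)
```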

#5


1  

Try mapply. It pairs a, b, and names(x1) element by element, so this assumes the three vectors line up:

names(x1) <- mapply(gsub, a, b, names(x1), USE.NAMES = FALSE)

Or, even easier, str_replace from stringr (under the same assumption):

library(stringr)
names(x1) <- str_replace(names(x1), a, b)
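
To make the alignment assumption concrete, here is a small base R sketch on made-up names that are deliberately not in the same order as a. It contrasts the position-by-position pairing that mapply performs with trying every pattern against every name; the shortened vectors and sample names are assumptions for illustration only.

```r
a <- c("2110027599", "2110025622", "2110023264")
b <- c("Inv1", "Inv2", "Inv6")

# Names that are NOT in the same order as `a`
nm <- c("2110023264Pac", "2110027599Pac", "2110025622Pac")

# Element-wise pairing: gsub(a[1], b[1], nm[1]), gsub(a[2], b[2], nm[2]), ...
paired <- mapply(gsub, a, b, nm, USE.NAMES = FALSE)
print(paired)      # nothing is replaced: no name lines up with its pattern

# Every pattern against every name, folding over the pattern indices
all_pairs <- Reduce(
  function(x, i) gsub(a[i], b[i], x, fixed = TRUE),
  seq_along(a),
  nm               # initial value: the untouched names
)
print(all_pairs)
```

With the question's data (many names per serial number, in a different order from a), only the second form gives the renaming that was asked for.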
