In a string
string="aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"
I would like to remove all parts that occur between START and STOP, yielding
"aaaaaaaaacccccccceeeeeee"
If I try gsub("START(.*)STOP","",string), this gives me "aaaaaaaaaeeeeeee" instead: everything from the first START to the last STOP is removed.
What would be the correct way to do this, allowing for multiple occurrences of START and STOP?
2 solutions
#1
3
Add a ? in there too. It makes .* lazy, so each match ends at the first STOP after a START instead of the last one.
gsub("START.*?STOP", "", string)
# [1] "aaaaaaaaacccccccceeeeeee"
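To make the difference concrete, here is a small sketch that runs the greedy and lazy patterns side by side on the string from the question. The variable names greedy, lazy and lazy_perl are only illustrative, and the perl = TRUE call is an optional variant, not part of the answer above.

```r
string <- "aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

# Greedy: .* extends to the last STOP, so the text between the two
# START...STOP blocks is removed as well
greedy <- gsub("START.*STOP", "", string)

# Lazy: .*? ends each match at the first STOP that follows a START
lazy <- gsub("START.*?STOP", "", string)

# The same lazy pattern, evaluated by the PCRE engine
lazy_perl <- gsub("START.*?STOP", "", string, perl = TRUE)

print(greedy)
print(lazy)
print(lazy_perl)
```

Both lazy calls should keep the c's between the two blocks, while the greedy call drops them.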
#2
0
Not nearly as elegant as Ananda's answer, but there are some other ways using the stringr & plyr packages.
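library(stringr)
library(plyr)
# Locate every START and STOP (column 1 = start position, column 2 = end position)
starts <- ldply(str_locate_all(string, 'START'))
ends <- ldply(str_locate_all(string, 'STOP'))
# Assumes the markers are properly paired and not nested.
# Delete each START...STOP span from right to left so earlier positions stay valid.
result <- string
for (i in rev(seq_len(nrow(starts)))) {
  str_sub(result, starts[i, 1], ends[i, 2]) <- ''
}
result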
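If stringr is already loaded, a shorter route is probably to combine it with the lazy pattern from the first answer. This is only a sketch: it assumes a stringr version that provides str_remove_all(), and result is just an illustrative name.

```r
library(stringr)

string <- "aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

# str_remove_all() deletes every match of the pattern;
# .*? keeps each match from running past the nearest STOP
result <- str_remove_all(string, "START.*?STOP")
print(result)
```

This avoids computing positions by hand, at the cost of treating START and STOP as regular expressions rather than fixed positions.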