I want to split this vector
c("CC", "C/C")
into
[[1]]
[1] "C" "C"
[[2]]
[1] "C" "C"
My final data should look like:
c("C_C", "C_C")
Thus, I need a regex, but I haven't found out how to handle the "non-space" part (the case with no separator between the characters):
strsplit(c("CC", "C/C"),"|/")
3 solutions
#1
8
You can use sub (or gsub if the pattern occurs more than once in your string) to directly replace either nothing or a forward slash with an underscore, capturing the single word characters on either side:
sub("(\\w)(|/)(\\w)", "\\1_\\3", c("CC", "C/C"))
#[1] "C_C" "C_C"
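To illustrate the difference between sub and gsub, here is a minimal sketch on longer strings. The vector y and the result names are made up for illustration and do not come from the question. Note that the pattern consumes the word characters on both sides, so matches cannot overlap; a run of three characters such as "CCC" only gets one underscore.

```r
# Hypothetical inputs with two boundaries per string
y <- c("CC CC", "C/C C/C")

# sub() replaces only the first match in each string
first_only <- sub("(\\w)(|/)(\\w)", "\\1_\\3", y)
first_only
#[1] "C_C CC"  "C_C C/C"

# gsub() replaces every non-overlapping match
every_match <- gsub("(\\w)(|/)(\\w)", "\\1_\\3", y)
every_match
#[1] "C_C C_C" "C_C C_C"

# Caveat: the first match uses up two characters, so the
# boundary between the second and third "C" is not seen
run_of_three <- gsub("(\\w)(|/)(\\w)", "\\1_\\3", "CCC")
run_of_three
#[1] "C_CC"
```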
#2
5
We can split the string at every character, omit the "/" and paste the remaining characters back together with an underscore.
sapply(strsplit(x, ""), function(v) paste0(v[v != "/"], collapse = "_"))
#[1] "C_C" "C_C"
data
x <- c("CC", "C/C")
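The same split-and-paste idea can be wrapped in a small helper so it also covers strings with more than two characters. This is only a sketch: the function name join_chars and its arguments drop and sep are hypothetical, and vapply is swapped in for sapply so the result is always a character vector.

```r
# Hypothetical helper: split into single characters, drop the
# separator character, and rejoin the rest with `sep`
join_chars <- function(x, drop = "/", sep = "_") {
  vapply(
    strsplit(x, ""),
    function(v) paste(v[v != drop], collapse = sep),
    character(1)
  )
}

joined_chars <- join_chars(c("CC", "C/C", "C/C/C"))
joined_chars
#[1] "C_C"   "C_C"   "C_C_C"
```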
#3
5
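We can split on either a slash or the empty string and then drop the empty strings that result (v1 is the input vector):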
lapply(strsplit(v1, "/|"), function(x) x[nzchar(x)])
Or use a regex lookaround
strsplit(v1, "(?<=[^/])(/|)", perl = TRUE)
#[[1]]
#[1] "C" "C"
#[[2]]
#[1] "C" "C"
If the final output should be a vector, then
gsub("(?<=[^/])(/|)(?=[^/])", "_", v1, perl = TRUE)
#[1] "C_C" "C_C"
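For a reproducible run of the variants in this answer, here is a sketch that assumes v1 holds the vector from the question (the answer itself never defines v1; the other variable names are also made up). In the lookaround patterns, (?<=[^/]) requires a non-slash character before the match and (?=[^/]) requires one after it, so the optional slash is only matched between two ordinary characters.

```r
# Assumption: v1 is the vector from the question
v1 <- c("CC", "C/C")

# Variant 1: split on "/" or the empty string, then keep only
# the non-empty pieces with nzchar()
cleaned <- lapply(strsplit(v1, "/|"), function(x) x[nzchar(x)])

# Variant 2: lookbehind, split only after a non-slash character
split_lb <- strsplit(v1, "(?<=[^/])(/|)", perl = TRUE)

# Vector output: replace the (possibly empty) gap between two
# non-slash characters with an underscore
joined <- gsub("(?<=[^/])(/|)(?=[^/])", "_", v1, perl = TRUE)
joined
#[1] "C_C" "C_C"
```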