I have a character vector containing variable names such as x <- c("AB.38.2", "GF.40.4", "ABC.34.2"). I want to extract the letters so that I have a character vector containing only the letters, e.g. c("AB", "GF", "ABC").
Because the number of letters varies, I cannot use substring to specify the first and last characters.
How can I go about this?
5 solutions
#1
3
You can try:
sub("^([[:alpha:]]*).*", "\\1", x)
[1] "AB" "GF" "ABC"
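For reference, the pattern anchors at the start of the string, captures the leading run of letters in a group, and lets `.*` consume the rest, so the whole string is replaced by the captured group. A minimal sketch of a roughly equivalent base R approach, assuming only the leading letters are wanted, uses `regmatches()` with `regexpr()` (the variable name `leading` is just illustrative):

```r
x <- c("AB.38.2", "GF.40.4", "ABC.34.2")

# regexpr() locates the leading run of letters in each string;
# regmatches() then extracts the matched text.
leading <- regmatches(x, regexpr("^[[:alpha:]]+", x))
leading
```

One difference to keep in mind: `regmatches()` drops elements that have no match, so the result can be shorter than `x`, whereas `sub()` always returns one value per input.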
#2
2
None of the other answers work if the letters are mixed with spaces. Here is what I do in those cases:
x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd")
unique(na.omit(unlist(strsplit(unlist(x), "[^a-zA-Z]+"))))
[1] "AB" "GF" "ABC" "A" "B" "C" "Fd"
#3
2
This is how I solved the problem. I use this approach because it returns all 5 items cleanly, and I can control whether I want a space between the words:
x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd", " a")
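extract.alpha <- function(x, space = "") {
  # purrr provides map() and simplify(); magrittr provides %>%
  require(purrr)
  require(magrittr)
  # Split each string on runs of non-letters, leaving the groups of letters
  y <- strsplit(unlist(x), "[^a-zA-Z]+")
  # Rejoin the letter groups of each element, separated by `space`
  z <- y %>% map(~paste(., collapse = space)) %>% simplify()
  return(z)
}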
extract.alpha(x, space = " ")
#4
0
I realize this is an old question, but since I was just looking for a similar answer and found this, I thought I'd share.
The simplest and fastest solution I found:
x <- c("AB.38.2", "GF.40.4", "ABC.34.2")
only_letters <- function(x) { gsub("^([[:alpha:]]*).*$","\\1",x) }
only_letters(x)
And the output is:
[1] "AB" "GF" "ABC"
Hope this helps someone!
#5
0
The previous answers seem more complicated than necessary. The approach from this question regarding digits also works with letters:
> x <- c("AB.38.2", "GF.40.4", "ABC.34.2", "A B ..C 312, Fd", " a")
> gsub("[^a-zA-Z]", "", x)
[1] "AB" "GF" "ABC" "ABCFd" "a"
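Note that this removes every non-letter, so letters anywhere in the string are kept, whereas the anchored `sub()` pattern from the first answer keeps only the leading run. A small sketch of the difference, using a made-up input `"AB.38.2x"` that is not from the question:

```r
y <- c("AB.38.2x", "GF.40.4")  # hypothetical input with a trailing letter

# Strip every character that is not an ASCII letter
all_letters <- gsub("[^a-zA-Z]", "", y)

# Keep only the run of letters at the start of each string
leading_only <- sub("^([[:alpha:]]*).*", "\\1", y)

all_letters
leading_only
```

For the first element, `all_letters` is "ABx" while `leading_only` is "AB"; which behaviour is right depends on whether letters after the numbers should count.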