Extracting the last word of a string in R

Time: 2021-03-20 18:51:46

What's the most elegant way to extract the last word in a sentence string?

The sentence does not end with a ".", and words are separated by blanks.

sentence <- "The quick brown fox"
TheFunction(sentence)

should return: "fox"

I do not want to use a package if a simple solution is possible. If a simple solution based on package exists, that is also fine.

5 solutions

#1


17  

tail(strsplit('this is a sentence',split=" ")[[1]],1)

Basically as suggested by @Señor O.
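
The one-liner above works on a single string. A minimal sketch of wrapping it in a reusable function, assuming you also want it to handle character vectors and repeated blanks (the name last_word is hypothetical, not part of the original answer):

```r
# Hypothetical helper: last whitespace-delimited token of each element of x.
last_word <- function(x) {
  # Trim the ends, then split on runs of whitespace so that
  # repeated blanks do not produce empty tokens.
  tokens <- strsplit(trimws(x), "[[:space:]]+")
  # Take the last token of each element; an empty string yields NA.
  vapply(
    tokens,
    function(tk) if (length(tk) > 0) tail(tk, 1) else NA_character_,
    character(1)
  )
}

last_word("The quick brown fox")
last_word(c("one  two", "three "))
```

This stays in base R, which matches the preference for avoiding a package when a simple solution exists.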

#2


37  

Just for completeness: the stringr package contains a function for exactly this problem.

library(stringr)

sentence <- "The quick brown fox"
word(sentence,-1)
# [1] "fox"

#3


11  

x <- 'The quick brown fox'
sub('^.* ([[:alnum:]]+)$', '\\1', x)

That will capture the last run of letters and digits before the end of the string.

You can also use the regexec and regmatches functions, but I find sub cleaner:

m <- regexec('^.* ([[:alnum:]]+)$', x)
regmatches(x, m)

See ?regex and ?sub for more info.
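
One caveat with the sub() approach: when the pattern does not match, sub() returns the input unchanged. The pattern above needs a space followed by alphanumerics up to the end of the string, so a sentence with trailing punctuation comes back whole. A small sketch comparing it with a looser variant that simply strips everything up to the last whitespace character (the variant and the variable names strict and loose are my own, not from the answer):

```r
x <- c("The quick brown fox", "fox", "The quick brown fox.")

# Pattern from the answer: elements that do not match are returned as-is,
# so the third element keeps the whole sentence.
strict <- sub('^.* ([[:alnum:]]+)$', '\\1', x)

# Looser variant: remove everything up to and including the last
# whitespace character; trailing punctuation stays attached to the word.
loose <- sub('^.*[[:space:]]', '', x)

print(strict)
print(loose)
```

Which behaviour is preferable depends on the data; for the question as stated (no trailing "."), both give the same result.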

#4


11  

Another packaged option is stri_extract_last_words() from the stringi package.

library(stringi)

stri_extract_last_words("The quick brown fox")
# [1] "fox"

The function also removes any punctuation that may be at the end of the sentence.

stri_extract_last_words("The quick brown fox? ...")
# [1] "fox"

#5


5  

Going in the package direction, this is the simplest answer I can think of:

library(stringr)

x <- 'The quick brown fox'
str_extract(x, '\\w+$')
# [1] "fox"
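
If you would rather not load a package, the same pattern can be used with base R. This is a sketch of an equivalent using regexpr() and regmatches(), not something taken from the answer itself:

```r
x <- 'The quick brown fox'

# regexpr() locates the run of word characters anchored at the end of the
# string; regmatches() extracts the matched substring.
last <- regmatches(x, regexpr('\\w+$', x))
print(last)
```

Note that regmatches() with regexpr() silently drops elements that have no match, so on a vector the result can be shorter than the input, whereas str_extract() returns NA for those elements.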
