
Time: 2022-05-05 19:20:41

I have a string such as "segmentation_level1_id_10" and would like to extract the level number from it (i.e. the number directly after the word level).


I have a solution that does this in two steps: it first extracts the pattern level\\d+ and then replaces "level" with an empty string. I would like to know whether this can be done in one step with str_extract alone.


Example below:



library(stringr)

segmentation_id <- "segmentation_level1_id_10"

segmentation_level <- str_replace(str_extract(segmentation_id, "level\\d+"), "level", "")

1 solution



One way is to use str_extract from the stringr library with a regex containing a positive lookbehind:


> library(stringr)
> s = "segmentation_level1_id_10"
> str_extract(s, "(?<=level)\\d+")
[1] "1"
> ## or, to make sure level is preceded by _: str_extract(s, "(?<=_level)\\d+")

Alternatively, use str_match, which returns the text of capturing groups (column 2 of the result holds the first group):


> str_match(s, "_level(\\d+)")[,2]
[1] "1"

It can also be done in base R with gsub, using the same capturing mechanism as str_match together with a backreference (\\1) that puts the captured text into the replacement result:

> gsub("^.*level(\\d+).*", "\\1", s)
[1] "1"
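
One caveat with the gsub approach: when the pattern does not match, gsub returns the input unchanged rather than NA. If that matters, the same lookbehind can be used in base R via regexpr and regmatches with perl = TRUE. Below is a minimal sketch of that idea; the helper name extract_level and the sample vector ids are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): extract the digits
# that directly follow "level", returning NA where there is no match.
extract_level <- function(x) {
  # perl = TRUE is required for the lookbehind (?<=level)
  m <- regexpr("(?<=level)\\d+", x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  # regexpr() gives -1 for elements without a match
  hit <- !is.na(m) & m > -1L
  # regmatches() returns only the matched substrings, so fill them in by position
  out[hit] <- regmatches(x, m)
  out
}

# Illustrative inputs; the last one has no "level<digits>" part
ids <- c("segmentation_level1_id_10", "segmentation_level12_id_3", "no_match_here")

extract_level(ids)
gsub("^.*level(\\d+).*", "\\1", ids)
```

With these inputs, extract_level(ids) gives "1", "12" and NA, while the gsub call gives "1", "12" and the untouched "no_match_here".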


