Pattern matching to extract a string matching a condition

Date: 2022-09-13 12:20:36

I am trying to extract a substring that matches a pattern from a longer string. For example:

 x <- "this.is.fairly//Whatit.is/path/IDbeginUntilhere7/seenit"

The objective of the regex is to return IDbeginUntilhere. I tried this:

 str <- regmatches(x, gregexpr("^I.*7$", x))

I understand that this doesn't work because the I is located in the middle of the string. The question may be too simple, but I'd appreciate any help I can get.

1 answer

#1



The main issue is the anchors: ^ (start of string) and $ (end of string). With both in place, the pattern has to cover the whole string, so it cannot match a substring in the middle.
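As a quick check of the anchor problem, here is a minimal sketch using the x from the question. The unanchored pattern is only there for comparison and is not the recommended fix; the names anchored and unanchored are made up for illustration.

```r
x <- "this.is.fairly//Whatit.is/path/IDbeginUntilhere7/seenit"

# Anchored: the whole string would have to start with "I" and end with "7"
anchored <- grepl("^I.*7$", x)

# Unanchored: the pattern is allowed to match anywhere inside the string
unanchored <- grepl("I.*7", x)

print(c(anchored = anchored, unanchored = unanchored))
```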

The secondary issue is the greedy dot: .* also matches across the /-delimited parts of the path, so with more than one 7 in the string it would match the whole of Id7/Not-to-match7 instead of just Id7.
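The greedy-dot problem does not show up with the x from the question, because it contains only one 7. A small sketch with a made-up input (y is hypothetical, built from the Id7/Not-to-match7 example) makes the difference visible:

```r
# Hypothetical input with two "7"s in different path segments
y <- "a/Id7/Not-to-match7/b"

# Greedy dot: runs from the "I" to the last "7", crossing the "/"
greedy <- regmatches(y, regexpr("I.*7", y))

# Negated character class: cannot cross a "/", so it stays inside one segment
bounded <- regmatches(y, regexpr("I[^/]*7", y))

print(greedy)   # "Id7/Not-to-match7"
print(bounded)  # "Id7"
```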

You need to use something like this:

str <- regmatches(x, gregexpr("I[^/]*7", x))
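Keep in mind that gregexpr() finds all matches, so regmatches() returns a list with one element per input string. A minimal sketch of unpacking the result, assuming a single input string as in the question (res and first are illustrative names):

```r
x <- "this.is.fairly//Whatit.is/path/IDbeginUntilhere7/seenit"

# A list of length 1; its first element holds every match found in x
res <- regmatches(x, gregexpr("I[^/]*7", x))
print(res[[1]])  # "IDbeginUntilhere7"

# If only the first match is needed, regexpr() yields a plain character vector
first <- regmatches(x, regexpr("I[^/]*7", x))
print(first)     # "IDbeginUntilhere7"
```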


If you do not want the 7 in the result, use a look-ahead, which requires a Perl-like regex (perl=TRUE):

str <- regmatches(x, gregexpr("I[^/]*(?=7)", x, perl=TRUE))
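For reference, a short sketch of what the look-ahead version returns on the x from the question. The second half shows a possible alternative with a capture group and sub(); that variant is not part of the answer above, it is only one other way to drop the 7 without perl=TRUE, and it assumes the pattern occurs exactly once in the string.

```r
x <- "this.is.fairly//Whatit.is/path/IDbeginUntilhere7/seenit"

# Look-ahead: the 7 must follow the match but is not included in it
id <- regmatches(x, regexpr("I[^/]*(?=7)", x, perl = TRUE))
print(id)   # "IDbeginUntilhere"

# Alternative (assumption, not from the answer): capture the part before the 7
# and replace the whole string with that captured group
id2 <- sub(".*(I[^/]*)7.*", "\\1", x)
print(id2)  # "IDbeginUntilhere"
```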

