How do I get the text between two strings?

Date: 2022-04-22 21:47:26

Below is the string from which I want to extract the text.

String:

Hello Mr John and Hello Ms Rita

Regex

Hello(.*?)Rita

I am trying to get the text between two strings, "Hello" and "Rita". I am using the regex above, but it gives me

Mr John and Hello Ms

which is wrong. I need only "Ms". Can anyone help me write a proper regex for this situation?

4 solutions

#1


2  

Use a tempered greedy token:

Hello((?:(?!Hello|Rita).)*)Rita
       ^^^^^^^^^^^^^^^^^^^

The (?:(?!Hello|Rita).)* part is the tempered greedy token: it matches any character, one at a time, as long as that character does not start a Hello or Rita sequence. You may add word boundaries \b if you need to check for whole words.

To get Ms without spaces on either end, use this regex variation:

Hello\s*((?:(?!Hello|Rita).)*?)\s*Rita

Adding ? after * forms the lazy quantifier *?, which matches as few characters as needed to find a match, and \s* matches zero or more whitespace characters.
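
As a rough illustration, both patterns from this answer can be tried against the sample string. This is a minimal sketch assuming Python's re module (the question does not name a language), and the variable names are made up for the example:

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# Tempered greedy token: each '.' is consumed only if neither "Hello"
# nor "Rita" starts at that position, so the match cannot span a second "Hello".
tempered = re.compile(r"Hello((?:(?!Hello|Rita).)*)Rita")

# Same idea, with \s* outside the group and a lazy quantifier inside it,
# so the surrounding whitespace is left out of the capture.
trimmed = re.compile(r"Hello\s*((?:(?!Hello|Rita).)*?)\s*Rita")

raw_capture = tempered.search(text).group(1)
trimmed_capture = trimmed.search(text).group(1)

print(repr(raw_capture))      # ' Ms '
print(repr(trimmed_capture))  # 'Ms'
```

The first attempt, starting at the first Hello, fails because the token stops in front of the second Hello and no Rita follows there; the engine then retries from the second Hello and succeeds.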

#2


1  

To get the match closest to the ending word, put a greedy .* in front of the initial word and let it consume everything up to the last Hello.

.*Hello(.*?)Rita

Or, without whitespace in the capture: .*Hello\s*(.*?)\s*Rita
Or, with two capture groups: .*(Hello\s*(.*?)\s*Rita)
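
A small sketch of the greedy-prefix patterns, again assuming Python's re module and hypothetical variable names. Note that . does not match newlines by default, so multi-line input would presumably need re.DOTALL:

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# The leading greedy .* swallows the whole string, then backtracks only as far
# as needed, so "Hello" is matched at its LAST occurrence before "Rita".
m = re.search(r".*Hello\s*(.*?)\s*Rita", text)
last_between = m.group(1)

# Two capture groups: group 1 is the whole "Hello ... Rita" span,
# group 2 is only the text in between.
m2 = re.search(r".*(Hello\s*(.*?)\s*Rita)", text)
whole_span, inner = m2.group(1), m2.group(2)

print(repr(last_between))  # 'Ms'
print(repr(whole_span))    # 'Hello Ms Rita'
print(repr(inner))         # 'Ms'
```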

#3


0  

Your (.*?) is picking up too much text because the regex engine starts matching at the first "Hello" it finds, and .*? can match any characters, including the second "Hello". So it grabs everything from the first "Hello" to "Rita" at the end.
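
The behaviour can be reproduced in a few lines. This is only a sketch, assuming Python's re module; the names are illustrative:

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# The lazy quantifier only decides where the match ENDS. The match still
# STARTS at the leftmost "Hello", because the engine scans left to right.
m = re.search(r"Hello(.*?)Rita", text)
match_start = m.start()
original_capture = m.group(1)

print(match_start)             # 0
print(repr(original_capture))  # ' Mr John and Hello Ms '
```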

One easy way you could get what you want is with this regular expression:

Hello (\S+) Rita

\S matches any non-whitespace character, so \S+ matches any consecutive string of non-whitespace characters, i.e. a single word.

This would be a bit more robust, allowing for multiple spaces or other whitespace between the words:

Hello\s+(\S+)\s+Rita
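
Here is a quick check of both patterns, as a sketch assuming Python's re module. The second and third input strings are made up to show the whitespace handling and the single-word limitation:

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# Exactly one space on each side of a single non-whitespace word.
single_space = re.search(r"Hello (\S+) Rita", text)
word = single_space.group(1)

# \s+ tolerates runs of spaces, tabs, and other whitespace.
flexible = re.search(r"Hello\s+(\S+)\s+Rita", "Hello   Ms\tRita")
flexible_word = flexible.group(1)

# Limitation: \S+ is one word only, so two words between the markers do not match.
two_words = re.search(r"Hello (\S+) Rita", "Hello Ms Jane Rita")

print(word)           # Ms
print(flexible_word)  # Ms
print(two_words)      # None
```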

#4


0  

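You can use lookahead and lookbehind: (?<=Hello).*?(?=Rita)

Note that the lookarounds only keep "Hello" and "Rita" out of the match; on the sample string the match still starts right after the first "Hello".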
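
A sketch of how the lookaround pattern behaves on the sample string, assuming Python's re module (which allows fixed-width lookbehind such as (?<=Hello)). The second pattern is not from this answer; it is an assumed combination of the lookarounds with the tempered greedy token from answer #1:

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# Lookarounds alone: "Hello" and "Rita" are excluded from the match,
# but the match still begins right after the FIRST "Hello".
plain = re.search(r"(?<=Hello).*?(?=Rita)", text).group(0)

# Assumed variation: a tempered token between the lookarounds prevents
# the match from running across a second "Hello".
combined = re.search(r"(?<=Hello)(?:(?!Hello|Rita).)*(?=Rita)", text).group(0)

print(repr(plain))             # ' Mr John and Hello Ms '
print(repr(combined))          # ' Ms '
print(repr(combined.strip()))  # 'Ms'
```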
