A string is given below from which I want to extract some text.
String:
Hello Mr John and Hello Ms Rita
Regex:
Hello(.*?)Rita
I am trying to get the text between two strings, "Hello" and "Rita". I am using the regex given above, but it is giving me
Mr John and Hello Ms
which is wrong. I need only "Ms". Can anyone help me write a proper regex for this situation?
4 Solutions
#1
2
Use a tempered greedy token:
Hello((?:(?!Hello|Rita).)*)Rita
^^^^^^^^^^^^^^^^^^^
The (?:(?!Hello|Rita).)* part is the tempered greedy token: it consumes one character at a time, but only at positions where neither Hello nor Rita starts. You may add word boundaries \b if you need to check for whole words.
To get Ms without spaces on either end, use this regex variation:
Hello\s*((?:(?!Hello|Rita).)*?)\s*Rita
Adding ? to * forms a lazy quantifier *? that matches as few characters as needed to find a match, and \s* matches zero or more whitespace characters.
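As a quick check of both patterns in this answer, here is a minimal sketch in Python's re module. The choice of Python and the variable names are assumptions for illustration; the thread does not say which regex engine the asker uses.

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# Tempered greedy token: a character is consumed only when neither
# "Hello" nor "Rita" starts at the current position, so the match
# cannot run across a second "Hello".
tempered = re.compile(r"Hello((?:(?!Hello|Rita).)*)Rita")

# Same idea with a lazy quantifier and \s* on both sides, so the
# surrounding whitespace stays outside the capture group.
trimmed = re.compile(r"Hello\s*((?:(?!Hello|Rita).)*?)\s*Rita")

raw_capture = tempered.search(text).group(1)
trimmed_capture = trimmed.search(text).group(1)

print(repr(raw_capture))      # ' Ms '
print(repr(trimmed_capture))  # 'Ms'
```

The attempt that starts at the first Hello fails because the token refuses to step onto the second Hello, so the engine moves on and succeeds from the second Hello instead.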
#2
1
To get the match closest to the ending word, put a greedy .* in front of the initial word and let it consume as much as it can.
.*Hello(.*?)Rita
Or without whitespace in the capture: .*Hello\s*(.*?)\s*Rita
Or using two capture groups: .*(Hello\s*(.*?)\s*Rita)
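The greedy-prefix variants above can be tried out like this. It is a sketch in Python, which is an assumed language, with illustrative variable names.

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# The leading greedy .* first runs to the end of the line and then
# backtracks, so the "Hello" it settles on is the last one that is
# still followed by "Rita".
last_hello = re.search(r".*Hello\s*(.*?)\s*Rita", text)
between = last_hello.group(1)
print(repr(between))  # 'Ms'

# Two capture groups: group 1 is the whole "Hello ... Rita" span,
# group 2 is only the text in between.
nested = re.search(r".*(Hello\s*(.*?)\s*Rita)", text)
whole, inner = nested.group(1), nested.group(2)
print(repr(whole))  # 'Hello Ms Rita'
print(repr(inner))  # 'Ms'
```

Because the leading .* swallows everything before the last usable Hello, this approach yields at most one match per line, which may or may not be what you want when a line holds several Hello ... Rita pairs.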
#3
0
Your (.*?) is picking up too much text because the match begins at the first "Hello" and .*? will accept any characters, including a second "Hello", on its way to "Rita". So it grabs everything from the first "Hello" to "Rita" at the end.
One easy way to get what you want is with this regular expression:
Hello (\S+) Rita
\S matches any non-whitespace character, so \S+ matches any consecutive run of non-whitespace characters, i.e. a single word.
This would be a bit more robust, allowing for multiple spaces or other whitespace between the words:
Hello\s+(\S+)\s+Rita
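A short sketch of the single-word approach, again assuming Python and using a made-up second input to show where it stops working:

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# \S+ can only span one word, so the first "Hello" (followed by
# "Mr John and ...") cannot reach "Rita" and the engine moves on.
single_word = re.search(r"Hello\s+(\S+)\s+Rita", text)
word = single_word.group(1)
print(repr(word))  # 'Ms'

# Hypothetical input with two words between the delimiters:
# the pattern finds no match at all here.
two_words = re.search(r"Hello\s+(\S+)\s+Rita", "Hello Ms Ann Rita")
print(two_words)  # None
```

This is the simplest pattern of the lot, but it only fits when exactly one word sits between Hello and Rita.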
#4
0
You can use lookahead and lookbehind: (?<=Hello).*?(?=Rita)
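Trying this against the string from the question suggests the lookaround pattern on its own still starts at the first Hello. The sketch below, in Python (an assumed language), shows that, along with one possible adjustment that borrows the (?!Hello) check from answer #1. The adjusted pattern is not part of this answer.

```python
import re

text = "Hello Mr John and Hello Ms Rita"

# Lookarounds keep "Hello" and "Rita" out of the match, but the
# leftmost match still begins right after the first "Hello".
plain = re.search(r"(?<=Hello).*?(?=Rita)", text).group(0)
print(repr(plain))  # ' Mr John and Hello Ms '

# Assumed variation: forbid stepping onto another "Hello", so only
# the span after the last "Hello" before "Rita" can match.
guarded = re.search(r"(?<=Hello)(?:(?!Hello).)*?(?=Rita)", text).group(0)
print(repr(guarded))          # ' Ms '
print(repr(guarded.strip()))  # 'Ms'
```

Since there is no capture group, the surrounding spaces are part of the match and need to be stripped afterwards.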