I have a sentence:
"This 'is' just an example"
I need to extract the word between the first pair of ' characters.
Up until now, I have been using the following Regex approach:
string name_only = Regex.Match("This 'is' just an example", @"\'([^)]*)\'").Groups[1].Value;
Result: is
This worked perfectly fine until another ' appeared:
"This 'is' just an e'xample"
Now I'm getting:
Result: is' just an e
How do I fix this (other than using a "for" loop to find the first two indexes of the ' character and then cutting the word out with Substring)?
3 Solutions
#1
2
The problem is that your regex is greedy. If you change the quantifier to its lazy form, as follows, it will work:
@"\'([^)]*?)\'"
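To make the difference concrete, here is a minimal sketch that applies the lazy pattern above to the sentence from the question. The class and method names (`QuoteExtractor`, `FirstQuoted`) are made up for illustration; only `Regex.Match` and `Groups` come from .NET.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class QuoteExtractor
{
    // Returns the text between the first pair of single quotes,
    // or an empty string when nothing matches.
    public static string FirstQuoted(string input)
    {
        // *? is the lazy form of *: it expands one character at a time
        // and stops as soon as the closing quote can match.
        Match m = Regex.Match(input, @"\'([^)]*?)\'");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Calling `QuoteExtractor.FirstQuoted("This 'is' just an e'xample")` should give `is`, where the greedy pattern from the question ran on to the last quote and gave `is' just an e`.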
#2
1
By default, quantifiers are greedy, so a regular expression effectively follows the "leftmost, longest" rule: it matches the leftmost, longest substring possible.
I'd be inclined to make the regular expression more specific about what it should match, thus:
'(([^']|(''))*)'
That should match:
- The lead-in single-quote character, followed by
- zero or more instances of
  - a single character other than a single-quote character, or
  - an "escaped" single-quote character: two consecutive single-quote characters,
- followed by the lead-out single-quote character.
$0 then gives you the entire match, and $1 the contents of the matched quoted value, exclusive of the lead-in/lead-out quotes.
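As a rough sketch of how this pattern could be used from C#, the helper below exposes the whole match and the quoted contents. The names `QuotedValue`, `Whole`, and `Contents` are assumptions for illustration; in the .NET API, `$0` and `$1` correspond to `Groups[0]` and `Groups[1]`.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class QuotedValue
{
    // A quote, then any run of non-quote characters or doubled quotes, then a quote.
    private static readonly Regex Pattern = new Regex(@"'(([^']|(''))*)'");

    // Equivalent of $0: the entire match, including the surrounding quotes.
    public static string Whole(string input) => Pattern.Match(input).Groups[0].Value;

    // Equivalent of $1: the contents without the lead-in/lead-out quotes.
    public static string Contents(string input) => Pattern.Match(input).Groups[1].Value;
}
```

For `"This 'is' just an e'xample"`, `Contents` should return `is` and `Whole` should return `'is'`. Doubled quotes stay inside the value, so `"say 'it''s ok' now"` yields `it''s ok`.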
#3
0
http://msdn.microsoft.com/en-us/library/3206d374.aspx#Greedy
Greedy and Lazy Quantifiers
A number of the quantifiers have two versions:
- A greedy version. A greedy quantifier tries to match an element as many times as possible.
- A non-greedy (or lazy) version. A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a ?.
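A small sketch of the two versions side by side, using the `+` quantifier. The class name `GreedyVsLazy` and its methods are made up for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, not taken from the linked documentation.
public static class GreedyVsLazy
{
    // a+ is greedy: it takes as many consecutive 'a' characters as it can.
    public static string Greedy(string input) => Regex.Match(input, @"a+").Value;

    // a+? is lazy: it takes the minimum, which is a single 'a'.
    public static string Lazy(string input) => Regex.Match(input, @"a+?").Value;
}
```

With the input `"caaat"`, `Greedy` should return `aaa` and `Lazy` should return `a`.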