What does (?ms) mean in a regular expression?

Time: 2022-05-01 22:30:14

I have the following regex in PowerShell:

[regex]$regex = 
@'
(?ms).*?<DIV class=row>.*?
'@

What does (?ms) mean here?

3 answers

#1


14  

(?m) is the modifier for multi-line mode. It makes ^ and $ match at the beginning and end of each line, respectively, rather than only at the beginning and end of the whole input.

For example, given the input:

ABC DEF
GHI

The regex ^[A-Z]{3} will match:

  1. "ABC"

Meanwhile, the regex (?m)^[A-Z]{3} will match:

  1. "ABC"
  2. "GHI"
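To check this yourself, here is a minimal sketch in Python rather than PowerShell. It assumes Python's re module treats ^ and (?m) the same way for this pattern, which it does as far as I know. The variable names are my own.

```python
import re

text = "ABC DEF\nGHI"

# Without (?m), ^ only matches at the very start of the string.
default_matches = re.findall(r"^[A-Z]{3}", text)

# With (?m), ^ also matches right after each newline.
multiline_matches = re.findall(r"(?m)^[A-Z]{3}", text)

print(default_matches)    # ['ABC']
print(multiline_matches)  # ['ABC', 'GHI']
```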

(?s) is the modifier for single-line mode. It adds the newline character to the set of characters that . will match.

Given the same input as before, the regex [A-Z]{3}. will match (note the inclusion of the space character):

  1. "ABC "

While the regex (?s)[A-Z]{3}. will match:

  1. "ABC "
  2. "DEF\n"
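The same comparison can be sketched in Python (again an assumption on my part that Python's re matches the behaviour described here; by default its dot also skips only the newline character):

```python
import re

text = "ABC DEF\nGHI"

# By default the dot refuses to match "\n", so "DEF" followed by a newline fails.
dot_default = re.findall(r"[A-Z]{3}.", text)

# With (?s) the dot matches "\n" too, so "DEF\n" becomes a second match.
dot_all = re.findall(r"(?s)[A-Z]{3}.", text)

print(dot_default)  # ['ABC ']
print(dot_all)      # ['ABC ', 'DEF\n']
```

"GHI" never matches in either mode because nothing follows it for the dot to consume.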

Despite their names, the two modes aren't mutually exclusive. In some implementations they cancel out, but for the most part they can be used together. You can enable both at once by writing (?m)(?s) or, in shorter form, (?ms).

EDIT:

There are certain situations where you might want to use (?ms). The following examples are a bit contrived, but I think they serve our purpose. Given the input (note the space after "ABC"):

ABC
DEF
GHI

The regex (?ms)^[A-Z]{3}. matches:

  1. "ABC "
  2. "DEF\n"

While both (?m)^[A-Z]{3}. and (?s)^[A-Z]{3}. match:

  1. "ABC "
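A quick way to compare all three flag combinations side by side is a sketch like this one in Python (my own illustration, assuming the same flag semantics; the trailing space after "ABC" is written out explicitly):

```python
import re

# Note the trailing space after "ABC" on the first line.
text = "ABC \nDEF\nGHI"

pattern_body = r"^[A-Z]{3}."

# Build the pattern once per flag combination and collect every match.
results = {
    flags: re.findall(f"(?{flags}){pattern_body}", text)
    for flags in ("m", "s", "ms")
}

for flags, matches in results.items():
    print(flags, matches)
# m ['ABC ']
# s ['ABC ']
# ms ['ABC ', 'DEF\n']
```

Only the combined form finds "DEF\n": it needs (?m) so that ^ can match at the start of the second line, and (?s) so that the dot can consume the newline.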

#2


5  

Sometimes people say (?s) is single-line mode. It's not; there is no such thing.
It means the dot metacharacter . matches newlines, so the dot matches any character.
By default the dot usually does not match a newline, so you have to set the dot-all
modifier explicitly, either through the regex options constant or the inline modifier (?s).
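As a rough illustration of the two ways to switch it on, here is a sketch in Python, where the options constant is re.DOTALL (in .NET, which PowerShell uses, the corresponding constant is RegexOptions.Singleline). The sample string is made up.

```python
import re

text = "first\nsecond"

# Inline modifier: the flag is part of the pattern itself.
inline = re.search(r"(?s)first.second", text)

# Options constant: the same flag passed separately from the pattern.
with_flag = re.search(r"first.second", text, flags=re.DOTALL)

# Neither: the dot cannot cross the newline, so there is no match.
without = re.search(r"first.second", text)

print(inline is not None)     # True
print(with_flag is not None)  # True
print(without is None)        # True
```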

(?m) is the multi-line modifier. It lets the anchors ^ and $ match at the beginning/end of lines,
as well as at the beginning/end of the string.

How, when, and should (?ms) be used together?
The answer is that sometimes you want the dot to span newlines while at the same time
needing ^ to match at the beginning of a line, and you are not too sure about anything in between.

Example:

(?ms)^BlockStart.*?BlockEnd

where the input is:

StringStart aasdfasdffasdf
asgasgasgw fasfggasfgaag
BlockStart asgdfasggafsdgadsfg aaaasfgaafdsgasfg afbaadsf afdsgadsfg BlockEnd afsbgafsdgasfg
aaaaaafrgasfgaadsfgg
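To see why both flags matter for a pattern like this, here is a sketch in Python with a shortened, made-up input in which the block spans two lines (an assumption on my part; in the sample above the block sits on one line):

```python
import re

# Hypothetical input: the block starts on line 2 and ends on line 3.
text = (
    "StringStart header\n"
    "BlockStart line one\n"
    "line two BlockEnd trailing\n"
    "footer"
)

both = re.search(r"(?ms)^BlockStart.*?BlockEnd", text)
only_m = re.search(r"(?m)^BlockStart.*?BlockEnd", text)
only_s = re.search(r"(?s)^BlockStart.*?BlockEnd", text)

print(repr(both.group(0)))  # 'BlockStart line one\nline two BlockEnd'
print(only_m)               # None: the dot cannot cross the newline
print(only_s)               # None: ^ only matches at the start of the string
```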

#3


0  

I think these are mode modifiers

The linked site states:

  • (?s) for "single line mode" makes the dot match all characters, including line breaks. Not supported by Ruby or JavaScript. In Tcl, (?s) also makes the caret and dollar match at the start and end of the string only.
  • (?m) for "multi-line mode" makes the caret and dollar match at the start and end of each line in the subject string. In Ruby, (?m) makes the dot match all characters, without affecting the caret and dollar which always match at the start and end of each line in Ruby. In Tcl, (?m) also prevents the dot from matching line breaks.

I'm not 100% certain why you would want to specify multi-line and single-line mode at the same time, but the example on the page does it as well, so maybe it's valid...
