How do I use regex groups past the 9th backreference?

Time: 2021-02-02 18:23:32

OK, so I am trying to use groups past the 9th backreference in Notepad++. The wiki says that I can use group naming to go past the 9th reference. However, I can't seem to get the syntax right to do the match. I am starting off with just two groups to keep it simple.

Sample Data

1000,1000

Regex.

(?'a'[0-9]*),([0-9]*)

According to the docs I need to do the following.

(?<some name>...), (?'some name'...),(?(some name)...)
Names this group some name.

However, the result is that it can't find my text. Any suggestions?

3 Solutions

#1


26  

You can simply reference groups above 9 in the same way as those below 10, i.e. $10 is the tenth group.

For a (naive) example:

String:

abcdefghijklmnopqrstuvwxyz

Regex find:

(?:a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)(o)(p)

Replace:

$10

Result:

kqrstuvwxyz

My test was performed in Notepad++ v6.1.2 and gave the result I expected.

Update: This still works as of v7.5.6
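
For comparison outside Notepad++, here is a minimal Python sketch of the same find and replace. It assumes Python's re module, whose replacement syntax differs from Notepad++: the tenth group is written \g<10> there rather than $10. The variable names are only illustrative.

```python
import re

text = "abcdefghijklmnopqrstuvwxyz"
# Same pattern as above: (?:a) is non-capturing, so (b) is group 1 and (k) is group 10.
pattern = r"(?:a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)(o)(p)"

# \g<10> is Python's unambiguous way to write "group 10" in a replacement string.
result = re.sub(pattern, r"\g<10>", text)
print(result)  # kqrstuvwxyz
```

The matched part (a through p) is replaced by group 10, and the unmatched tail (q through z) is left alone.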

SarcasticSully resurrected this to ask the question:

"What if you want to replace with the 1st group followed by the character '0'?"

To do this, change the replacement to:

$1\x30

This replaces the match with group 1 followed by the character with hex code 30, which is 0 in ASCII.
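
The same ambiguity exists in other engines. As a rough sketch, assuming Python's re module rather than Notepad++, the usual way around it is the explicit \g<1> form, so the digit that follows cannot be read as part of the group number. The sample text is the one from the question.

```python
import re

text = "1000,1000"
pattern = r"([0-9]*),([0-9]*)"

# \g<1>0 means "group 1, then the literal character 0";
# a bare \10 would be read as a reference to group 10 instead.
result = re.sub(pattern, r"\g<1>0", text)
print(result)  # 10000
```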

#2


2  

A very belated answer to help others who land here from Google (as I did). Named backreferences in Notepad++ substitutions look like this: $+{name}. For whatever reason.

There's a gotcha here if you are used to a flavor that numbers groups differently, though: named groups are also given numbers, in order of appearance. In .NET, for example, where named groups are numbered after the unnamed ones, if you have (.*)(?<name> & )(.*), you'd replace with $1${name}$2 to get the exact same line you started with. In Notepad++, you would have to use $1$+{name}$3.
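
Python's re module numbers named groups in order of appearance too, so it makes a convenient place to see the effect. This is only a sketch under that assumption; Python spells the named group (?P<name>...) and the sample string is made up.

```python
import re

pattern = re.compile(r"(.*)(?P<name> & )(.*)")
match = pattern.match("left & right")

# The named group occupies slot 2, so the trailing (.*) is group 3, not group 2.
print(pattern.groupindex["name"])  # 2
print(match.group(1), match.group(3))  # left right

# Rebuilding the original line therefore needs groups 1, name and 3.
rebuilt = match.expand(r"\g<1>\g<name>\g<3>")
print(rebuilt)  # left & right
```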


Example: I needed to clean up a Visual Studio .sln file for mismatched configurations. The text I needed to replace looked like this:

    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|Any CPU.ActiveCfg = Debug|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|Any CPU.Build.0 = Debug|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x64.ActiveCfg = Debug|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x64.Build.0 = Debug|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x86.ActiveCfg = Debug|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x86.Build.0 = Debug|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|Any CPU.ActiveCfg = Release|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|Any CPU.Build.0 = Release|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x64.ActiveCfg = Release|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x64.Build.0 = Release|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x86.ActiveCfg = Release|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x86.Build.0 = Release|Any CPU


My search RegEx:

  ^(\s*\{[^}]*\}\.)(?<config>[a-zA-Z0-9]+\|[a-zA-Z0-9 ]+)*(\..+=\s*)(.*)$

My replacement string:

  $1$+{config}$3$+{config}

The result:

    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|Any CPU.ActiveCfg = Dev|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|Any CPU.Build.0 = Dev|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x64.ActiveCfg = Dev|x64
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x64.Build.0 = Dev|x64
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x86.ActiveCfg = Dev|x86
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x86.Build.0 = Dev|x86
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|Any CPU.ActiveCfg = QA|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|Any CPU.Build.0 = QA|Any CPU
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x64.ActiveCfg = QA|x64
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x64.Build.0 = QA|x64
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x86.ActiveCfg = QA|x86
    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|x86.Build.0 = QA|x86
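
If you would rather script this than run it in the editor, the following is a rough Python equivalent, not the Notepad++ behaviour itself. It assumes Python's re module, so the named group is written (?P<config>...) and the replacement uses \g<config>. It also drops the trailing * after the config group, since every line here contains exactly one configuration. Only two of the sample lines are included.

```python
import re

lines = (
    "    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.Dev|x64.ActiveCfg = Debug|Any CPU\n"
    "    {CDDB12FE-885F-4FB7-9724-1A4279573DE5}.QA|Any CPU.Build.0 = Release|Any CPU"
)

# Group 1: indentation and GUID; config: "Dev|x64" etc.; group 3: ".ActiveCfg = ";
# group 4: the old right-hand side, which is discarded.
pattern = re.compile(
    r"^(\s*\{[^}]*\}\.)(?P<config>[a-zA-Z0-9]+\|[a-zA-Z0-9 ]+)(\..+=\s*)(.*)$",
    re.MULTILINE,
)

# Reuse the captured configuration on the right-hand side of the "=".
fixed = pattern.sub(r"\g<1>\g<config>\g<3>\g<config>", lines)
print(fixed)
```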

Hope this helps someone.

#3


1  

OK, matching is no problem; your example matches for me in the current Notepad++. This is an important point: to use PCRE regex in Notepad++, you need version 6.0 or later.

The other point is, where do you want to use the backreference? I can use named backreferences without problems within the regex, but not in the replacement string.

This means that

(?'a'[0-9]*),([0-9]*),\g{a}

will match

1000,1001,1000
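
The same idea can be checked in Python, as a sketch only: Python's re module writes the named group as (?P<a>...) and the backreference as (?P=a) instead of \g{a}. The second test string is made up to show a non-match.

```python
import re

# (?P<a>...) names the group; (?P=a) refers back to whatever it captured.
pattern = re.compile(r"(?P<a>[0-9]*),([0-9]*),(?P=a)")

matches = pattern.fullmatch("1000,1001,1000") is not None
mismatch = pattern.fullmatch("1000,1001,1002") is not None
print(matches, mismatch)  # True False
```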

But I don't know a way to use named groups or groups > 9 in the replacement string.

Do you really need more than 9 backreferences in the replacement string? If you just need more than 9 groups, but not all of them in the replacement, then turn the groups you don't need to reuse into non-capturing groups by adding ?: at the start of the group.

(?:[0-9]*),([0-9]*),(?:[0-9]*),([0-9]*)
           group 1             group 2
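
A quick way to confirm the numbering is to try the pattern in Python, whose re module treats (?:...) the same way. This is only a sketch; the input string and the replacement are made up for illustration.

```python
import re

pattern = re.compile(r"(?:[0-9]*),([0-9]*),(?:[0-9]*),([0-9]*)")
match = pattern.fullmatch("10,20,30,40")

# Only the second and fourth fields are captured, as groups 1 and 2.
print(match.groups())  # ('20', '40')

# The replacement can then use low group numbers only.
swapped = pattern.sub(r"\2-\1", "10,20,30,40")
print(swapped)  # 40-20
```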
