Regular expressions: how does nothing produce something?

Date: 2021-08-07 20:25:30

So * by itself means repeat the previous item zero or more times. The output of * is nothing. What about **? This gives an output, but how does matching zero or more times of nothing give something? Could you also explain that please? Same for ?*: Nothing precedes ?, so that is nothing right? How does matching zero or more times of nothing give something?

mugbear:~# grep '*' emptyspace                                                  
mugbear:~# grep '**' emptyspace                                                 
line1
line2

line4
line5

line7
mugbear:~# grep '?' emptyspace
mugbear:~# grep '?*' emptyspace                                         
line1
line2

line4
line5

line7

3 answers

#1


2  

A leading * is generally not magic because of its context

The behavior you are asking about is not fully specified, so the answers are almost certain to depend on the specific RE implementation.

For that matter, there isn't even anything close to a single standard RE, and the variations are not slightly different interpretations but dramatically different pattern definitions.

At first, there was classic grep / sed / ed / awk. A considerably expanded set of patterns eventually appeared and was made popular by Perl and other languages.

Some of these implementations attempt to notice when a character could not be magic due to its position.

So a plain * might search for an actual * character, and ** then searches for zero or more * characters. (And every string contains zero or more of them, which is why every line matches.)
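As a rough illustration of that reading, here is a minimal Python sketch. It does not run grep. It assumes the interpretation described above (POSIX basic regular expressions, which grep uses by default, treat a * at the start of a pattern as literal, and ? is an ordinary character there) and spells that interpretation out with explicit escapes. The LINES list is reconstructed from the transcript in the question, and the MODEL table and grep_like helper are hypothetical names introduced only for this example.

```python
import re

# Contents of the `emptyspace` file as reconstructed from the transcript
# (an assumption: five text lines and two empty ones).
LINES = ["line1", "line2", "", "line4", "line5", "", "line7"]

# Each grep pattern from the question, rewritten as the Python pattern that
# models "a leading * is literal" and "? is an ordinary character".
MODEL = {
    "*": r"\*",    # one literal asterisk
    "**": r"\**",  # zero or more literal asterisks
    "?": r"\?",    # one literal question mark
    "?*": r"\?*",  # zero or more literal question marks
}

def grep_like(pattern, lines):
    """Return the lines in which the modelled pattern matches anywhere."""
    regex = re.compile(MODEL[pattern])
    return [line for line in lines if regex.search(line)]

for pattern in MODEL:
    hits = grep_like(pattern, LINES)
    print(repr(pattern), "->", len(hits), "of", len(LINES), "lines")
```

Under these assumptions the single-character patterns select no lines and the two-character patterns select all seven, empty lines included, which is the same shape as the transcript. A real grep may still differ, since the behavior depends on the implementation.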

Note: Yes, there is a POSIX standard, but it has so little influence that it can be disregarded.

#2


1  

Every string contains 0 or more repetitions of every other string.
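To make the "zero repetitions" point concrete, here is a small sketch using Python's re module, which is a Perl-style engine and not grep. The zero_or_more helper is a hypothetical name used only here. It shows that a "zero or more" pattern always finds a match, and that the match can be the empty string.

```python
import re

def zero_or_more(fragment, text):
    """Search `text` for zero or more consecutive copies of `fragment`."""
    # re.escape makes the fragment literal; (?:...)* repeats it as a unit.
    return re.search(f"(?:{re.escape(fragment)})*", text)

match = zero_or_more("*", "line1")
print(match is not None)    # True: a match is found...
print(repr(match.group()))  # '': ...but it is the empty string
```

A line-oriented tool that prints every line containing a match will therefore print every line, even an empty one, because the empty match succeeds at the very first position.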

#3


0  

? or * by themselves will do nothing, as they have nothing to process. ** and ?* are bad form and should not be used. Anything that compiles regex strings properly should error out when presented with either. Strict compilers will error on ? or * alone as well.
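Python's re module is one example of a compiler that is strict in this sense. The sketch below only reports which patterns it accepts. The compiles helper is a hypothetical name, and the result says nothing about grep, which in the transcript above accepted all four patterns.

```python
import re

def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised for malformed patterns, e.g. "nothing to repeat".
        return False

for pattern in ["*", "**", "?", "?*", r"\**", r"\?*"]:
    print(repr(pattern), "accepted" if compiles(pattern) else "rejected")
```

The four unescaped patterns are rejected because the first quantifier has nothing to repeat. The escaped forms are accepted, since the leading character is then an ordinary literal.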
