What does \b really mean in a Ruby regular expression?

Time: 2022-05-11 15:45:15

I have a file with phrases such as "Canyon St / 27th Way" that I am trying to turn into "Canyon St and 27th Way" with Ruby regular expressions.


I used file = file.gsub(/(\b) \/ (\b)/, "#{$1} and #{$2}") to make the match, but I am a little stumped about what \b really means, why $1 seems to contain all of the characters before the word boundary that precedes the slash, and why $2 seems to contain all of the characters after the word boundary that starts the next word.


Usually, I expect whatever is in parentheses in a regular expression to end up in $1 and $2, but I am not sure what parentheses around a word boundary really mean, because there is nothing at the transition between a word character and a whitespace character.


2 answers

#1


6  

The $1 and $2 are not actually related to your regex match: a method's arguments are evaluated before the method is called, so


"#{$1} and #{$2}"

is evaluated before the regex is matched against your string. If you haven't done any earlier regex matches, these variables will be nil, so you're actually doing


file = file.gsub(/(\b) \/ (\b)/, " and ")

that is, you are replacing a slash surrounded by spaces with "and", also surrounded by spaces. After the call, $1 and $2 will be updated to empty strings, so you'll see the same behaviour when you process the next string.
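
To make the evaluation order concrete, here is a minimal sketch. The helper names (interpolate_first, with_backrefs, with_block) are made up for illustration and do not come from the question. It assumes that what the asker really wanted was to reuse the matched words, which can be done with backreferences in a single-quoted replacement string or with the block form of gsub.

```ruby
# Hypothetical helper: reproduces the call from the question.
def interpolate_first(line)
  # The match variables are local to this method, so $1 and $2 are nil here
  # and the double-quoted string becomes " and " before gsub is even called.
  result = line.gsub(/(\b) \/ (\b)/, "#{$1} and #{$2}")
  # After gsub, $1 and $2 describe the last match: both groups captured "".
  [result, $1, $2]
end

# Hypothetical helper: backreferences in a single-quoted string are
# substituted by gsub itself, once per match.
def with_backrefs(line)
  line.gsub(/(\w+) \/ (\w+)/, '\1 and \2')
end

# Hypothetical helper: the block runs once per match, so $1 and $2
# refer to the current match inside it.
def with_block(line)
  line.gsub(/(\w+) \/ (\w+)/) { "#{$1} and #{$2}" }
end

line = "Canyon St / 27th Way"
p interpolate_first(line)
p with_backrefs(line)
p with_block(line)
```

All three calls turn the sample line into "Canyon St and 27th Way", but only the last two actually use what the groups captured.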


#2


8  

The parentheses aren't doing anything in this context. You could get the same result using /\b \/ \b/.


I think you are getting a little confused by $1 and $2. Those aren't actually doing anything either. They are nil or empty because the groups match nothing (just a word boundary). What you have written is the logical equivalent of .gsub(/\b \/ \b/, " and ")
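
As a rough illustration of why a group around \b captures nothing: \b matches a position between a word character and a non-word character (or the start or end of the string), not a character. The sketch below uses made-up helper names (mark_boundaries, boundary_captures, same_without_groups?) and the sample line from the question.

```ruby
# Hypothetical helper: insert a "|" at every word boundary to make the
# zero-width positions visible.
def mark_boundaries(line)
  line.gsub(/\b/, "|")
end

# Hypothetical helper: show what the two groups from the question capture.
def boundary_captures(line)
  line.match(/(\b) \/ (\b)/).captures
end

# Hypothetical helper: check that the parentheses make no difference
# when the replacement is the fixed string " and ".
def same_without_groups?(line)
  line.gsub(/(\b) \/ (\b)/, " and ") == line.gsub(/\b \/ \b/, " and ")
end

line = "Canyon St / 27th Way"
puts mark_boundaries(line)        # |Canyon| |St| / |27th| |Way|
p boundary_captures(line)         # ["", ""]
p same_without_groups?(line)      # true
```

The groups do take part in the match, but each one captures an empty string, which is why wrapping \b in parentheses adds nothing here.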

