How to use regular expressions in cson

Date: 2022-09-27 19:28:37

I want to capture the logical operators of ooRexx with a regex in a .cson file, because I want to support ooRexx syntax highlighting in the Atom editor. These are the operators I am trying to cover:

>= <= \> \< \= >< <> == \== // && || ** ¬> ¬< ¬= ¬== >> << >>= \<< ¬<< \>> ¬>> <<=


And this is the regex part in the cson file:


'match': '\\+ | - | [\\\\] | \\/ | % | \\* | \\| | & |=|¬|>|<|
>= | <= | ([\\\\]>) | ([\\\\]<) | ([\\\\]=) | >< | <> | == | ([\\\\]==) | 
\\/\\/ | && | \\|\\| | \\*\\* | ¬> | ¬< | ¬= | ¬== | >> | << | >>= | ([\\\\]<<) | ¬<< |
([\\\\]>>) | ¬>> | <<='

I'm struggling with the slashes (forward and backward) and also with the double **. My knowledge of regex is very basic, to put it nicely. Can somebody help me with that?

1 solution

#1



You have spaces around the pipe bars, and those spaces are part of the regular expression. So when you write something like | \*\* |, the double asterisk is matched only if it has a space on each side, not when it is attached to a word or sits at the beginning or end of a line. The slashes have the same issue: I tested your pattern, and it does match them, but only when the slashes (or asterisks) are between two spaces.
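To make the effect of those spaces concrete, here is a minimal sketch. It uses Python's re module as a stand-in, on the assumption that literal spaces in an alternation behave the same way in the engine Atom uses for grammars; the names spaced and tight are only illustrative.

```python
import re

# Spaces inside an alternation are literal characters, not layout.
spaced = re.compile(r"\+ | \*\* | // ")   # alternatives: "+ ", " ** ", " // "
tight = re.compile(r"\+|\*\*|//")         # alternatives: "+", "**", "//"

for text in ["a**b", "a ** b", "x//y", "x // y"]:
    found_spaced = bool(spaced.search(text))
    found_tight = bool(tight.search(text))
    print(f"{text!r}: spaced={found_spaced}, tight={found_tight}")
```

The spaced pattern only finds the operators in "a ** b" and "x // y", while the pattern without spaces finds them in all four samples.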

A few other things to keep in mind:


  • You shouldn't need the square brackets around backslashes. Square brackets define a class of possible characters to match: for instance, [<>]= matches both >= and <=. Writing [\\] is equivalent to writing \\ directly, because \\ already counts as a single character (the first backslash escapes the second). Similarly, your parentheses are not doing anything here; see grouping.

  • Also think of using repetition operators like + and *. For example, \\>+ matches both \> and \>>.

  • Finally, the question mark helps you avoid repetition by marking the previous character (or a character class in square brackets, or a group in parentheses) as optional. ==? matches both = and ==.
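The three points above can be checked with a short sketch. It again uses Python's re module as a stand-in for the engine Atom uses, which is an assumption, and the helper name matches is hypothetical. The patterns are written as plain regexes; judging from the match string in the question, each backslash is doubled once more inside the .cson string, so the regex \\>+ would be written '\\\\>+' there.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# Character class: one pattern covers both >= and <=.
print(matches(r"[<>]=", ">="), matches(r"[<>]=", "<="))

# Repetition: a backslash followed by one or more '>'.
print(matches(r"\\>+", "\\>"), matches(r"\\>+", "\\>>"))

# Optional character: the second '=' may be absent.
print(matches(r"==?", "="), matches(r"==?", "=="))

# [\\] and \\ both match exactly one backslash.
print(matches(r"[\\]", "\\"), matches(r"\\", "\\"))
```

Note that \\>+ is more permissive than the two operators it is meant to cover: it would also accept a backslash followed by three or more > characters.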

You can merge a LOT of your alternatives by combining these three tricks… I'll leave that exercise to you!
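For readers who want a starting point, here is one possible grouping. It is a sketch, not the answer author's own pattern. It is tested with Python's re module, on the assumption that these constructs behave the same in the engine Atom uses. The names OPERATOR and LISTED are hypothetical, and re.VERBOSE is used only so the pattern can be laid out and commented; in the .cson file the pattern would presumably sit on one line, without the spaces and with each backslash doubled.

```python
import re

# One possible grouping (an illustration, not a definitive ooRexx grammar).
OPERATOR = re.compile(r"""
    [\\¬]? (?: >>=? | <<=? | >< | <> | [<>]=? | ==? )  # comparisons, optional negation prefix
  | //? | &&? | \|\|? | \*\*?                          # single or doubled operators
  | [+\-%\\¬]                                          # remaining single characters
""", re.VERBOSE)

# The operators listed in the question.
LISTED = [
    ">=", "<=", "\\>", "\\<", "\\=", "><", "<>", "==", "\\==", "//", "&&",
    "||", "**", "¬>", "¬<", "¬=", "¬==", ">>", "<<", ">>=", "\\<<", "¬<<",
    "\\>>", "¬>>", "<<=",
]

missing = [op for op in LISTED if OPERATOR.fullmatch(op) is None]
print("operators not matched:", missing)

# Operators are found without surrounding spaces as well.
print(OPERATOR.findall("a>=b"))
print(OPERATOR.findall("x \\== y && z"))
```

The longer alternatives come first so that, for example, >>= is taken as a whole instead of as >> followed by =. The pattern is deliberately loose: it also accepts combinations such as \>= that are not in the question's list, so tighten it if that matters for highlighting.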

One more hint for developing long regular expressions: use a tester like Regex101 or similar with a test file to see the effect of your changes in real time. Visualizers like Regexper will help you understand how your regular expression is parsed.
