How can I improve this regular expression?

Date: 2021-08-02 06:43:32

I have written this regex that works, but honestly, it’s like 75% guesswork.

The goal is this: I have lots of imports in Xcode, like so:

#import <UIKit/UIKit.h>
#import "NSString+MultilineFontSize.h"

and I only want to return the categories that contain +. There are also lots of lines of code throughout the source which include + in other contexts.

Right now, this returns all of the proper lines throughout the Xcode project. But if there is one thing I’ve learned from googling and searching Stack Overflow for regex tutorials, it is that there are LOTS of different ways to do things. I’d love to see all of the different ways you guys can come up with that make it either more efficient or more bulletproof regarding potential spoofs or misses.

^\#import+.[\"]*+.(?:(?!\+).)*+.*[\"]

Thanks in advance for all of your help.

Update

Also I suppose I’ll accept the answer of whoever does this with the shortest string, without missing any possible spoofs. But again, thanks to everyone who participates in this learning experience.

Resources from answers

This is an awesome resource for practicing regex from Dan Rasmussen: RegExr

2 Answers

#1


3  

The first thing I notice is that your + characters are misplaced: t+. matches t one or more times, followed by any single character (the .). I'm assuming you wanted to match the end of import, followed by one or more of any character: import.+
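
To see the difference concretely, here is a small sketch in Python. The choice of Python's re module is mine, purely for illustration; the question doesn't say which regex engine the search runs in, and the variable names are made up.

```python
import re

# "import+." : the + applies only to the preceding "t",
# and the "." then consumes exactly one more character.
misplaced = re.compile(r"import+.")

# "import.+" : the + applies to ".", i.e. one or more of any character.
intended = re.compile(r"import.+")

line = '#import "NSString+MultilineFontSize.h"'

# The misplaced version stops right after the space following "import".
print(repr(misplaced.search(line).group()))

# The intended version runs to the end of the line.
print(repr(intended.search(line).group()))

# The misplaced version also happily accepts repeated t's.
print(misplaced.fullmatch("importttt!") is not None)
```

So the original pattern only works because the later parts of it pick up the slack, not because import+. does what it looks like it does.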

Secondly, # doesn't need to be escaped.

Here's what I came up with: ^#import\s+(.*\+.*)$

\s+ matches one or more whitespace characters, so you're guaranteed that the line actually starts with #import followed by whitespace, and not #importbutnotreally or anything else.
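
A quick way to check this outside of Xcode is a short Python sketch like the one below. Python and the sample text are my assumptions here (only the first two lines come from the question), and re.MULTILINE is needed so that ^ and $ anchor at every line rather than only at the start and end of the whole string.

```python
import re

# Pattern from this answer; MULTILINE makes ^ and $ work per line.
IMPORT_WITH_PLUS = re.compile(r"^#import\s+(.*\+.*)$", re.MULTILINE)

# Hypothetical sample: two real-looking imports, one near miss,
# and one ordinary line of code that happens to contain a +.
source = """\
#import <UIKit/UIKit.h>
#import "NSString+MultilineFontSize.h"
#importbutnotreally "Fake+Header.h"
int total = a + b;
"""

matches = IMPORT_WITH_PLUS.findall(source)
print(matches)
```

Only the category import is captured. The #importbutnotreally line is rejected because \s+ requires whitespace right after #import, and the arithmetic line is rejected because it doesn't start with #import.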

I'm not familiar with Xcode syntax, but the next part of the expression, (.*\+.*), simply matches any string with a + character somewhere in it. This means invalid imports may be matched, but I'm working under the assumption that you're trying to match valid code. If not, this will need to be modified to validate the import syntax as well.
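
If you do want to validate the import syntax, one possible tightening is to require the + to sit inside a double-quoted header name. This is only a sketch under my own assumptions: that category headers are always imported with double quotes (as in the question's sample), that #import starts at column 0, and that Python's re module is an acceptable stand-in for whatever engine you search with. The names STRICT_CATEGORY_IMPORT and category_imports are made up for the example.

```python
import re

# Assumption: category headers are imported as "Name+Category.h" in double quotes.
# [ \t]+ is used instead of \s+ so the match cannot spill onto the next line,
# and [^"\n]* keeps the + inside the quoted header name.
STRICT_CATEGORY_IMPORT = re.compile(
    r'^#import[ \t]+"([^"\n]*\+[^"\n]*)"',
    re.MULTILINE,
)

def category_imports(source):
    """Return the quoted header names that contain a +."""
    return STRICT_CATEGORY_IMPORT.findall(source)

# Hypothetical test input, including two "spoofs" the looser pattern accepts.
sample = "\n".join([
    '#import <UIKit/UIKit.h>',
    '#import "NSString+MultilineFontSize.h"',
    '#import "Plain.h" // a + b',
    '#import NSString+Broken.h',
])

print(category_imports(sample))
```

The line with a + in a trailing comment and the unquoted import are both matched by (.*\+.*) but rejected here. The trade-off is that an angle-bracket import containing a + would be missed, so adjust the quoting rule if your project has those.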


P.S. To test your expression, try RegExr. You can hover over characters to check what they do.

#2


0  

sed -n 's:^#import \(.*[+].*\):\1:p' FILE

will display

"NSString+MultilineFontSize.h"

for your sample.
