I have a 141-character regular expression in my Rails application, and RuboCop doesn't like it.
My regular expression:
URL_REGEX = /\A(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-\w.]+)\z/
This pattern matches URLs with a single path segment, e.g. http(s)://example.com/path
- Can you safely split a regular expression in Ruby? What is the general mechanism for splitting a regular expression in Ruby?
- How do you tell RuboCop to take it easy on regular expressions?
Thanks a lot!
4 Solutions
#1
3
You should try something like this:
regexp = %r{\A(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+
  ([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-\w.]+)\z}x
if 'http://example.com/path' =~ regexp
  puts 'matches'
end
The trailing "x" flag (extended mode) makes Ruby ignore whitespace and comments inside the pattern, which is what allows it to span several lines.
See the last example in the regular expressions section of the Ruby style guide: https://github.com/github/rubocop-github/blob/master/STYLEGUIDE.md#regular-expressions
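As a minimal sketch of how far extended mode can be taken, the pattern can be laid out with one logical part per line and a comment beside each. This is an illustration rather than the answerer's code: the four scheme alternatives are folded into `https?://(www\.)?`, which accepts the same prefixes, and the comments avoid braces and `#{` because those would still be interpreted inside a `%r{}` literal.

```ruby
# Sketch: the question's pattern in extended (x) mode, one part per line.
URL_REGEX = %r{
  \A
  (https?://(www\.)?)?        # optional scheme, optionally followed by www.
  [a-z0-9]+([-.][a-z0-9]+)*   # host labels joined by a single hyphen or dot
  \.[a-z]{2,5}                # top-level domain
  (:[0-9]{1,5})?              # optional port
  (/[-\w.]+)                  # exactly one path segment
  \z
}x

puts URL_REGEX.match?('http://example.com/path')
puts URL_REGEX.match?('http://example.com/a/b')
```

Every line stays well under the usual line-length limit, so no cop needs to be disabled.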
#2
2
How do you tell Rubocop to take it easy on regular expressions?
The cop that is complaining about this is most likely Metrics/LineLength (renamed Layout/LineLength in newer RuboCop releases). There is no configuration option that exempts regular expressions specifically, but you can disable the cop inline if you are okay with the regexp being that long:
# rubocop:disable Metrics/LineLength
URL_REGEX = /\A(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-\w.]+)\z/
# rubocop:enable Metrics/LineLength
It is also possible to put a single trailing rubocop:disable comment at the end of the line, but since the line is already very long, the comment could easily be missed, so the disable/enable pair might be the better choice here.
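For comparison, the trailing form might look like the sketch below. It assumes the cop name used in this answer; on newer RuboCop versions the name to disable would be Layout/LineLength instead.

```ruby
# Trailing form: the directive applies only to the line it sits on,
# so no matching rubocop:enable comment is needed afterwards.
URL_REGEX = /\A(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-\w.]+)\z/ # rubocop:disable Metrics/LineLength

puts URL_REGEX.match?('http://example.com/path')
```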
#3
2
Yes. You can build the regex from smaller parts and interpolate them into the final pattern.
prefixes = %w[http://www. https://www. http:// https://]
prefix = Regexp.union(prefixes) # Regexp.union escapes plain strings itself
letters = /[a-z\d]+/            # a Regexp, so \d is not lost as it would be in a double-quoted string
URL_REGEX = /\A(#{prefix})?#{letters}([-.]#{letters})*\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-.\w]+)\z/
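A few details of this technique are easy to get wrong, so here is a small sketch of how the building blocks behave. The variable names and sample strings are made up for illustration; only `Regexp.union`, `Regexp#source`, and regexp interpolation are standard Ruby.

```ruby
# Strings given to Regexp.union are escaped automatically, so "." is literal.
literal = Regexp.union('www.', 'ftp.')

# A Regexp interpolated into another keeps its own flags in a (?flags:...) group.
word = /[a-z]+/i
combined = /\A#{literal}#{word}\z/

# Pitfall: in a double-quoted string "\d" is just "d".
# Use single quotes or a Regexp literal for the fragment instead.
broken = "[a-z\d]+"
fixed  = '[a-z\d]+'

puts literal.source
puts combined.match?('www.EXAMPLE')
puts combined.match?('wwwxexample')
puts broken == '[a-zd]+'
puts fixed
```

Because `Regexp.union` already escapes its string arguments, passing them through `Regexp.escape` first would escape them twice and make the pattern look for literal backslashes.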
#4
2
Another option would be to use a more concise regex. There are several places where you are repeating patterns when you don't need to.
/\A(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-\w.]+)\z/
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   (https?:\/\/(www\.)?)?
With that and a few more alterations, I got your regex down to:
/\A(https?:\/\/(www\.)?)?[-a-z0-9.]+\.[a-z]{2,5}(:[0-9]{1,5})?(\/[-\w.]+)\z/
It's not exactly equivalent (for example, the host part now also accepts consecutive or leading dots and hyphens), but here's my test.
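The answerer's own test is not reproduced here. As a rough sketch of how the two patterns could be compared, the snippet below runs both against a handful of made-up sample strings. The concise pattern is written with `\A`/`\z` anchors and an escaped dot after `www`; the constant names are hypothetical.

```ruby
# The original pattern is wrapped in extended mode only to keep the lines short.
ORIGINAL = %r{
  \A(http://www\.|https://www\.|http://|https://)?
  [a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(/[-\w.]+)\z
}x
CONCISE = %r{\A(https?://(www\.)?)?[-a-z0-9.]+\.[a-z]{2,5}(:[0-9]{1,5})?(/[-\w.]+)\z}

samples = [
  'http://example.com/path',          # plain URL with one path segment
  'https://www.example.com:3000/a.b', # www prefix and port
  'http://example..com/path',         # consecutive dots in the host
  '-example.com/path',                # host starting with a hyphen
  'example.com'                       # no path segment
]

samples.each do |url|
  puts format('%-34s original=%-5s concise=%s',
              url, ORIGINAL.match?(url), CONCISE.match?(url))
end
```

The third and fourth samples are the kind of input where the two patterns disagree, since the concise host class allows any mix of dots and hyphens.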