I have a string input field in a form, and I get its value in the params hash. How should I remove all characters except letters and numbers from that string?
3 Answers
#1
52
Just to remind people of good ol' tr:
asdf.tr('^A-Za-z0-9', '')
This takes the complement of the character ranges (the leading '^' negates the set) and translates every matching character to '', which deletes it.
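As a minimal sketch of the tr call above applied to a form value: the string below is a made-up stand-in for whatever arrives in the params hash, and the variable names are only illustrative.

```ruby
# Hypothetical form value, standing in for something like params[:name].
input = "Hello, World! (2024)_v2"

# A leading '^' complements the set; an empty second argument deletes the matches.
cleaned = input.tr('^A-Za-z0-9', '')

puts cleaned  # prints HelloWorld2024v2
```

Note that tr returns a new string and leaves input unchanged.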
I was curious whether using a \W character class was faster than ranges, and how gsub compares with tr:
require 'benchmark'
asdf = [('A'..'z').to_a, ('0'..'9').to_a].join
puts asdf
puts asdf.tr( '^A-Za-z0-9', '' )
puts asdf.gsub( /[\W_]+/, '' )
puts asdf.gsub( /\W+/, '' )
puts asdf.gsub( /\W/, '' )
puts asdf.gsub( /[^A-Za-z0-9]+/, '' )
puts asdf.scan(/[a-z\d]/i).join
n = 100_000
Benchmark.bm(7) do |x|
  x.report("tr:")    { n.times { asdf.tr('^A-Za-z0-9', '') } }
  x.report("gsub1:") { n.times { asdf.gsub(/[\W_]+/, '') } }
  x.report("gsub2:") { n.times { asdf.gsub(/\W+/, '') } }
  x.report("gsub3:") { n.times { asdf.gsub(/\W/, '') } }
  x.report("gsub4:") { n.times { asdf.gsub(/[^A-Za-z0-9]+/, '') } }
  x.report("scan:")  { n.times { asdf.scan(/[a-z\d]/i).join } }
end
>> ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>>             user     system      total        real
>> tr:     0.560000   0.000000   0.560000 (  0.557883)
>> gsub1:  0.510000   0.000000   0.510000 (  0.513244)
>> gsub2:  0.820000   0.000000   0.820000 (  0.823816)
>> gsub3:  0.960000   0.000000   0.960000 (  0.955848)
>> gsub4:  0.900000   0.000000   0.900000 (  0.902166)
>> scan:   5.630000   0.010000   5.640000 (  5.630990)
You can see that a couple of the patterns (\W+ and \W) don't catch the '_', which is part of \w, and as a result don't meet the OP's request.
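To make the underscore point concrete, here is a small sketch comparing the two character classes on a made-up string; the sample value and variable names are assumptions for illustration only.

```ruby
# Hypothetical sample containing an underscore, a space and a hyphen.
sample = "snake_case name-42"

# "_" counts as a word character, so \W leaves it in place.
keeps_underscore = sample.gsub(/\W/, '')

# Adding "_" to the class removes it along with the other non-word characters.
strips_underscore = sample.gsub(/[\W_]/, '')

puts keeps_underscore   # prints snake_casename42
puts strips_underscore  # prints snakecasename42
```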
#2
21
Without a regular expression:
garbage = 'ab_c<>?AB C!@#123'
puts garbage.delete("^a-zA-Z0-9") #=> abcABC123
Here the leading '^' negates everything after it, so delete removes every character that is not in the listed ranges.
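One way to wrap the delete approach for use on a params value is sketched below. The helper name alphanumeric_only is hypothetical (it is not part of Ruby or Rails), and the to_s call is an added assumption so that a missing field yields an empty string instead of raising.

```ruby
# Hypothetical helper; the name is illustrative, not a Ruby or Rails API.
def alphanumeric_only(value)
  # to_s turns nil (a missing form field) into "" before deleting.
  value.to_s.delete('^a-zA-Z0-9')
end

from_answer = alphanumeric_only('ab_c<>?AB C!@#123')  # "abcABC123"
from_nil    = alphanumeric_only(nil)                  # ""

# The caret only negates when it is the first character of the set;
# elsewhere it is an ordinary character to delete.
literal_caret = 'a^b'.delete('b^')                    # "a"
```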
#3
7
'^/how/now#(Brown) Cow'.gsub(/\W/, '') # or /[\W_]/
#=> "hownowBrownCow"
...updated based on the comments...