I'm working in Ruby and I want to split a string and its punctuation into an array, but I want to consider apostrophes and hyphens as parts of words. For example,
s = "here...is a happy-go-lucky string that I'm writing"
should become
["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"].
The closest I've gotten is still inadequate because it doesn't properly consider hyphens and apostrophes as part of the word.
This is the closest I've gotten so far:
s.scan(/\w+|\W+/).select {|x| x.match(/\S/)}
which yields
["here", "...", "is", "a", "happy", "-", "go", "-", "lucky", "string", "that", "I", "'", "m", "writing"]
4 Answers
#1
7
You can try the following:
s.scan(/[\w'-]+|[[:punct:]]+/)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]
#2
2
You were close:
s.scan(/[\w'-]+|[.,!?]+/)
The idea is that we match either words, possibly with ' or - in them, or runs of punctuation characters.
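As a minimal sketch of that idea, here is the pattern applied to the string from the question. The variable name tokens is only illustrative. Note two assumptions baked into this pattern: punctuation outside the listed set (for example ;) is silently skipped, and a standalone - or ' is returned as a "word" because both characters are in the first character class.

```ruby
# Sample string taken from the question.
s = "here...is a happy-go-lucky string that I'm writing"

# First alternative: runs of word characters, apostrophes and hyphens.
# Second alternative: runs of the punctuation characters . , ! ?
# Whitespace matches neither alternative, so scan simply skips it.
tokens = s.scan(/[\w'-]+|[.,!?]+/)

p tokens
```

If other punctuation needs to be kept, the second character class would have to list it, or a broader class such as [[:punct:]] could be used instead.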
#3
1
After nearly giving up and then tinkering some more, I appear to have solved the puzzle. This seems to work:
s.scan(/[\w'-]+|\W+/).select {|x| x.match(/\S/)}
It yields
["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]
Is there an even cleaner way to do it, though, without having to use #select?
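One possible way to drop the #select step, offered as a sketch rather than a definitive answer: the #select is only needed because \W+ also matches whitespace. If the second alternative is narrowed to characters that are neither word characters nor whitespace, whitespace is never captured in the first place. The variable name tokens is illustrative.

```ruby
# Sample string taken from the question.
s = "here...is a happy-go-lucky string that I'm writing"

# [\w'-]+   : words, keeping apostrophes and hyphens
# [^\w\s]+  : runs of anything that is neither a word character nor whitespace
# Whitespace matches neither alternative, so no filtering step is required.
tokens = s.scan(/[\w'-]+|[^\w\s]+/)

p tokens
```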
#4
0
Use the split method.
Example:
str = "word, anotherWord, foo"
puts str.split(",")
It returns
word
anotherWord
foo
Hope it works for you!
You can also check this: http://ruby.about.com/od/advancedruby/a/split.htm