Splitting a string that contains multiple delimiters in Ruby

Date: 2021-09-08 21:36:11

For instance, I have a string like this:

options = "Cake or pie, ice cream, or pudding"

I want to be able to split the string on three delimiters: "or", ",", and ", or".

I have been able to do it, but only by splitting on "," and ", or" first, then splitting each array item at "or", and flattening the resulting array afterwards:

options = options.split(/(?:\s?or\s)*([^,]+)(?:,\s*)*/).reject(&:empty?);
options.each_index {|index| options[index] = options[index].sub("?","").split(" or "); }

The resulting array is: ["Cake", "pie", "ice cream", "pudding"]

Is there a more efficient (or easier) way to split my string on those three delimiters?

3 Answers

#1


15  

What about the following:

options.gsub(/ or /i, ",").split(",").map(&:strip).reject(&:empty?)
  • replaces every " or " delimiter with a comma, so that "," is the only delimiter left
  • splits the string at ","
  • strips each item, since pieces such as " ice cream" may be left with a leading space
  • removes all blank strings
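
The four steps above can be traced one at a time. This is only a sketch of the same one-liner broken apart; the intermediate names step1, step2, step3 and result are introduced here for illustration and are not part of the answer.

```ruby
options = "Cake or pie, ice cream, or pudding"

# Step 1: turn every " or " into a comma so only one delimiter remains.
step1 = options.gsub(/ or /i, ",")
# Step 2: split on the comma.
step2 = step1.split(",")
# Step 3: strip surrounding whitespace from each piece.
step3 = step2.map(&:strip)
# Step 4: drop the empty string left between the two adjacent commas.
result = step3.reject(&:empty?)

p step1  # "Cake,pie, ice cream,,pudding"
p step2  # ["Cake", "pie", " ice cream", "", "pudding"]
p result # ["Cake", "pie", "ice cream", "pudding"]
```

The empty string in step2 comes from ", or " collapsing into two adjacent commas, which is why the final reject is needed.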

#2


9  

First of all, your method could be simplified a bit with Array#flatten:

>> options.split(',').map{|x|x.split 'or'}.flatten.map(&:strip).reject(&:empty?)
=> ["Cake", "pie", "ice cream", "pudding"]
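
A possible variation on the same idea, offered as a sketch rather than as the answer's own code: Enumerable#flat_map does the map and the flatten in one call. Splitting on " or " with surrounding spaces instead of on the bare 'or' is an assumption added here, so that words which merely contain the letters "or" stay intact; the "Corn or pork" input is hypothetical.

```ruby
options = "Cake or pie, ice cream, or pudding"

# flat_map splits each comma-separated piece and flattens the result in one pass.
# Using " or " (with spaces) as the inner delimiter is an assumption of this sketch.
items = options.split(",")
               .flat_map { |piece| piece.split(" or ") }
               .map(&:strip)
               .reject(&:empty?)
p items # ["Cake", "pie", "ice cream", "pudding"]

# Hypothetical input showing how the two inner delimiters differ.
bare   = "Corn or pork".split("or")
padded = "Corn or pork".split(" or ")
p bare   # ["C", "n ", " p", "k"]
p padded # ["Corn", "pork"]
```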

I would prefer using a single regex:

>> options.split /\s*, or\s+|\s*,\s*|\s+or\s+/
=> ["Cake", "pie", "ice cream", "pudding"]

You can use | in a regex to give alternatives, and putting ", or" first guarantees that it is matched as a single delimiter, so it won't leave behind an empty or malformed item. Matching the surrounding whitespace in the regex itself is probably best for efficiency, since you don't have to scan the array again to strip it.
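
To see why the order of the alternatives matters, here is a small comparison. The names comma_or_first and comma_first are made up for this sketch; the second pattern is the same regex with the plain-comma alternative moved to the front.

```ruby
options = "Cake or pie, ice cream, or pudding"

# Alternatives are tried left to right, so ", or" gets the first chance to match.
comma_or_first = /\s*, or\s+|\s*,\s*|\s+or\s+/
# Same alternatives, but the plain comma is tried before ", or".
comma_first    = /\s*,\s*|\s*, or\s+|\s+or\s+/

good = options.split(comma_or_first)
bad  = options.split(comma_first)

p good # ["Cake", "pie", "ice cream", "pudding"]
p bad  # ["Cake", "pie", "ice cream", "or pudding"]
```

With the comma first, ", " is consumed on its own and the remaining "or pudding" no longer has the leading whitespace that \s+or\s+ requires, so the "or" stays attached to the last item.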

As Zabba points out, you may still want to reject empty items, prompting this solution:

>> options.split(/,|\sor\s/).map(&:strip).reject(&:empty?)
=> ["Cake", "pie", "ice cream", "pudding"]
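
A short sketch of where empty items can come from. The doubled-comma string messy is a hypothetical input, not one from the question; the regexes are the two shown above.

```ruby
options = "Cake or pie, ice cream, or pudding"

# The simpler pattern leaves whitespace and an empty item behind...
raw = options.split(/,|\sor\s/)
p raw # ["Cake", "pie", " ice cream", "", "pudding"]

# ...which strip and reject then clean up.
cleaned = raw.map(&:strip).reject(&:empty?)
p cleaned # ["Cake", "pie", "ice cream", "pudding"]

# Hypothetical input with a doubled comma: the single-regex version
# keeps an empty item, while the strip-and-reject version does not.
messy = "Cake,, pie"
single_regex = messy.split(/\s*, or\s+|\s*,\s*|\s+or\s+/)
with_reject  = messy.split(/,|\sor\s/).map(&:strip).reject(&:empty?)
p single_regex # ["Cake", "", "pie"]
p with_reject  # ["Cake", "pie"]
```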

#3


3  

Since "or" and "," do the same thing, the best approach is to tell the regex that a run of consecutive delimiters should be treated the same as a single delimiter:

options = "Cake or pie, ice cream, or pudding"
regex = /(?:\s*(?:,|or)\s*)+/
options.split(regex)
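
One caveat worth checking: because the whitespace around "or" is optional in this pattern, it also matches the letters "or" inside a word. The sketch below shows the result for the question's string and then a variant that adds \b word boundaries. The bounded pattern and the "doughnut or torte" input are assumptions added for illustration, not part of the answer.

```ruby
options = "Cake or pie, ice cream, or pudding"

loose   = /(?:\s*(?:,|or)\s*)+/      # pattern from the answer
bounded = /(?:\s*(?:,|\bor\b)\s*)+/  # assumed variant: "or" must be a whole word

from_loose   = options.split(loose)
from_bounded = options.split(bounded)
p from_loose   # ["Cake", "pie", "ice cream", "pudding"]
p from_bounded # ["Cake", "pie", "ice cream", "pudding"]

# Hypothetical input where a word contains "or".
tricky = "doughnut or torte"
tricky_loose   = tricky.split(loose)
tricky_bounded = tricky.split(bounded)
p tricky_loose   # ["doughnut", "t", "te"]
p tricky_bounded # ["doughnut", "torte"]
```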
