Parsing a large string value in Ruby

Time: 2021-11-15 02:07:20

Example of the string I am working with:

s = "{new {value1 value2 value3}} {old {value2 value1 value1}} {{old school} {value2 value3 value1}}"

Braces are added only around items that contain a space, which is why "old school" is wrapped in {} while "new" and "old" are not.

Parsing the first two entries (new and old) is easy with s.split: s.split[0] gives the name "new" and s.split[1..3] gives its values, once the braces are stripped. The problem comes when a name such as "new" or "old" contains a space, in this case "old school". In the database I am accessing, these names with spaces occur randomly.
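
To make the failure concrete, here is a small sketch of what whitespace splitting yields for the sample string. The variable names are only illustrative, and the indexes are the ones String#split produces for this exact input.

```ruby
s = "{new {value1 value2 value3}} {old {value2 value1 value1}} {{old school} {value2 value3 value1}}"

tokens = s.split  # splits on whitespace only; the braces stay attached to the tokens

first_name   = tokens[0].delete("{}")                   # "new"
first_values = tokens[1..3].map { |t| t.delete("{}") }  # ["value1", "value2", "value3"]

# The third name is spread over two tokens, so every fixed offset
# after it is shifted by one and the simple indexing breaks.
third_name_tokens = tokens[8..9]                        # ["{{old", "school}"]
```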

How can I alter my parsing to account for these occurrences?

2 solutions

#1


You can do it with this one line:

s.split("}} {").map { |x| x.split(" {").map { |y| y.delete("{}") } }

It is kind of ugly, but it works with your example and returns:

[["new", "value1 value2 value3"], ["old", "value2 value1 value1"], ["old school", "value2 value3 value1"]]

You can then parse it further, for example by breaking the values into their own objects. If you want a hash, you can build one like this:

s.split("}} {").map { |x| x.split(" {").map { |y| y.delete("{}") } }.to_h

This will return:

{"new"=>"value1 value2 value3", "old"=>"value2 value1 value1", "old school"=>"value2 value3 value1"} 
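
As one possible way of "breaking values into their own objects", the value strings can be split into arrays. This is only a sketch: the names pairs and by_name are made up for illustration, and it assumes Ruby 2.4 or newer for Hash#transform_values. Like the one-liner above, it relies on entries being separated by exactly "}} {", and duplicate names would overwrite each other in the hash.

```ruby
s = "{new {value1 value2 value3}} {old {value2 value1 value1}} {{old school} {value2 value3 value1}}"

# Same splitting idea as above: one [name, "v1 v2 v3"] pair per entry.
pairs = s.split("}} {").map { |entry| entry.split(" {").map { |part| part.delete("{}") } }

# Turn the pairs into a hash and split each value string into an array.
by_name = pairs.to_h.transform_values(&:split)

by_name["old school"]  # ["value2", "value3", "value1"]
```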

#2


You don't want to parse this with regular expressions. Instead, go through the string character by character and keep track of your position in the bracket hierarchy.

Here is a solution of mine: http://pastebin.com/kLLnS5qB

(That's only a rough cut; some calls aren't really DRY, and it lacks testing.)

$ ruby foo.rb 
[#<struct key="new", values=["value1", "value2", "value3"]>, #<struct key="old", values=["value2", "value1", "value1"]>, #<struct key="old school", values=["value2", "value3", "value1"]>]
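
For readers who cannot open the pastebin link, the character-by-character idea could look roughly like the sketch below. It is not the code behind that link: parse_braced and Entry are hypothetical names, and it assumes that items are separated by single spaces, that braces are balanced, and that names contain no braces of their own. Runs of spaces inside a braced name would be collapsed to one space.

```ruby
# Hypothetical helper: turn a brace-delimited list into nested arrays.
# A bare word becomes a String; a {...} group becomes an Array.
def parse_braced(str)
  stack = [[]]  # stack.last is the list currently being filled
  word = +""    # characters of the bare word being read

  flush = lambda do
    stack.last << word unless word.empty?
    word = +""
  end

  str.each_char do |ch|
    case ch
    when "{"
      flush.call
      stack.push([])              # one level deeper in the hierarchy
    when "}"
      flush.call
      raise ArgumentError, "unbalanced '}'" if stack.size == 1
      finished = stack.pop
      stack.last << finished      # attach the closed group to its parent
    when " "
      flush.call
    else
      word << ch
    end
  end

  flush.call
  raise ArgumentError, "unbalanced '{'" unless stack.size == 1
  stack.first
end

Entry = Struct.new(:key, :values)

s = "{new {value1 value2 value3}} {old {value2 value1 value1}} {{old school} {value2 value3 value1}}"

# A name is either a bare word ("new") or a group (["old", "school"]).
entries = parse_braced(s).map do |key, values|
  Entry.new(Array(key).join(" "), values)
end

entries.map(&:key)  # ["new", "old", "old school"]
```

Tracking depth with an explicit stack means a name with spaces needs no special case: it is simply a nested group that gets joined back into one string afterwards.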
