I'm building a script to read and parse markdown files in Ruby. The script needs to be able to read and understand the multimarkdown header information at the top of the files so that it can perform additional actions on the output.
The header values look like this:
Title: My Treatise on Kumquats
Author: Joe Schmoe
Author URL: http://somedudeswebsite.me/
Host URL: http://googlesnewthing.com/
Created: 2012-01-01 09:41
I can't figure out how to split the lines of text into a simple key-value dictionary. The built-in split function doesn't seem to work in this case because I only want it to split on the first occurrence of a colon (:) in each line. Any additional colons should stay part of the value string.
In case it matters, I'm using Ruby 1.8.7 on OS X.
5 Answers
#1
5
Use split with its optional second parameter, which limits the number of pieces returned (thanks to @MichaelKohl):
s = 'Author URL: http://somedudeswebsite.me/'
key, value = s.split ': ', 2
puts key
puts value
Output:
Author URL
http://somedudeswebsite.me/
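To see why the limit matters, here is a minimal sketch that runs split with and without the second argument on the "Created" line from the question. The variable names are only illustrative.

```ruby
line = 'Created: 2012-01-01 09:41'

# Without a limit, every colon is a split point, so the time is cut in two.
unlimited = line.split(':')

# With a limit of 2, splitting stops after the first separator and the
# rest of the line comes back untouched as the second element.
key, value = line.split(': ', 2)

p unlimited
p [key, value]
```

The first call returns three pieces, while the second returns exactly two, with the time left intact in the value.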
#2
7
This does it:
s = <<EOS
Title: My Treatise on Kumquats
Author: Joe Schmoe
Author URL: http://somedudeswebsite.me/
Host URL: http://googlesnewthing.com/
Created: 2012-01-01 09:41
EOS
h = Hash[s.each_line.map { |l| l.chomp.split(': ', 2) }]
p h
Output:
{"Title"=>"My Treatise on Kumquats", "Author"=>"Joe Schmoe", "Author URL"=>"http://somedudeswebsite.me/", "Host URL"=>"http://googlesnewthing.com/", "Created"=>"2012-01-01 09:41"}
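If the header sits at the top of a longer file, something like the following might be a starting point. It is only a sketch: parse_header is a hypothetical helper, and it assumes the header ends at the first blank line and that lines without ": " can be skipped.

```ruby
# Hypothetical helper: reads "Key: Value" lines until the first blank line
# and returns them as a Hash. Lines without ': ' are skipped.
def parse_header(text)
  header = {}
  text.each_line do |line|
    line = line.chomp
    break if line.strip.empty?      # assumption: a blank line ends the header
    key, value = line.split(': ', 2)
    next if value.nil?              # no separator on this line
    header[key.strip] = value.strip
  end
  header
end

doc = "Title: My Treatise on Kumquats\nCreated: 2012-01-01 09:41\n\nBody text: not a header\n"
p parse_header(doc)
```

Stopping at the blank line keeps body text that happens to contain ": " out of the hash.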
#3
1
You can use regex to parse your text:
str = "Title: My Treatise on Kumquats
Author: Joe Schmoe
Author URL: http://somedudeswebsite.me/
Host URL: http://googlesnewthing.com/
Created: 2012-01-01 09:41"
matches = str.scan /^(.+?): (.+?)$/m
matches.each { |m|
  key = m[0]
  value = m[1]
}
In Ruby, ^ and $ always match at the start and end of each line, so scan returns one match per line, each with two groups (indexes 0 and 1). The first group contains all characters before the first occurrence of ": " (colon + space). The second group contains the rest of the line, up to the end of line ($). The /m flag only makes . match newlines as well; it is not needed here and can be dropped.
This is how you can convert result into Hash:
dictionary = matches.inject({}) do |dict, m|
  dict[m[0]] = m[1]
  dict
end
UPDATE
As Michael Kohl mentioned, it is possible to write this in one line:
hash = Hash[str.scan /^(.+?): (.+?)$/m]
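One thing to watch with the /m flag: in Ruby it makes . match newlines, while ^ and $ match at line boundaries with or without it. The sketch below uses a made-up input whose first line has no ": " separator to show the difference.

```ruby
# Hypothetical input: the first line has no ": " separator.
text = "Broken line\nTitle: My Treatise on Kumquats"

# With /m, "." also matches "\n", so the lazy key group runs across the
# line break until it finds a ": ".
with_m = text.scan(/^(.+?): (.+?)$/m)

# Without /m, "." stops at the end of the line, so the broken line is skipped.
without_m = text.scan(/^(.+?): (.+?)$/)

p with_m
p without_m
```

With /m the key swallows the broken line together with "Title"; without it only the well-formed line is returned.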
#4
0
You can also do this by finding the index of the first colon and slicing the string around it:
>> s = 'Author URL: http://somedudeswebsite.me/'
>> first_idx = s.index(':')
>> key,value = s[0..first_idx-1],s[first_idx+1..s.length]
=> ["Author URL", " http://somedudeswebsite.me/"]
or build a key-value hash with (the splat is unnecessary and, when it is not the last argument, is a syntax error on Ruby 1.8):
>> kv = Hash[s[0..first_idx-1], s[first_idx+1..s.length]]
=> {"Author URL"=>" http://somedudeswebsite.me/"}
Hope this helps
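For what it's worth, the index approach could be wrapped up roughly like this. split_on_first_colon is a hypothetical helper name; the sketch assumes you want surrounding whitespace stripped and nil returned for lines that have no colon.

```ruby
# Hypothetical helper: splits a line on its first colon by index.
# Returns nil when the line contains no colon.
def split_on_first_colon(line)
  idx = line.index(':')
  return nil if idx.nil?
  key = line[0...idx]             # everything before the colon
  value = line[(idx + 1)..-1]     # everything after the colon
  [key.strip, value.strip]
end

p split_on_first_colon('Author URL: http://somedudeswebsite.me/')
p split_on_first_colon('no separator here')
```

The strip calls remove the leading space that the plain slice leaves on the value.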
#5
0
Is line.split(':', 2) what you want?
String#split accepts a second argument that limits the number of parts the string is split into. It works in Ruby 1.9.3; I'm not sure about earlier versions (but I'm almost sure it also works in 1.9.2).
If this is not available, line.scan(%r{^([^:]*):(.*)}) should also work.
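Here is a small sketch comparing the two calls on one of the header lines from the question. Note that scan returns one inner array per match, and that both results keep the space after the colon unless you strip it.

```ruby
line = 'Host URL: http://googlesnewthing.com/'

# split with a limit returns a flat two-element array.
from_split = line.split(':', 2)

# scan returns one inner array per match, so take the first match.
from_scan = line.scan(%r{^([^:]*):(.*)}).first

# Both keep the space that follows the colon; strip removes it.
key, value = from_split.map { |part| part.strip }

p from_split
p from_scan
p [key, value]
```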