I have this HTML:
<div class="date">
  <h3 class="date-title">Today</h3>
  <div class="film">
    <img class="poster" src="film1" />
    <h4 class="title">Film 1</h4>
    <ul class="session-times">
      <li>
        <a href="#">
          <time>12:00 PM</time>
        </a>
      </li>
    </ul>
  </div><!-- /.film -->
  <div class="film">
    <img class="poster" src="film2" />
    <h4 class="title">Film 2</h4>
    <ul class="session-times">
      <li>
        <a href="#">
          <time>3:00 PM</time>
        </a>
      </li>
    </ul>
  </div><!-- /.film -->
  <div class="film">
    <img class="poster" src="film3" />
    <h4 class="title">Film 3</h4>
    <ul class="session-times">
      <li>
        <a href="#">
          <time>6:00 PM</time>
        </a>
      </li>
    </ul>
  </div><!-- /.film -->
</div><!-- /.date -->
<div class="date">
  <h3 class="date-title">Tomorrow</h3>
  <div class="film">
    <img class="poster" src="film1" />
    <h4 class="title">Film 1</h4>
    <ul class="session-times">
      <li>
        <a href="#">
          <time>2:00 PM</time>
        </a>
      </li>
    </ul>
  </div><!-- /.film -->
  <div class="film">
    <img class="poster" src="film2" />
    <h4 class="title">Film 2</h4>
    <ul class="session-times">
      <li>
        <a href="#">
          <time>5:00 PM</time>
        </a>
      </li>
    </ul>
  </div><!-- /.film -->
  <div class="film">
    <img class="poster" src="film3" />
    <h4 class="title">Film 3</h4>
    <ul class="session-times">
      <li>
        <a href="#">
          <time>8:00 PM</time>
        </a>
      </li>
    </ul>
  </div><!-- /.film -->
</div><!-- /.date -->
and I'm extracting data using this Ruby code:
dates = []  # results are collected here
nokogiri_object.css('.date').each do |d|
  date = d.css('.date-title').text
  dates.push(date: date)
  d.css('.film').each do |film|
    title = film.css('.title')
    title_en = title.text.strip
    time = film.css('.session-times/li/a/time').text
  end
end
This gives me:
[
  {
    "date": "Today"
  },
  {
    "date": "Tomorrow"
  }
]
but I'd like to loop over the films in each .date section (each one is a .film element) and include them under their date in the output, so it should look more like this:
[
  {
    "Today": {
      "films": [
        {
          "film": "Film1",
          "time": "12:00 PM"
        },
        {
          "film": "Film2",
          "time": "3:00 PM"
        },
        {
          "film": "Film3",
          "time": "6:00 PM"
        }
      ]
    }
  },
  {
    "Tomorrow": {
      "films": [
        {
          "film": "Film1",
          "time": "2:00 PM"
        },
        {
          "film": "Film2",
          "time": "5:00 PM"
        },
        {
          "film": "Film3",
          "time": "8:00 PM"
        }
      ]
    }
  }
]
I can't figure out where to build my array within the nested loop.
1 Solution
#1
The idea here is to first find the nodes with class date (css returns a NodeSet, which you can treat as an array of Nokogiri nodes), and then transform that collection with the map method into the structure you want. The result is an array (because that is what map returns) of hashes (because a hash is what the block of the outer map returns). To build the structure inside each hash I use the same idea: find the Nokogiri nodes with the css method and map each one to what you want.
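To see why nested map calls produce that shape, here is a minimal sketch on plain Ruby data, with no Nokogiri involved. The days array and its keys are hypothetical stand-ins for the parsed nodes, not part of the original page or code.

```ruby
# Hypothetical stand-in for the parsed page: each day has a title and films.
days = [
  { title: "Today",    films: [["Film 1", "12:00 PM"], ["Film 2", "3:00 PM"]] },
  { title: "Tomorrow", films: [["Film 1", "2:00 PM"]] }
]

# The outer map returns one hash per day;
# the inner map returns one hash per film of that day.
schedule = days.map do |day|
  {
    day[:title] => {
      "films" => day[:films].map { |name, time| { "film" => name, "time" => time } }
    }
  }
end

p schedule
```

Because each block's last expression is a hash, no accumulator array has to be declared anywhere: map collects the block results for you.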
date_nodes = nokogiri_object.css('.date')
date_nodes.map do |date|
  {
    date.css('.date-title').text => {
      "films" => date.css('.film').map do |film|
        {
          "film" => film.css('img.poster').attr('src').value,
          "time" => film.css('time').text
        }
      end
    }
  }
end
=> [{"Today"=>{
"films"=>[
{"film"=>"film1", "time"=>"12:00 PM"},
{"film"=>"film2", "time"=>"3:00 PM"},
{"film"=>"film3", "time"=>"6:00 PM"}]}},
{"Tomorrow"=>{
"films"=>[
{"film"=>"film1", "time"=>"2:00 PM"},
{"film"=>"film2", "time"=>"5:00 PM"},
{"film"=>"film3", "time"=>"8:00 PM"}]}}
]