Matching table row ids with a common prefix

This might be merely a syntax question.

I am unclear how to match only the table rows whose id begins with rowId_.

agent = Mechanize.new
pageC1 = agent.get("/customStrategyScreener!list.action")

The table has class=tableCellDT.

 pageC1.search('table.tableCellDT tr[@id=rowId_]')  # parses OK but returns 0 rows since rowId_ is not matched exactly.

 pageC1.search('table.tableCellDT tr[@id=rowId_*]')  # Throws an error since * is not treated like a wildcard string match

EXAMPLE HTML:

 <table id="row" cellpadding="5" class="tableCellDT" cellspacing="1">
<thead>
<tr>
<th class="tableHeaderDT">#</th>
<th class="tableHeaderDT sortable">
<a href="?d-16544-s=1&amp;d-16544-o=2&amp;d-16544-p=1">Screener</a></th>
<th class="tableHeaderDT sortable">
<a href="?d-16544-s=2&amp;d-16544-o=2&amp;d-16544-p=1">Strategy</a></th>
<th class="tableHeaderDT"> </th></tr></thead>
<tbody>
<tr id="rowId_BullPut" class="odd">
<td>   1  </td>
<td>   Bull</td>
<td></td>
<td><a href="link1?model.itemId=2262">Edit</a>&nbsp;&nbsp;
            <a href="javascript:deleteScreener('link2?model.itemId=2262');">Delete</a>&nbsp;&nbsp;
            <a href="link3?model.itemId=2262&amp;amp;model.source=list">View</a>&nbsp;&nbsp;
            </td></tr>

NOTE

pageC1 is a Mechanize::Page object, not a Nokogiri anything. Sorry that wasn't clear at first. Mechanize::Page doesn't have #css or #xpath methods, but a Nokogiri doc can be extracted from it (used internally anyway).

3 Answers

#1

To get the tr elements that have an id starting with "rowId_":

pageC1.search('//tr[starts-with(@id, "rowId_")]')

#2

You want either the CSS3 attribute starts-with selector:

pageC1.css('table.tableCellDT tr[id^="rowId_"]')

or the XPath starts-with() function:

pageC1.xpath('.//table[@class="tableCellDT"]//tr[starts-with(@id,"rowId_")]')

Although the Nokogiri Node#search method will intelligently pick between CSS and XPath selector syntax based on what you wrote, that does not mean you can mix CSS and XPath syntax in the same query.

In action:

>> require 'nokogiri'
#=> true

>> doc = Nokogiri.HTML <<ENDHTML; true #hide output from IRB
">> <table class="foo"><tr id="rowId_nonono"><td>Nope</td></tr></table>
">> <table class="tableCellDT">
">>   <tr id="rowId_yesyes"><td>Yes1</td></tr>
">>   <tr id="rowId_andme2"><td>Yes2</td></tr>
">>   <tr id="rowIdNONONO"><td>Needs underscore</td></tr>
">> </table>
">> ENDHTML
#=> true

>> doc.css('table.tableCellDT tr[id^="rowId_"]').map(&:text)
#=> ["Yes1", "Yes2"]

>> doc.xpath('.//table[@class="tableCellDT"]//tr[starts-with(@id,"rowId_")]').map(&:text)
#=> ["Yes1", "Yes2"]

#3

Thanks to http://nokogiri.org/Nokogiri/XML/Node.html#method-i-css and the answers above, here is the final code that solves my problem: it gets just the rows I need and then reads only certain information from each one:

pageC1.search('//tr[starts-with(@id, "rowId_")]').each do |row|
  # Read the string after _ in rowId_, part of the "id" in <tr>
  rid = row.attribute("id").text.split("_")[1] # => "BullPut"

  # Get the URL of the 3rd <a> link in <td> cell 4
  link = row.css("td[4] a[3]")[0].attributes["href"].text # => "link3?model.itemId=2262&amp;amp;model.source=list"
end
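A small note on the split("_")[1] step: it keeps only the piece between the first and second underscore, so it would cut short any id whose suffix itself contains an underscore. The ids I have don't, but as a sketch under that assumption (the id "rowId_Bull_Put" and the helper name row_suffix are made up for illustration), stripping the known prefix is a safer way to get the remainder:

```ruby
# Prefix taken from the question; row_suffix is a hypothetical helper name.
PREFIX = "rowId_"

# Return everything after the "rowId_" prefix, underscores included.
# String#delete_prefix needs Ruby 2.5 or newer.
def row_suffix(id)
  id.delete_prefix(PREFIX)
end

puts row_suffix("rowId_BullPut")         # BullPut
puts row_suffix("rowId_Bull_Put")        # Bull_Put
puts "rowId_Bull_Put".split("_")[1]      # Bull (the rest is lost)
```

On older Rubies, id.split("_", 2)[1] gives the same result, because the limit of 2 stops splitting after the first underscore.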
