Traversing an HTML table with LINQ to XML

Date: 2023-01-29 09:56:04

So, I can easily use LINQ to XML to traverse a properly set-up XML document. But I'm having some issues figuring out how to apply it to an HTML table. Here is the setup:

<table class='inner'
       width='100%'>
    <tr>
        <th>Area</th>
        <th>Date</th>
        <th>ID</th>
        <th>Name</th>
        <th>Email</th>
        <th>Zip Code</th>
        <th>Type</th>
        <th>Amount</th>
    </tr>
    <tr>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
    </tr>
    <tr>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
        <td>Data</td>
    </tr>
</table>

Essentially, there can be any number of rows, and I want to go through them row by row to check the data. Can anyone point me in the right direction? Should I be using tools other than LINQ for this?

EDIT: Sorry about the confusion. My issue is that the page I am trying to gather data from is HTML, not XML. The exact extension is ".aspx.htm". It doesn't seem to load properly, and even if it did, I'm not certain how to traverse the HTML page, given that there is another table before the one I'm trying to get data from.

For example, here is the XPath to the table I'm trying to get info from:

/html/body/form/div[3]/table/tbody/tr[5]/td/table

3 answers

#1


5  

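// Requires: using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
// The (string) cast yields null instead of throwing when a table has no class attribute.
XElement myTable = xdoc.Descendants("table")
    .FirstOrDefault(xelem => (string)xelem.Attribute("class") == "inner");

if (myTable != null)
{
    // Each child of the table (a <tr>) yields the sequence of its cells.
    IEnumerable<IEnumerable<XElement>> myRows = myTable.Elements().Select(xelem => xelem.Elements());

    foreach (IEnumerable<XElement> tableRow in myRows)
    {
        foreach (XElement rowCell in tableRow)
        {
            // rowCell.Value holds the text of the current cell
        }
    }
}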

#2


1  

Once you have an XElement with the <table>, you can loop through its child Elements().
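
As a rough sketch of that idea, the snippet below assumes the page has already been turned into well-formed XML (XDocument.Parse throws on markup that is not). The TableWalker class, the SamplePage string, and the ReadDataRows method are made up for illustration. It picks the table by its class attribute through XPathSelectElement, so the table that comes before it is skipped, and then loops over the child rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // provides the XPathSelectElement extension method

public static class TableWalker
{
    // Hypothetical sample page: an unrelated table precedes the one we want.
    public const string SamplePage = @"<html><body>
<table class='outer'><tr><td>Navigation</td></tr></table>
<table class='inner' width='100%'>
  <tr><th>Area</th><th>Name</th><th>Amount</th></tr>
  <tr><td>North</td><td>Ann</td><td>10</td></tr>
  <tr><td>South</td><td>Bob</td><td>25</td></tr>
</table>
</body></html>";

    // Returns the text of every <td> cell, one list per data row.
    public static List<List<string>> ReadDataRows(string wellFormedMarkup)
    {
        XDocument xdoc = XDocument.Parse(wellFormedMarkup);

        // Select the table by attribute rather than by position.
        XElement table = xdoc.XPathSelectElement("//table[@class='inner']");
        var rows = new List<List<string>>();
        if (table == null) return rows;

        foreach (XElement row in table.Elements("tr"))
        {
            List<string> cells = row.Elements("td")
                                    .Select(td => td.Value.Trim())
                                    .ToList();
            // The header row contains only <th> cells, so it comes back empty.
            if (cells.Count > 0) rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        foreach (List<string> row in ReadDataRows(SamplePage))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

If the real page wraps its rows in a tbody element, table.Elements("tr") would find nothing; table.Descendants("tr") is one way around that, at the cost of also matching rows of any nested tables.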

#3


0  

LINQ is like SQL: it performs set-based operations.

You want to use the query to select the set of XElements, then use a foreach loop to iterate over that set.
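
One way to picture that split, as a sketch only: the query selects the rows, and the foreach does the per-row check. The RowChecker class, the FindBadAmounts method, and the rule "the last cell must be a number" are invented for this example; the question does not say what the actual check is.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class RowChecker
{
    // Returns the 1-based index of each data row whose last cell is not a number.
    public static List<int> FindBadAmounts(XElement table)
    {
        // The LINQ query selects the set: every row that has <td> cells.
        IEnumerable<string[]> dataRows = table
            .Elements("tr")
            .Select(tr => tr.Elements("td").Select(td => td.Value.Trim()).ToArray())
            .Where(cells => cells.Length > 0);

        var bad = new List<int>();
        int rowNumber = 0;

        // The foreach loop then does the row-by-row work.
        foreach (string[] cells in dataRows)
        {
            rowNumber++;
            string amount = cells[cells.Length - 1];
            if (!decimal.TryParse(amount, NumberStyles.Number,
                                  CultureInfo.InvariantCulture, out _))
            {
                bad.Add(rowNumber);
            }
        }
        return bad;
    }

    public static void Main()
    {
        // Hypothetical two-column table with one invalid amount.
        XElement table = XElement.Parse(
            "<table class='inner'>" +
            "<tr><th>Name</th><th>Amount</th></tr>" +
            "<tr><td>Ann</td><td>10.50</td></tr>" +
            "<tr><td>Bob</td><td>n/a</td></tr>" +
            "</table>");

        foreach (int n in FindBadAmounts(table))
            Console.WriteLine($"Row {n} has a non-numeric amount");
    }
}
```

Because the query is lazy, no row is read until the foreach asks for it, so the same pattern works however many rows the table has.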