Generating XML from an HTML list with PHP

Date: 2022-10-16 13:07:50

I would like to convert this HTML list structure:

<ul>
    <li>Section 1</li>
    <li>Section 2
        <ul>
            <li>Section 2.1</li>
            <li>Section 2.2</li>
        </ul>
    </li>
    <li>Section 3</li>
</ul>

Into XML like this:

<sections>
    <section>
        <caption>Section 1</caption>
        <level>0</level>
    </section>
    <section>
        <caption>Section 2</caption>
        <level>0</level>
    </section>
    <section>
        <caption>Section 2.1</caption>
        <level>1</level>
    </section>
    <section>
        <caption>Section 2.2</caption>
        <level>1</level>
    </section>
    <section>
        <caption>Section 3</caption>
        <level>0</level>
    </section>
</sections>

I tried to use PHP's SimpleXML to read in the HTML, but it seems to have a problem when it encounters a <ul> tag inside an <li> tag.

Could someone kindly suggest the simplest way to get this done in PHP?

Many thanks to you all.

1 Answer

#1

3

You could parse that HTML with DOMDocument and build your XML structure from it.

Let's assume your HTML is in a file called "sections.html". This is one way to do what you're looking for:

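    <?php

      # Create new DOM object
      $domOb = new DOMDocument();

      # Whitespace handling has to be configured before the document is loaded
      $domOb->preserveWhiteSpace = false;

      # Grab your HTML file (the file name must be a quoted string)
      $domOb->loadHTMLFile('sections.html');

      # Build the output document with a <sections> root
      $xml = new DOMDocument('1.0', 'UTF-8');
      $xml->formatOutput = true;
      $sections = $xml->appendChild($xml->createElement('sections'));

      # Loop through every <li> in document order, nested ones included
      foreach ($domOb->getElementsByTagName('li') as $li)
      {
          # The caption is the item's own text, not the text of a nested <ul>
          $caption = '';
          foreach ($li->childNodes as $child)
          {
              if ($child->nodeType === XML_TEXT_NODE) {
                  $caption .= $child->nodeValue;
              }
          }

          # The level is the number of enclosing <ul> elements, minus one
          $level = -1;
          for ($node = $li->parentNode; $node !== null; $node = $node->parentNode)
          {
              if ($node->nodeName === 'ul') {
                  $level++;
              }
          }

          # Append <section><caption>...</caption><level>...</level></section>
          $section = $sections->appendChild($xml->createElement('section'));
          $captionEl = $section->appendChild($xml->createElement('caption'));
          $captionEl->appendChild($xml->createTextNode(trim($caption)));
          $section->appendChild($xml->createElement('level', (string) $level));
      }

      # Print the XML; you could also write it to a file with $xml->save()
      echo $xml->saveXML();

    ?>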
    

I haven't tested this, but that's the general idea.

Hope this helps.
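
For comparison, here is a minimal sketch of the same conversion using DOMXPath, where the nesting level is computed with the XPath expression count(ancestor::ul). The function name listToSectionsXml is hypothetical and not part of the original answer, and the sketch assumes the list is built only from <ul> and <li> elements, as in the question.

```php
<?php
// Hypothetical helper (not from the original answer): converts nested
// <ul>/<li> markup into the flat <sections> XML shown in the question.
function listToSectionsXml(string $html): string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $out = new DOMDocument('1.0', 'UTF-8');
    $out->formatOutput = true;
    $sections = $out->appendChild($out->createElement('sections'));

    // "//li" yields every list item in document order, nested ones included.
    foreach ($xpath->query('//li') as $li) {
        // Join the item's own text nodes, ignoring text inside nested lists.
        $caption = '';
        foreach ($xpath->query('text()', $li) as $text) {
            $caption .= $text->nodeValue;
        }

        // Depth = number of enclosing <ul> elements, made zero-based.
        $level = (int) $xpath->evaluate('count(ancestor::ul)', $li) - 1;

        $section = $sections->appendChild($out->createElement('section'));
        $captionEl = $section->appendChild($out->createElement('caption'));
        $captionEl->appendChild($out->createTextNode(trim($caption)));
        $section->appendChild($out->createElement('level', (string) $level));
    }

    return $out->saveXML();
}

// Usage with a shortened version of the list from the question.
$html = '<ul><li>Section 1</li><li>Section 2<ul><li>Section 2.1</li></ul></li></ul>';
echo listToSectionsXml($html);
```

Because every <li> is visited in document order, the output stays flat and the nesting survives only in the <level> value, which matches the target format in the question.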
