Converting XHTML to wiki syntax using XSLT

Time: 2021-08-15 14:17:01

I would like to convert XHTML to DokuWiki syntax using XSLT.

One thing I cannot wrap my head around is how to handle nested lists. The DokuWiki syntax uses an asterisk (*) for a list item, preceded by two spaces per nesting level (cf. the wiki syntax).

My question: in the following example, how can the <xsl:template match="li"> that matches list item 2.1.1 know its nesting level, so that it can prepend the right number of spaces?

* list item 1
* list item 2
  * list item 2.1
    * list item 2.1.1
  * list item 2.2
  * list item 2.3
* list item 3

corresponds to

  • list item 1
  • list item 2
    • list item 2.1
      • list item 2.1.1
    • list item 2.2
    • list item 2.3
  • list item 3

which is how the following HTML is displayed:

<ul>
    <li>
        list item 1
    </li>
    <li>
        list item 2
        <ul>
            <li>
                list item 2.1
                <ul>
                    <li>list item 2.1.1</li>
                </ul>
            </li>
            <li>list item 2.2</li>
            <li>list item 2.3</li>
        </ul>
    </li>
    <li>
        list item 3
    </li>
</ul>

2 answers

#1


The following transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:strip-space elements="*"/>

 <xsl:variable name="vBlanks"
  select="'                                        '"/>
 <xsl:variable name="vnNestSpaces" select="2"/>


    <xsl:template match="li">
      <xsl:variable name="vNestLevel"
           select="count(ancestor::li)"/>
      <xsl:value-of select=
       "concat('&#xA;',
               substring($vBlanks,1,$vnNestSpaces*$vNestLevel),
               '*  ', normalize-space(text()[1])
               )"/>
      <xsl:apply-templates select="*"/>
    </xsl:template>
</xsl:stylesheet>

When applied to the original XML document:

<ul>
    <li> list item 1
    </li>
    <li> list item 2        
        <ul>
            <li> list item 2.1                
                <ul>
                    <li>list item 2.1.1</li>
                </ul>
            </li>
            <li>list item 2.2</li>
            <li>list item 2.3</li>
        </ul>
    </li>
    <li> list item 3    </li>
</ul>

produces the desired result:

*  list item 1
*  list item 2
  *  list item 2.1
    *  list item 2.1.1
  *  list item 2.2
  *  list item 2.3
*  list item 3

Note the following:

  1. The required indentation is determined by the value of count(ancestor::li).

  2. The indentation is cut directly from a sufficiently long string of blanks (it contains enough blanks for 20 levels of nesting). There is no need to output the spaces one by one recursively.

  3. Because of point 2, the transformation is more efficient.

  4. The XPath substring() function does the cutting: substring($vBlanks, 1, $vnNestSpaces * $vNestLevel) returns the blanks needed for the current level.
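
As far as the DokuWiki syntax page describes it, every list item, including top-level ones, is indented by two spaces, and ordered lists use - instead of *. A minimal sketch of the same substring() technique adapted to that, assuming the input elements are in no namespace (as in the sample document) and that the string of blanks is long enough for the deepest list. The vDepth variable and the ol branch are illustrative additions, not part of the answer above:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- A run of blanks to cut indentation from (assumption: long enough
       for the deepest nesting in the input) -->
  <xsl:variable name="vBlanks"
      select="'                                        '"/>

  <xsl:template match="li">
    <!-- Depth = number of enclosing lists, so a top-level item has depth 1 -->
    <xsl:variable name="vDepth"
        select="count(ancestor::ul | ancestor::ol)"/>
    <xsl:value-of select="substring($vBlanks, 1, 2 * $vDepth)"/>
    <!-- '-' marks an ordered item, '*' an unordered one -->
    <xsl:choose>
      <xsl:when test="parent::ol"><xsl:text>- </xsl:text></xsl:when>
      <xsl:otherwise><xsl:text>* </xsl:text></xsl:otherwise>
    </xsl:choose>
    <xsl:value-of select="normalize-space(text()[1])"/>
    <xsl:text>&#xA;</xsl:text>
    <!-- Only nested lists are processed; other inline markup is ignored here -->
    <xsl:apply-templates select="ul | ol"/>
  </xsl:template>
</xsl:stylesheet>
```

Counting enclosing ul/ol elements instead of enclosing li elements shifts every level by one, which is what gives top-level items their two leading spaces.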

#2


Here is how I got it to work:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="//li">
        <xsl:call-template name="loop">
            <xsl:with-param name="maxcount" select="count(ancestor::li)"/>
            <xsl:with-param name="initial-value" select="0"/>
        </xsl:call-template>
        <xsl:text>* </xsl:text>
        <xsl:value-of select="normalize-space(text())"/>
        <xsl:text>&#xd;</xsl:text>
        <xsl:apply-templates select="ul/li" />
    </xsl:template>
    <xsl:template name="loop">
        <xsl:param name="maxcount"/>
        <xsl:param name="initial-value"/>
        <xsl:if test="$initial-value &lt; $maxcount">
            <xsl:text>&#x9;</xsl:text>
            <xsl:call-template name="loop">
                <xsl:with-param name="maxcount" select="$maxcount"/>
                <xsl:with-param name="initial-value" select="$initial-value+1"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

Here is how it breaks down:

<xsl:output method="text"/>
<xsl:strip-space elements="*"/>

The output of the XSLT must be text, and whitespace-only text nodes in the input should be stripped so that the source indentation does not end up in the result.

<xsl:template match="//li">
    ...
</xsl:template>

This is the main template; it matches every <li> in the document. Its first step is to output the appropriate number of tab characters (adjust this to spaces or whatever you need). It does so by calling a custom named template, loop, which calls itself recursively, counting from initial-value up to maxcount and outputting one tab character (&#x9;) per iteration.

<xsl:text>* </xsl:text>
<xsl:value-of select="normalize-space(text())"/>
<xsl:text>&#xd;</xsl:text>

This chunk outputs the item text with a * in front and a line break after it. Note that &#xd; is a carriage return; use &#xA; if the target expects a line feed. Also note that I selected text() rather than . to retrieve the item's text. With ., the string value of the parent <li> would (as the W3C recommendation specifies) include the text of all its nested items as well.
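
To see the difference between . and text() on the sample document, here is a small diagnostic sketch. It is not part of the answer's stylesheet, and the "whole:" and "own:" labels are made up for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="li">
        <!-- String value of the li: its own text plus the text of every nested item -->
        <xsl:text>whole:  </xsl:text>
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:text>&#xA;</xsl:text>
        <!-- First text node child only: just this item's own label -->
        <xsl:text>own:    </xsl:text>
        <xsl:value-of select="normalize-space(text()[1])"/>
        <xsl:text>&#xA;</xsl:text>
        <xsl:apply-templates select="ul/li"/>
    </xsl:template>
</xsl:stylesheet>
```

For the item "list item 2.1", the "own:" line should contain only that label, while the "whole:" line also carries the text of "list item 2.1.1".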

<xsl:apply-templates select="ul/li" />

Finally, we apply the same template recursively, but explicitly select only the <li> elements that are direct children of a nested <ul>. This keeps us from accidentally processing the same element twice.
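
The recursive loop template is not the only way to repeat the indentation. One possible alternative, sketched here under the assumption that two spaces per level and a line feed are wanted (as in the question), is to iterate over the enclosing <li> elements with xsl:for-each and emit one indent unit per ancestor:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="li">
        <!-- One iteration per enclosing li, so no counter or recursion is needed -->
        <xsl:for-each select="ancestor::li">
            <xsl:text>  </xsl:text>
        </xsl:for-each>
        <xsl:text>* </xsl:text>
        <!-- After the for-each, the context node is the current li again -->
        <xsl:value-of select="normalize-space(text()[1])"/>
        <!-- Line feed rather than carriage return -->
        <xsl:text>&#xA;</xsl:text>
        <xsl:apply-templates select="ul/li"/>
    </xsl:template>
</xsl:stylesheet>
```

This keeps the structure of the stylesheet above and only swaps the named template for a loop over ancestor::li, so the indentation depth still equals count(ancestor::li).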
