How do I edit XML using a bash script?

Time: 2021-10-09 00:25:57
<root>
<tag>1</tag>
<tag1>2</tag1>
</root>

I need to change the values 1 and 2 from a bash script.

4 solutions

#1


10  

You can use the xsltproc command (from the xsltproc package on Debian-based distros) with the following XSLT stylesheet, saved here as transform.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="tagReplacement"/>
  <xsl:param name="tag1Replacement"/>

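  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the content of <tag> with the tagReplacement parameter. -->
  <xsl:template match="tag">
    <xsl:copy>
      <xsl:value-of select="$tagReplacement"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the content of <tag1> with the tag1Replacement parameter. -->
  <xsl:template match="tag1">
    <xsl:copy>
      <xsl:value-of select="$tag1Replacement"/>
    </xsl:copy>
  </xsl:template>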
</xsl:stylesheet>

Then use the command:

xsltproc --stringparam tagReplacement polop \
         --stringparam tag1Replacement palap \
         transform.xsl input.xml

Or you could also use regexes, but modifying XML through regexes is pure evil :)
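
To make the warning about regexes concrete, here is a minimal Python sketch. The sample document and the `tag_regex` pattern are assumptions made up for illustration; they are not part of the answer. It shows a naive pattern silently missing an element as soon as that element carries an attribute, while a parser still finds it by name.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: same shape as the question's XML, but <tag> has an attribute.
doc = '<root><tag id="a">1</tag><tag1>2</tag1></root>'

# Naive regex that only matches the exact literal "<tag>...</tag>".
tag_regex = re.compile(r"<tag>.*?</tag>")
regex_result = tag_regex.sub("<tag>polop</tag>", doc)

# A parser matches on the element name, whatever attributes it carries.
root = ET.fromstring(doc)
root.find("tag").text = "polop"
parser_result = ET.tostring(root, encoding="unicode")

print(regex_result)   # unchanged: the pattern never matched
print(parser_result)  # <tag id="a"> now contains "polop"
```

The regex could of course be extended to tolerate attributes, but each such patch (whitespace, CDATA, comments, entities) re-implements another piece of an XML parser.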

#2


8  

To change tag's value to 2 and tag1's value to 3, using XMLStarlet:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <old.xml >new.xml

Using your sample input:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <<<'<root><tag>1</tag><tag1>2</tag1></root>'

...emits as output:

<?xml version="1.0"?>
<root>
  <tag>2</tag>
  <tag1>3</tag1>
</root>
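
If XMLStarlet is not installed, roughly the same path-based update can be sketched with Python's standard library. This is an illustration under assumptions, not an equivalent of the tool: the helper name `update_values` is hypothetical, and ElementTree paths are relative to the root element, so `/root/tag` is written as `./tag` here.

```python
import xml.etree.ElementTree as ET

def update_values(xml_text, updates):
    """Set the text of every element matched by each path in `updates`."""
    root = ET.fromstring(xml_text)
    for path, value in updates.items():
        for node in root.findall(path):
            node.text = value
    return ET.tostring(root, encoding="unicode")

result = update_values(
    "<root><tag>1</tag><tag1>2</tag1></root>",
    {"./tag": "2", "./tag1": "3"},
)
print(result)
```

Unlike the XMLStarlet output above, this sketch does not re-indent the document or emit an XML declaration.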

#3


4  

My $0.02 in Python, because it's on every server you will ever log in to:

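import sys
import xml.etree.ElementTree as ET

# Read the whole document from stdin and parse it.
tree = ET.fromstring(sys.stdin.read())

# Locate the first <tag> and <tag1> elements anywhere below the root.
nodeA = tree.find('.//tag')
nodeB = tree.find('.//tag1')

# Swap the text of the two elements.
nodeA.text, nodeB.text = nodeB.text, nodeA.text

# tostring() returns bytes on Python 3; decode() lets this print cleanly on Python 2 and 3.
print(ET.tostring(tree).decode())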

This reads from stdin, so you can use it like this:

$ echo '<node><tag1>hi!</tag1><tag>this</tag></node>' | python xml_process.py 
<node><tag1>this</tag1><tag>hi!</tag></node>
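
The script above swaps the two values and writes to stdout. If the goal is instead to set specific values in a file, a variant could look like the sketch below. The function name `set_values_in_file`, the script name `set_values.py`, and the `name=value` argument format are all assumptions chosen for illustration.

```python
import sys
import xml.etree.ElementTree as ET

def set_values_in_file(path, values):
    """Set the text of the first element with each given name, then rewrite the file."""
    tree = ET.parse(path)
    root = tree.getroot()
    for name, text in values.items():
        node = root.find(".//" + name)
        if node is None:
            raise KeyError("no <%s> element in %s" % (name, path))
        node.text = text
    tree.write(path, encoding="utf-8", xml_declaration=True)

# Hypothetical command line: python3 set_values.py input.xml tag=polop tag1=palap
if __name__ == "__main__" and len(sys.argv) > 2:
    pairs = dict(arg.split("=", 1) for arg in sys.argv[2:])
    set_values_in_file(sys.argv[1], pairs)
```

Note that ElementTree's default parser discards comments and processing instructions, so rewriting a file this way drops them.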

EDIT - challenge accepted

Here's a working xmllib implementation (should work back to Python 1.6; xmllib exists only in Python 2 and earlier). I thought it would be more fun to stab my eyes with a fork. The only thing I will say about this is that it works for the given use case.

# Python 2 only: xmllib was removed in Python 3.
import sys, xmllib

class Bag:
    pass

class NodeSwapper(xmllib.XMLParser):
    def __init__(self):
        print 'making a NodeSwapper'
        xmllib.XMLParser.__init__(self)
        self.result = ''
        self.data_tags = {}
        self.current_tag = ''
        self.finished = False

    def handle_data(self, data):
        print 'data: ' + data

        # Remember the text seen inside the most recently opened tag.
        self.data_tags[self.current_tag] = data
        if self.finished:
            return

        if 'tag1' in self.data_tags and 'tag' in self.data_tags:
            b = Bag()
            b.tag1 = self.data_tags['tag1']
            b.tag = self.data_tags['tag']
            # Locate both values in the raw input by plain string search.
            b.t1_start_idx = self.rawdata.find(b.tag1)
            b.t1_end_idx = len(b.tag1) + b.t1_start_idx
            b.t_start_idx = self.rawdata.find(b.tag)
            b.t_end_idx = len(b.tag) + b.t_start_idx
            # Swap, rewriting the later position first so the earlier indices stay valid.
            if b.t1_start_idx < b.t_start_idx:
                self.result = self.rawdata[:b.t_start_idx] + b.tag1 + self.rawdata[b.t_end_idx:]
                self.result = self.result[:b.t1_start_idx] + b.tag + self.result[b.t1_end_idx:]
            else:
                self.result = self.rawdata[:b.t1_start_idx] + b.tag + self.rawdata[b.t1_end_idx:]
                self.result = self.result[:b.t_start_idx] + b.tag1 + self.result[b.t_end_idx:]
            self.finished = True

    def unknown_starttag(self, tag, attrs):
        print 'starttag is: ' + tag
        self.current_tag = tag

data = sys.stdin.read()

print 'data is: ' + data

parser = NodeSwapper()
parser.feed(data)
print parser.result
parser.close()

#4


1  

Since you give a sed example in one of the comments, I imagine you want a pure bash solution?

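# Reads XML from stdin, one line at a time.
while IFS= read -r input; do
  for field in tag tag1; do
    case $input in
      *"<$field>"*"</$field>"* )
        # pre: everything after the opening tag; suf: everything before the closing tag.
        pre=${input#*"<$field>"}
        suf=${input%"</$field>"*}
        # Where are we supposed to be getting the replacement text from?
        # Quote $pre and $suf so they are matched literally, not as glob patterns.
        input="${input%"$pre"}SOMETHING${input#"$suf"}"
        ;;
    esac
  done
  printf '%s\n' "$input"
done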

This is completely unintelligent. It only works on well-formed input where the start tag and the end tag are on the same line, it cannot handle multiple instances of the same tag on one line, the list of tags to substitute is hard-coded, and so on.

I cannot imagine a situation where this would actually be useful, or preferable to either a script or a proper XML approach.
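
For contrast, the cases that break the line-based loop (a tag split across lines, several instances of the same tag on one line) are unremarkable for a parser. The sketch below uses a made-up input document and the same `SOMETHING` placeholder as the loop above; it is an illustration, not a drop-in replacement.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one <tag> spans three lines, two more share a single line.
doc = """<root>
<tag>
1
</tag><tag>one</tag><tag>two</tag>
<tag1>2</tag1>
</root>"""

root = ET.fromstring(doc)

# iter("tag") visits every element named exactly "tag"; <tag1> is not touched.
for node in root.iter("tag"):
    node.text = "SOMETHING"

result = ET.tostring(root, encoding="unicode")
print(result)
```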
