awk one-liner: replacing an XML tag

I have an XML file containing some elements like this:

<string name="my/ attribute" optional="true">
  <description>some text</description>
  <value>some text again</value>
</string>

I would like to replace the value (which is not necessarily "some text again") with the string "none". I tried the following on the command line:

 awk '/<string name="my\/ attribute" optional="true">/,/<\/string>/ {sub(/<value>(.*)<\/value>/,"<value>none</value>")}1' my.xml > my_new.xml

This more or less works, but the result is as follows:

<string name="my/ attribute" optional="true">
  <description>some text</description>
  <value>some text again<\/value>
</string>

Why is the / (slash) in the tag escaped?

Thanks a lot for your help,

Daniela.

2 answers

#1

Assuming the inconsistencies in your question that Richard pointed out are accidental:

$ cat input.xml
<string name="my/ attribute" optional="true">
  <description>some text</description>
  <value>some text again</value>
</string>

$ awk '/<string/{doit=1} doit{sub(/<value>[^<]+<\/value>/, "<value>none</value>"); print} /<\/string>/{doit=0}' input.xml 
<string name="my/ attribute" optional="true">
  <description>some text</description>
  <value>none</value>
</string>

$

This is a wee bit safer than your script, in that it will handle minified XML (i.e. whitespace removed, all on one line), but it won't handle a <value> element that is split over multiple lines.
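
As a minimal sketch of a variation on the script above (not the answerer's own code): it selects only the <string> element named in the question and prints every line of the file, changed or not. The sample file contents, the enclosing <root> element, the "other" element, and the awk variable names tag and inside are assumptions made for illustration. The start tag is matched with index(), which compares literal text, so the slash in the attribute value needs no escaping.

```shell
# Hypothetical sample file; the names my.xml and my_new.xml come from the question.
cat > my.xml <<'EOF'
<root>
<string name="other" optional="true">
  <value>keep me</value>
</string>
<string name="my/ attribute" optional="true">
  <description>some text</description>
  <value>some text again</value>
</string>
</root>
EOF

awk -v tag='<string name="my/ attribute"' '
  index($0, tag) { inside = 1 }   # literal match, so "/" needs no escaping
  inside         { sub(/<value>[^<]*<\/value>/, "<value>none</value>") }
  /<\/string>/   { inside = 0 }   # leave the element at its end tag
  { print }                       # print every line, changed or not
' my.xml > my_new.xml
```

The same limitation applies here: a <value> element split over several lines is not matched, and the flag logic assumes the start and end tags of the <string> element sit on separate lines.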

I do recommend looking into Perl's XML::Simple or PHP's SimpleXML. The result won't be a one-liner, but it will work MUCH more reliably.

#2

Don't use standard text tools to process XML - always use XML tools. Otherwise you (or your customers) will end up among the hundreds of people who post questions on this list asking what to do about the fact that they have ill-formed XML to process. It's simply too hard to get it right by hand, catering for all the edge cases that can arise. For example, do you know the rules for where whitespace is allowed within start and end tags? Judging from your sample code, you don't appear to.
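
One way to follow this advice while staying on the command line is a dedicated XML editor such as xmlstarlet. This tool is not mentioned in the thread, so treat the following as a sketch under assumptions: xmlstarlet must be installed, and the sample file with its <root> wrapper is made up for illustration. The XPath expression selects the <value> child of the <string> element whose name attribute is "my/ attribute", regardless of how whitespace or line breaks are laid out in the file.

```shell
# Hypothetical sample file; the names my.xml and my_new.xml come from the question.
cat > my.xml <<'EOF'
<root>
  <string name="my/ attribute" optional="true">
    <description>some text</description>
    <value>some text again</value>
  </string>
</root>
EOF

# Assumes xmlstarlet is installed; report and skip when it is not.
if command -v xmlstarlet >/dev/null 2>&1; then
  # ed = edit mode, -u = update the nodes matched by the XPath, -v = new value
  xmlstarlet ed -u '//string[@name="my/ attribute"]/value' -v 'none' my.xml > my_new.xml
else
  echo "xmlstarlet not found; nothing written" >&2
fi
```

Because the tool parses the document, it only works on well-formed XML, which is the point of the advice above.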
