awk: search for a keyword in XML and write to another file

Date: 2021-12-09 16:03:23

My input XML is below. I need to check whether the keyword "SEARCH" is present in it. If it is, I need to copy everything from <record> to </record> and write it to another XML file.

Input XML

<XML>
<record category="xyz">
<person ssn="" e-i="E">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<details>
<names>
<first_name/>
<last_name></last_name>
</names>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</person>
</record>
<record category="abc">
<person ssn="" e-i="F">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<details>
<names>
<first_name/>
<last_name></last_name>
</names>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>DONTSEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is not present in abc for xyz reason</detail>
</external_sources>
</details>
</person>
</record>
</XML>

My present code:

NR==FNR {
keywordArray[NR]=$0;
next;
}

/<record / { i=1 }
i { a[i++]=$0 }
/<\/record>/ {
    if (found) {
        for (i=1; i<=length(a); ++i) print a[i] >> output.xml
    }
    i=0;
    found=0
}
$0 ~ "<keyword>"SEARCH"</keyword>" { found=1 }

Issue with current code:

The code does not find "SEARCH", and it does not write anything to output.xml.

Expected output:

<record category="xyz">
<person ssn="" e-i="E">
<title xsi:nil="true"/>
<position xsi:nil="true"/>
<details>
<names>
<first_name/>
<last_name></last_name>
</names>
<aliases>
<alias>CDP</alias>
</aliases>
<keywords>
<keyword xsi:nil="true"/>
<keyword>SEARCH</keyword>
</keywords>
<external_sources>
<uri>http://www.google.com</uri>
<detail>SEARCH is present in abc for xyz reason</detail>
</external_sources>
</details>
</person>
</record>

2 answers

#1


Well, it's not perfect, but maybe you can improve on this:

BEGIN {
  FS="\n"        # one field per line
  OFS="\n"       # keep newlines between fields on output
  RS="</record>" # records end at </record> (multi-character RS needs gawk or mawk)
}
$0 ~ /<keyword>SEARCH<\/keyword>/     # print the record if SEARCH matched; RS consumes the closing </record>
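
A possible improvement is to stay line-based, which should run on any POSIX awk. The script in the question appears to fail for three reasons. With a single input file, NR==FNR is true for every line, so `next` skips all the other rules. `output.xml` is not quoted, so awk does not treat it as a file name string. And the unquoted SEARCH is an unset variable, so the pattern collapses to <keyword></keyword>. The sketch below is one way around these problems, not a definitive fix. The names kw, out, buf and inrec are my own choices, and input.xml here is a trimmed-down stand-in for the real input.

```shell
# Build a small sample input: two records, only the first holds the keyword.
cat > input.xml <<'EOF'
<XML>
<record category="xyz">
<keywords>
<keyword>SEARCH</keyword>
</keywords>
</record>
<record category="abc">
<keywords>
<keyword>DONTSEARCH</keyword>
</keywords>
</record>
</XML>
EOF

# kw  = keyword to look for (assumed name)
# out = output file name, passed in as a string (assumed name)
awk -v kw="SEARCH" -v out="output.xml" '
/<record[ >]/ { n = 0; found = 0; inrec = 1 }    # a new record starts: reset the buffer
inrec         { buf[++n] = $0 }                   # buffer every line of the current record
inrec && index($0, "<keyword>" kw "</keyword>") { found = 1 }   # exact keyword element seen
/<\/record>/ {
    if (inrec && found)                           # record is complete and matched
        for (i = 1; i <= n; i++) print buf[i] > out
    inrec = 0
}
' input.xml
```

Using index() instead of a regular expression keeps the keyword literal, so characters that are special in a regex do not need escaping. Note that output.xml is only created when at least one record matches.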

#2


With xmlstarlet, you could use this:

 xmlstarlet sel -t -c "//record[.//keyword/text()='SEARCH']" foo.xml
