Bash shell script to split a large XML file into multiple small files

Date: 2023-02-07 21:45:00

I have an XML file in the format below:

<?xml version="1.0" encoding="utf-8" ?>
<parent>
    <child>
        <code></code>
        <text></text>
    </child>
    <child>
        <code></code>
        <text></text>
    </child>
</parent>

I need a Bash shell script that splits this main XML file into multiple small XML files, each containing the content from a <child> tag to its closing </child> tag. The file names could be the parent file name plus a running serial number such as _1, for example 20110721_1.xml. Please help me with the script.

2 solutions

#1


9  

Not a complete answer, but you can tune this yourself:

csplit -ksf part. src.xml /\<child\>/ "{100}" 2>/dev/null

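This command splits src.xml at every line that matches <child> (the backslashes only keep the shell from treating < and > as redirections) and writes the pieces to part.00, part.01, and so on. "{100}" repeats the pattern up to 100 more times, and -k keeps the pieces already written if the matches run out first. You still need to adjust the regexp, though: part.00 holds only the XML declaration and <parent>, and the last piece still ends with </parent>.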
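To get the file names asked for in the question, the same line-based idea can be sketched with awk instead of csplit. This is only a sketch under assumptions: the function name split_children is made up for illustration, and it assumes <child> and </child> appear as literal tags without attributes, as in the sample. It is not a real XML parser.

```shell
#!/usr/bin/env bash
# Sketch: copy every <child>...</child> block of an XML file into its own
# numbered file, e.g. 20110721.xml -> 20110721_1.xml, 20110721_2.xml, ...
# Assumption: tags look exactly like the sample (no attributes, no nesting).
# Usage: split_children 20110721.xml
split_children() {
    local src=$1
    local base=${src%.xml}          # parent file name without the extension
    awk -v base="$base" '
        /<child>/ {                 # opening tag: pick the next output file
            n++
            out = base "_" n ".xml"
        }
        out != "" { print > out }   # copy lines while inside a child block
        /<\/child>/ {               # closing tag: finish the current file
            close(out)
            out = ""
        }
    ' "$src"
}
```

Lines outside a child block (the XML declaration, <parent> and </parent>) are skipped, so each output file contains just one <child> element with its original indentation.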

#2


3  

One solution is to write an XSL stylesheet and run xsltproc with the stylesheet and the XML file to generate the individual files.

See "How to split XML file into many XML files using XSLT" for an example.
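As a rough sketch of the xsltproc route, the script below writes a small stylesheet to a temporary file and runs it. Several things here are assumptions rather than part of the answer: the function name split_with_xslt and the parameter name base are made up, xsltproc (libxslt) must be installed, and the stylesheet relies on the EXSLT extension element exsl:document, which is not part of standard XSLT 1.0. It is not necessarily the technique used in the linked example.

```shell
#!/usr/bin/env bash
# Sketch: write each /parent/child element to <base>_<n>.xml with xsltproc.
# Assumption: xsltproc is installed and supports exsl:document (EXSLT).
# Usage: split_with_xslt 20110721.xml
split_with_xslt() {
    local src=$1
    local base=${src%.xml}          # parent file name without the extension
    local xsl status
    xsl=$(mktemp) || return 1
    # Stylesheet: one output document per <child>, numbered by position.
    cat > "$xsl" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">
  <xsl:param name="base"/>
  <xsl:template match="/">
    <xsl:for-each select="parent/child">
      <exsl:document href="{$base}_{position()}.xml" method="xml" indent="yes">
        <xsl:copy-of select="."/>
      </exsl:document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF
    # The main result document is empty, so discard whatever goes to stdout.
    xsltproc --stringparam base "$base" "$xsl" "$src" > /dev/null
    status=$?
    rm -f "$xsl"
    return "$status"
}
```

Unlike a line-based split, this parses the XML, so it does not depend on how the tags are laid out across lines.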
