I have many instances of the following format in an .xml file:
<FFFFF>
<BBBBB>
"good B data"
</BBBBB>
<BBBBB>
"more good B data"
</BBBBB>
</FFFFF>
<AAAAA>
<BBBBB>
"some data"
</BBBBB>
<BBBBB>
"more B data"
</BBBBB>
</AAAAA>
I am trying to remove the A tags and rename the B tags that are inside them, so the final result would be as shown below. (Please note: renaming the B tags to any other tag would also be fine; they just cannot be B anymore.)
<FFFFF>
<BBBBB>
"good B data"
</BBBBB>
<BBBBB>
"more good B data"
</BBBBB>
</FFFFF>
<AAAAA>
"some data"
</AAAAA>
<AAAAA>
"more B data"
</AAAAA>
I have been messing around with sed, but I cannot figure out how to do it. There is no set number of B tags in each A (some have none, some may have 20, etc.). The other issue is that I don't want to touch the B tags that are present elsewhere, so I can't do a simple find and replace on B tags, as that would alter the ones embedded in <FFFFF>.
Any assistance appreciated, thanks!
2 solutions
#1
1
$ cat file
<FFFFF>
<BBBBB>
"good B data"
</BBBBB>
<BBBBB>
"more good B data"
</BBBBB>
</FFFFF>
<AAAAA>
<BBBBB>
"some data"
</BBBBB>
<BBBBB>
"more B data"
</BBBBB>
</AAAAA>
$ cat tst.awk
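BEGIN { remove="AAAAA"; changeFrom="BBBBB"; changeTo="XXXXX" }
# Opening wrapper tag: start tracking and drop the line.
$1 ~ "^<" remove ">$" {
    inRemove = 1
    next
}
inRemove {
    # Closing wrapper tag: stop tracking and drop the line.
    if ($1 ~ "^</" remove ">$") {
        inRemove = 0
        next
    }
    # Opening or closing inner tag: rename it.
    else if ($1 ~ "^</?" changeFrom ">$") {
        sub(changeFrom,changeTo)
    }
    # Strip a single leading space, if present.
    sub(/^ /,"")
}
{ print }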
$ awk -f tst.awk file
<FFFFF>
<BBBBB>
"good B data"
</BBBBB>
<BBBBB>
"more good B data"
</BBBBB>
</FFFFF>
<XXXXX>
"some data"
</XXXXX>
<XXXXX>
"more B data"
</XXXXX>
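If the tag names vary between files, the same idea can be wrapped in a shell function that passes them in with awk's -v option. This is only a sketch under a few assumptions: the function name strip_wrapper and its argument order are made up for illustration, each tag sits on its own line as in the sample, and the tag names contain no regex metacharacters (sub() treats its first argument as a regex).

```shell
#!/bin/sh
# Hypothetical helper (not from the answer above):
#   strip_wrapper REMOVE FROM TO [FILE...]
# Drops the <REMOVE> and </REMOVE> lines and renames <FROM>/</FROM>
# to <TO>/</TO>, but only between those wrapper lines.
strip_wrapper() {
    remove=$1 from=$2 to=$3
    shift 3
    awk -v remove="$remove" -v from="$from" -v to="$to" '
        $1 == "<" remove ">"  { inside = 1; next }   # opening wrapper: drop it
        $1 == "</" remove ">" { inside = 0; next }   # closing wrapper: drop it
        inside && ($1 == "<" from ">" || $1 == "</" from ">") {
            sub(from, to)                            # rename inner tag
        }
        { print }
    ' "$@"
}
```

Called as `strip_wrapper AAAAA BBBBB XXXXX file`, it should behave like tst.awk on the sample, except that it leaves leading whitespace alone. With no file argument it reads standard input.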
#2
0
sed '/^<AAAAA>/,/^<\/AAAAA>/ {
/^<\/*AAAAA>/ s/^<\/*AAAAA>//
/^<\/*AAAAA>/ !{
s/^\([[:space:]]*\)<\(\/*\)BBBBB>/\1<\2AAAAA>/
}
}' YourFile
- This is written for your sample, so it may be useful to put the tag to search for/modify in a variable
- The whitespace (indentation) in front of a modified tag is unchanged
- Lines that contained the old <AAAAA> tags are left empty but are still there
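Both of the first and last points can be addressed in one go: take the tag names from shell variables, and delete the wrapper lines with d instead of blanking them. The following is a sketch rather than a tested drop-in; the function name rename_in_wrapper is hypothetical, and it assumes the wrapper tags start at column 1 (as in the sample) and that the tag names contain nothing special to sed or the shell.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer above):
#   rename_in_wrapper REMOVE FROM TO [FILE...]
rename_in_wrapper() {
    remove=$1 from=$2 to=$3
    shift 3
    # Double quotes let the shell expand $remove, $from and $to;
    # the backslashes are passed through to sed unchanged.
    sed "/^<$remove>/,/^<\/$remove>/ {
        /^<\/\{0,1\}$remove>/d
        s/^\([[:space:]]*\)<\(\/\{0,1\}\)$from>/\1<\2$to>/
    }" "$@"
}
```

For the sample in the question, `rename_in_wrapper AAAAA BBBBB AAAAA YourFile` should give the requested result with no blank lines left behind, and indentation in front of the renamed tags is still preserved.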