I have a xml in which I have to search for a tag and replace the value of tag with a new values. For example,
我有一个xml,在这个xml中,我必须搜索一个标签,并用一个新的值替换标签的值。例如,
<tag-Name>oldName</tag-Name>
and replace oldName
to newName
like
并将旧名替换为newName like
<tag-Name>newName</tag-Name>
and save the xml. How can I do that without using thirdparty lib like BeautifulSoup
并保存xml。我怎么能做到不使用像漂亮的汤这样的三党*
Thank you
谢谢你!
3 个解决方案
#1
5
The best option from the standard lib is (I think) the xml.etree package.
标准库的最佳选项(我认为)是xml。etree包。
Assuming that your example tag occurs only once somewhere in the document:
假设示例标记只出现在文档中的某个地方一次:
import xml.etree.ElementTree as etree
# or for a faster C implementation
# import xml.etree.cElementTree as etree
tree = etree.parse('input.xml')
elem = tree.find('//tag-Name') # finds the first occurrence of element tag-Name
elem.text = 'newName'
tree.write('output.xml')
Or if there are multiple occurrences of tag-Name, and you want to change them all if they have "oldName" as content:
或者,如果有多个标记名出现,如果它们的内容是“oldName”,那么您希望将它们全部更改:
import xml.etree.cElementTree as etree
tree = etree.parse('input.xml')
for elem in tree.findall('//tag-Name'):
if elem.text == 'oldName':
elem.text = 'newName'
# some output options for example
tree.write('output.xml', encoding='utf-8', xml_declaration=True)
#2
0
Python has 'builtin' libraries for working with xml. For this simple task I'd look into minidom. You can find the docs here:
Python有用于处理xml的“内置”库。对于这个简单的任务,我将研究minidom。你可以在这里找到医生:
http://docs.python.org/library/xml.dom.minidom.html
http://docs.python.org/library/xml.dom.minidom.html
#3
0
If you are certain, I mean, completely 100% positive that the string <tag-Name>
will never appear inside that tag and the XML will always be formatted like that, you can always use good old string manipulation tricks like:
如果您确信,我的意思是,100%肯定的是,字符串
xmlstring = xmlstring.replace('<tag-Name>oldName</tag-Name>', '<tag-Name>newName</tag-Name>')
If the XML isn't always as conveniently formatted as <tag>value</tag>
, you could write something like:
如果XML的格式不总是像
a = """<tag-Name>
oldName
</tag-Name>"""
def replacetagvalue(xmlstring, tag, oldvalue, newvalue):
start = 0
while True:
try:
start = xmlstring.index('<%s>' % tag, start) + 2 + len(tag)
except ValueError:
break
end = xmlstring.index('</%s>' % tag, start)
value = xmlstring[start:end].strip()
if value == oldvalue:
xmlstring = xmlstring[:start] + newvalue + xmlstring[end:]
return xmlstring
print replacetagvalue(a, 'tag-Name', 'oldName', 'newName')
Others have already mentioned xml.dom.minidom
which is probably a better idea if you're not absolutely certain beyond any doubt that your XML will be so simple. If you can guarantee that, though, remember that XML is just a big chunk of text that you can manipulate as needed.
其他人已经提到了xml.dom。如果您不确定您的XML会如此简单,那么minidom可能是一个更好的主意。但是,如果您能保证这一点,请记住XML只是一大块文本,您可以根据需要进行操作。
I use similar tricks in some production code where simple checks like if "somevalue" in htmlpage
are much faster and more understandable than invoking BeautifulSoup, lxml, or other *ML libraries.
在一些生产代码中,我使用了类似的技巧,比如在htmlpage中“somevalue”这样的简单检查比调用漂亮的soup、lxml或其他*ML库要快得多,也更容易理解。
#1
5
The best option from the standard lib is (I think) the xml.etree package.
标准库的最佳选项(我认为)是xml。etree包。
Assuming that your example tag occurs only once somewhere in the document:
假设示例标记只出现在文档中的某个地方一次:
import xml.etree.ElementTree as etree
# or for a faster C implementation
# import xml.etree.cElementTree as etree
tree = etree.parse('input.xml')
elem = tree.find('//tag-Name') # finds the first occurrence of element tag-Name
elem.text = 'newName'
tree.write('output.xml')
Or if there are multiple occurrences of tag-Name, and you want to change them all if they have "oldName" as content:
或者,如果有多个标记名出现,如果它们的内容是“oldName”,那么您希望将它们全部更改:
import xml.etree.cElementTree as etree
tree = etree.parse('input.xml')
for elem in tree.findall('//tag-Name'):
if elem.text == 'oldName':
elem.text = 'newName'
# some output options for example
tree.write('output.xml', encoding='utf-8', xml_declaration=True)
#2
0
Python has 'builtin' libraries for working with xml. For this simple task I'd look into minidom. You can find the docs here:
Python有用于处理xml的“内置”库。对于这个简单的任务,我将研究minidom。你可以在这里找到医生:
http://docs.python.org/library/xml.dom.minidom.html
http://docs.python.org/library/xml.dom.minidom.html
#3
0
If you are certain, I mean, completely 100% positive that the string <tag-Name>
will never appear inside that tag and the XML will always be formatted like that, you can always use good old string manipulation tricks like:
如果您确信,我的意思是,100%肯定的是,字符串
xmlstring = xmlstring.replace('<tag-Name>oldName</tag-Name>', '<tag-Name>newName</tag-Name>')
If the XML isn't always as conveniently formatted as <tag>value</tag>
, you could write something like:
如果XML的格式不总是像
a = """<tag-Name>
oldName
</tag-Name>"""
def replacetagvalue(xmlstring, tag, oldvalue, newvalue):
start = 0
while True:
try:
start = xmlstring.index('<%s>' % tag, start) + 2 + len(tag)
except ValueError:
break
end = xmlstring.index('</%s>' % tag, start)
value = xmlstring[start:end].strip()
if value == oldvalue:
xmlstring = xmlstring[:start] + newvalue + xmlstring[end:]
return xmlstring
print replacetagvalue(a, 'tag-Name', 'oldName', 'newName')
Others have already mentioned xml.dom.minidom
which is probably a better idea if you're not absolutely certain beyond any doubt that your XML will be so simple. If you can guarantee that, though, remember that XML is just a big chunk of text that you can manipulate as needed.
其他人已经提到了xml.dom。如果您不确定您的XML会如此简单,那么minidom可能是一个更好的主意。但是,如果您能保证这一点,请记住XML只是一大块文本,您可以根据需要进行操作。
I use similar tricks in some production code where simple checks like if "somevalue" in htmlpage
are much faster and more understandable than invoking BeautifulSoup, lxml, or other *ML libraries.
在一些生产代码中,我使用了类似的技巧,比如在htmlpage中“somevalue”这样的简单检查比调用漂亮的soup、lxml或其他*ML库要快得多,也更容易理解。