What is the best way to process XML in Python?

Time: 2022-02-22 23:14:21

I am working with a very large XML file (about 45 GB, I think). I have to search through this document of OpenStreetMap data and find a particular path that satisfies certain criteria. I am currently using XML ElementTree, but I think it is a little slow, since I need to search the XML for specific geographic coordinates. Is there a better way to do this? I have also looked a little at lxml, and I'd like to know which of the two is the better choice. Thank you!


1 solution

#1



The sax package (`xml.sax` in the standard library) is well suited for parsing huge XML files that can't be loaded into memory all at once. The parser streams through the file "line by line" and notifies you whenever it encounters the beginning or end of an element, instead of loading the whole file into memory, parsing it, and giving you the complete tree.
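As a rough illustration of that event-driven style, here is a minimal sketch that collects the nodes falling inside a bounding box. It assumes the usual OpenStreetMap XML layout, where each `<node>` carries `id`, `lat` and `lon` attributes; the names `NodeInBoxHandler` and `find_nodes` and the bounding-box parameters are made up for this example and are not part of any library.

```python
import xml.sax


class NodeInBoxHandler(xml.sax.ContentHandler):
    """Collect <node> elements whose coordinates fall inside a bounding box."""

    def __init__(self, min_lat, max_lat, min_lon, max_lon):
        super().__init__()
        self.box = (min_lat, max_lat, min_lon, max_lon)
        self.matches = []  # list of (node id, lat, lon)

    def startElement(self, name, attrs):
        # Called once per opening tag while the parser streams through the file.
        if name != "node":
            return
        # Assumption: nodes without coordinates are simply skipped.
        if "lat" not in attrs or "lon" not in attrs:
            return
        lat = float(attrs["lat"])
        lon = float(attrs["lon"])
        min_lat, max_lat, min_lon, max_lon = self.box
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            self.matches.append((attrs.get("id"), lat, lon))


def find_nodes(source, min_lat, max_lat, min_lon, max_lon):
    """Stream-parse `source` (a filename or binary file object) and return matches."""
    handler = NodeInBoxHandler(min_lat, max_lat, min_lon, max_lon)
    xml.sax.parse(source, handler)
    return handler.matches
```

Calling something like `find_nodes("map.osm", 52.0, 53.0, 13.0, 14.0)` (with `map.osm` standing in for your own file) only ever keeps the matching nodes in memory, not the tree. To get from nodes to a path, you would probably also watch for `<way>` elements and their `<nd ref="...">` children in the same handler or in a second pass.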

