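This article walks through an example of parsing XML with SAX in Python 3. It is shared here for reference; the details are as follows:
Parsing XML with SAX in Python
SAX is an event-driven API.
Parsing an XML document with SAX involves two parts: the parser and the event handler.
The parser reads the XML document and sends events to the event handler, such as element-start and element-end events;
the event handler responds to those events and processes the XML data passed to it.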
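To make the parser/handler split concrete, here is a minimal sketch that records every event the parser reports. The class name EventLogger and the helper log_events are illustrative names, not part of the original article; only xml.sax.ContentHandler and xml.sax.parseString come from the standard library.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records each SAX event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Skip whitespace-only text such as indentation between elements.
        if content.strip():
            self.events.append(("text", content.strip()))


def log_events(xml_text):
    """Parse an XML string and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events("<movie><year>2003</year></movie>"):
        print(event)
```

The handler never sees the document as a whole; it only sees one event at a time, in document order.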
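SAX is a good fit in the following situations:
1. Processing large files;
2. Needing only part of a file's content, or only specific information from it;
3. Wanting to build your own object model.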
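As a rough illustration of the second case, the sketch below pulls only the title attribute of each movie element and ignores everything else. TitleCollector and collect_titles are hypothetical names chosen for this example, and the sample document is made up to match the movies.xml layout used later in the article.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Keeps only the title attribute of <movie> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def startElement(self, name, attrs):
        if name == "movie":
            # attrs behaves like a read-only mapping of attribute names to values.
            self.titles.append(attrs.get("title", ""))


def collect_titles(xml_bytes):
    """Return the movie titles found in an XML document given as bytes."""
    handler = TitleCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.titles


if __name__ == "__main__":
    sample = (
        b'<collection>'
        b'<movie title="Enemy Behind"><year>2003</year></movie>'
        b'<movie title="Transformers"><year>1989</year></movie>'
        b'</collection>'
    )
    print(collect_titles(sample))
```

Because the handler stores nothing but the titles, memory use stays small even when the input file is large.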
To process XML the SAX way in Python, first import the parse function from xml.sax and the ContentHandler class from xml.sax.handler.
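A minimal sketch of those two imports working together might look like this. ElementCounter and count_elements are illustrative names, and an in-memory io.BytesIO object stands in for a file on disk so the example is self-contained; parse also accepts a filename.

```python
import io
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Counts how many times each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(source):
    """source may be a filename or a file-like object opened in binary mode."""
    handler = ElementCounter()
    parse(source, handler)
    return handler.counts


if __name__ == "__main__":
    document = io.BytesIO(b"<collection><movie/><movie/></collection>")
    print(count_elements(document))
```

The full example that follows uses xml.sax.make_parser instead of parse, which gives access to parser features before parsing starts.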
saxDemo.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import xml.sax
import xml.sax.handler


class MovieHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.CurrentData = ""
        self.type = ""
        self.format = ""
        self.year = ""
        self.rating = ""
        self.stars = ""
        self.description = ""

    # Called when an element starts
    def startElement(self, tag, attributes):
        self.CurrentData = tag
        if tag == "movie":
            print("*****Movie*****")
            title = attributes["title"]
            print("Title:", title)

    # Called when an element ends
    def endElement(self, tag):
        if self.CurrentData == "type":
            print("Type:", self.type)
        elif self.CurrentData == "format":
            print("Format:", self.format)
        elif self.CurrentData == "year":
            print("Year:", self.year)
        elif self.CurrentData == "rating":
            print("Rating:", self.rating)
        elif self.CurrentData == "stars":
            print("Stars:", self.stars)
        elif self.CurrentData == "description":
            print("Description:", self.description)
        # Reset so whitespace between elements is not stored as a value
        self.CurrentData = ""

    # Called when character data is read
    def characters(self, content):
        if self.CurrentData == "type":
            self.type = content
        elif self.CurrentData == "format":
            self.format = content
        elif self.CurrentData == "year":
            self.year = content
        elif self.CurrentData == "rating":
            self.rating = content
        elif self.CurrentData == "stars":
            self.stars = content
        elif self.CurrentData == "description":
            self.description = content


if __name__ == "__main__":
    # Create an XMLReader
    parser = xml.sax.make_parser()
    # Turn off namespace processing
    parser.setFeature(xml.sax.handler.feature_namespaces, 0)
    # Install our ContentHandler subclass
    handler = MovieHandler()
    parser.setContentHandler(handler)
    parser.parse("movies.xml")
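One caveat with the handler above: a SAX parser is allowed to deliver the text of a single element in several characters() calls, so assigning content directly can keep only the last chunk. The sketch below is one possible variation, not the original author's code: it buffers the chunks and builds a list of dictionaries, which also illustrates the "build your own object model" use case. MovieCollector, FIELDS and load_movies are names invented for this example.

```python
import xml.sax

# Child elements of <movie> whose text we want to keep (taken from movies.xml).
FIELDS = {"type", "format", "year", "rating", "stars", "description"}


class MovieCollector(xml.sax.ContentHandler):
    """Builds one dict per <movie>, joining text that arrives in chunks."""

    def __init__(self):
        super().__init__()
        self.movies = []
        self._current = None  # dict for the movie being read, if any
        self._buffer = None   # list of text chunks for the current field

    def startElement(self, name, attrs):
        if name == "movie":
            self._current = {"title": attrs.get("title", "")}
        elif name in FIELDS and self._current is not None:
            self._buffer = []

    def characters(self, content):
        # Only collect text while inside one of the tracked fields.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "movie" and self._current is not None:
            self.movies.append(self._current)
            self._current = None
        elif name in FIELDS and self._buffer is not None:
            self._current[name] = "".join(self._buffer).strip()
            self._buffer = None


def load_movies(xml_bytes):
    """Return a list of movie dicts parsed from XML given as bytes."""
    handler = MovieCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.movies


if __name__ == "__main__":
    sample = b'<collection><movie title="Enemy Behind"><year>2003</year></movie></collection>'
    print(load_movies(sample))
```

Separating collection from printing also means the caller decides what to do with the parsed data.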
Output
*****Movie*****
Title: Enemy Behind
Type: love中国
Format: DVD
Year: 2003
Rating: PG
Stars: 10
Description: Talk about a US-Japan war
*****Movie*****
Title: Transformers
Type: Anime, Science Fiction
Format: DVD
Year: 1989
Rating: R
Stars: 8
Description: A schientific fiction
The run result is shown in the figure below:
Contents of movies.xml:
<?xml version="1.0" encoding="utf-8"?>
<collection shelf="New Arrivals">
    <movie title="Enemy Behind">
        <type>love中国</type>
        <format>DVD</format>
        <year>2003</year>
        <rating>PG</rating>
        <stars>10</stars>
        <description>Talk about a US-Japan war</description>
    </movie>
    <movie title="Transformers">
        <type>Anime, Science Fiction</type>
        <format>DVD</format>
        <year>1989</year>
        <rating>R</rating>
        <stars>8</stars>
        <description>A schientific fiction</description>
    </movie>
</collection>
I hope this article is helpful for your Python programming.
Original link: https://blog.csdn.net/nuli888/article/details/51970788