
The urlparse module:
1. urlparse()
The code and its output are as follows:
>>> url = 'http://i.cnblogs.com/EditPosts.aspx?opt=1'
>>> from urlparse import urlparse
>>> parsed = urlparse(url)
>>> print parsed
ParseResult(scheme='http', netloc='i.cnblogs.com', path='/EditPosts.aspx', params='', query='opt=1', fragment='')
The result can also be indexed like a tuple:
>>> from urlparse import urlparse
>>> url = "http://localhost:8080"
>>> name = urlparse(url)[1]
>>> print name
localhost:8080
The port number can then be obtained by splitting name on ':' and taking the part after it.
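The split described above can be sketched as follows. This is only a minimal sketch: the try/except import is an assumption added so the snippet also runs on Python 3, where the module lives at urllib.parse, and the variable names host and port_text are illustrative. It handles a plain host:port netloc only, not IPv6 literals or credentials.

```python
try:
    from urllib.parse import urlparse  # Python 3 location of the module
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this post

url = "http://localhost:8080"
parsed = urlparse(url)
name = parsed[1]  # index 1 is the netloc field

# Split "host:port" manually; port_text is empty when no port is given.
host, _, port_text = name.partition(':')
port = int(port_text) if port_text else None

print(host)  # localhost
print(port)  # 8080
```

The result object also exposes hostname and port attributes (parsed.hostname, parsed.port), which return the same values here without any manual splitting and are probably the safer choice for less regular URLs.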
2. urlsplit()
The code and its output are as follows:
>>> from urlparse import urlsplit
>>> url = 'http://i.cnblogs.com/EditPosts.aspx?opt=1'
>>> parsed2 = urlsplit(url)
>>> print parsed2
SplitResult(scheme='http', netloc='i.cnblogs.com', path='/EditPosts.aspx', query='opt=1', fragment='')
Note: the urlsplit result has one field fewer than the urlparse result: params. urlsplit does not separate params from the path.
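To make the difference concrete, the sketch below parses a URL whose last path segment carries a ';' parameter. The URL is a made-up example, not one from this post, and the try/except import is an assumption added so the snippet also runs on Python 3 (urllib.parse).

```python
try:
    from urllib.parse import urlparse, urlsplit  # Python 3
except ImportError:
    from urlparse import urlparse, urlsplit      # Python 2, as used in this post

# Hypothetical URL with a ';' parameter on the last path segment.
url = 'http://example.com/a/b;type=x?opt=1'

p = urlparse(url)
s = urlsplit(url)

print(p.path)    # /a/b
print(p.params)  # type=x
print(s.path)    # /a/b;type=x  (the parameter stays in the path)
print(len(p))    # 6
print(len(s))    # 5
```

When the URL contains no ';' parameter, as in the examples above, the two functions agree on every shared field and params is simply an empty string.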