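I want to download images from a website, but the image URLs only appear after the browser has executed the page's JavaScript. With urllib2's urlopen I only get the raw page source, which still contains the JavaScript code.
I would like to do this with wx.lib.iewin from the wx module, but I don't know how to use it and couldn't find any documentation. Any help from the experts would be appreciated!
Thanks!
Here is a test URL: http://www.cnbeta.com/articles/105666.htm
How can I get the user comments on that page with Python?
6 answers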
#1
My plan is to open the page with wx.lib.iewin.IEHtmlWindow from wxPython, let it execute the page's JavaScript, and then read the resulting page source. However, I ran into a problem when fetching the source.
Running the code below always produces the following error:
> "C:\Python25\python.exe" -u "d:\My Documents\Python\个人\图片采集\继续尝试.py"
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\comtypes\client\_events.py", line 111, in error_printer
    return func(*args, **kw)
  File "d:\My Documents\Python\个人\图片采集\继续尝试.py", line 22, in DocumentComplete
    s = self.html.GetText()
  File "C:\Python25\Lib\site-packages\wx-2.8-msw-unicode\wx\lib\iewin.py", line 111, in GetText
    if self.ctrl.Document is None:
  File "C:\Python25\Lib\site-packages\comtypes\__init__.py", line 267, in __getattr__
    raise AttributeError(name)
AttributeError: Document
Please help, experts! Thanks in advance!
#!/usr/bin/env python
# coding=utf-8
import wx
import wx.lib.iewin


class MyFrame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, parent=None, id=-1, pos=wx.DefaultPosition, title=u'iewin窗口')
        panel = wx.Panel(self)
        # Embed the IE browser control in the panel and start loading the page.
        self.html = wx.lib.iewin.IEHtmlWindow(panel, -1, pos=wx.DefaultPosition, style=0, name='OK')
        self.html.LoadUrl('http://www.python.org')
        # Lay out the browser so that it fills the frame.
        sizer = wx.BoxSizer(wx.HORIZONTAL)
        sizer.Add(self.html, 1, wx.ALL | wx.EXPAND, 0)
        panel.SetSizer(sizer)
        sizer.Fit(self)
        # Register this frame to receive browser events such as DocumentComplete.
        self.html.AddEventSink(self)

    def DocumentComplete(self, pDisp, URL):
        # Called when the page has finished loading; read the source here.
        s = self.html.GetText()
        print(s)


if __name__ == '__main__':
    app = wx.PySimpleApp()
    frame = MyFrame()
    frame.Show()
    app.MainLoop()
#2
Still slowly looking for an answer...
(The forum actually complained that my post was too short!)
#3
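Strange. Was something wrong with my wxPython installation? I uninstalled Python and wxPython, reinstalled both, and changed the URL to http://www.wxpython.org. Now the code above runs without errors and I do get the page source, but the next problem is displaying Chinese text correctly!
Good grief!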
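Once the rendered source is available as a string, the image URLs the original question asks about can be collected with the standard library. The following is only a minimal sketch, not code from this thread: the names ImageCollector and image_urls and the example.com sample are made up for illustration, and it is written for Python 3 (on Python 2 the corresponding modules are HTMLParser and urlparse).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Hypothetical helper: collects absolute URLs from <img src=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page address.
                self.urls.append(urljoin(self.base_url, src))


def image_urls(rendered_html, base_url):
    """Return every image URL found in the already rendered HTML."""
    collector = ImageCollector(base_url)
    collector.feed(rendered_html)
    return collector.urls


# Made-up sample standing in for the text returned by GetText().
sample = '<div><img src="/pics/a.jpg"><img src="http://example.com/b.png"></div>'
print(image_urls(sample, "http://example.com/articles/1.htm"))
```

In the frame above, the string obtained in DocumentComplete would be passed in place of sample, together with the address of the loaded page.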
#4
The Chinese display problem is easy to deal with: just decode the text.
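As a rough illustration of "just decode it", the sketch below tries a list of candidate encodings and returns the first one that works. It assumes the page arrives as raw bytes and that it is either UTF-8 or a GBK-family encoding; neither assumption is confirmed in this thread, and decode_page is a made-up helper name.

```python
def decode_page(raw, encodings=("utf-8", "gb18030")):
    """Decode raw bytes with the first candidate encoding that succeeds."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort: decode anyway and mark undecodable bytes.
    return raw.decode(encodings[0], errors="replace")


# GBK-encoded sample; gb18030 is a superset of GBK, so it decodes cleanly.
gbk_bytes = u"用户评论".encode("gbk")
print(decode_page(gbk_bytes) == u"用户评论")
```

Trying UTF-8 first is a deliberate choice: GBK-encoded Chinese is rarely valid UTF-8, so a wrong guess fails loudly instead of silently producing garbage.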
#5
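Yes, the Chinese text problem is easy to solve. I was only waiting for someone to reply so that I could close the thread. Congratulations!
I have posted the code that solved the problem on my blog: http://blog.csdn.net/dongnanyanhai/archive/2010/03/07/5353684.aspx
Thread closed!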
#6
I went from the blog to the forum and from the forum back to the blog. Could the original poster explain what each of these calls does, especially this part:
self.html = wx.lib.iewin.IEHtmlWindow(panel,-1,pos = wx.DefaultPosition,style = 0,name = 'OK')
self.html.LoadUrl('http://www.python.org')
sizer = wx.BoxSizer(wx.HORIZONTAL)
sizer.Add(self.html,1, wx.ALL|wx.EXPAND,0)
panel.SetSizer(sizer)
sizer.Fit(self)
self.html.AddEventSink(self)
Also, why does my GetText() return nothing? I only want to fetch the page source into a string; I don't need it displayed.
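One possible reason for an empty result, assuming GetText() is being called straight after LoadUrl() rather than inside DocumentComplete, is that the page loads asynchronously and is only complete once the GUI event loop has run. The sketch below imitates that ordering in plain Python. FakeBrowser and SourceGrabber are hypothetical stand-ins invented for this illustration; they are not the wx.lib.iewin API.

```python
class FakeBrowser(object):
    """Hypothetical stand-in for a browser control (not wx.lib.iewin)."""

    def __init__(self):
        self._pending = None
        self._text = ""
        self._sinks = []

    def add_event_sink(self, sink):
        self._sinks.append(sink)

    def load_url(self, url):
        # Loading only starts here; nothing is available yet.
        self._pending = url

    def get_text(self):
        return self._text

    def run_event_loop(self):
        # Simulates the main loop finishing the load and notifying the sinks.
        if self._pending is not None:
            url, self._pending = self._pending, None
            self._text = "<html>rendered %s</html>" % url
            for sink in self._sinks:
                sink.document_complete(url)


class SourceGrabber(object):
    """Stores the page source when the load-finished callback fires."""

    def __init__(self, browser):
        self.browser = browser
        self.source = None
        browser.add_event_sink(self)

    def document_complete(self, url):
        # The load has finished, so reading the text is safe here.
        self.source = self.browser.get_text()


browser = FakeBrowser()
grabber = SourceGrabber(browser)
browser.load_url("http://www.python.org")
too_early = browser.get_text()  # read before the event loop has run
browser.run_event_loop()
print(repr(too_early))
print(grabber.source)
```

In the quoted snippet the roles are similar: AddEventSink(self) registers the frame for browser events, the DocumentComplete method is the callback, and the sizer calls only control layout. The source can be stored in an attribute inside DocumentComplete instead of being printed.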