I am trying to scrape a picture from a link and save it to an image file. The request response returns a byte stream, so I am using decode('utf-8') to convert it to a Unicode string. However, I am getting the following error:
print (info.decode(('utf-8')))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
from urllib import request
img = request.urlopen('http://www.py4inf.com/cover.jpg')
fhand = open('cover.jpg', 'w')
size = 0
while True:
    info = img.read(100000)
    if len(info) < 1 : break
    size = size + len(info)
    print (info.decode(('utf-8')))
    fhand.write(info.decode(('utf-8')))
print (size,'characters copied.')
fhand.close()
Please let me know how I can proceed. Thanks.
2 solutions
#1
2
The file should be opened in binary mode, and then you can copy the stream byte for byte. Since shutil already has a handy helper utility for this, you can do the following:
import shutil
import os
from urllib import request
img = request.urlopen('http://www.py4inf.com/cover.jpg')
with open('cover.jpg', 'wb') as fhand:
    # Copy the response to the file in binary chunks, with no decoding.
    shutil.copyfileobj(img, fhand)
print(os.stat('cover.jpg').st_size, 'bytes copied')
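As a rough illustration of why this works, shutil.copyfileobj only needs two binary file-like objects, so the same call can be tried without any network access. The sketch below uses in-memory io.BytesIO buffers in place of the HTTP response and the output file; the helper name copy_stream, the chunk size, and the sample bytes are assumptions made for the example, not part of the answer above.

```python
import io
import shutil


def copy_stream(src, dst, chunk_size=64 * 1024):
    """Copy one binary file-like object to another (hypothetical helper)."""
    # copyfileobj reads chunk_size bytes at a time and writes them unchanged.
    shutil.copyfileobj(src, dst, chunk_size)


# Stand-in for the HTTP response body: a few bytes that are not valid UTF-8.
payload = b"\xff\xd8\xff\xe0" + b"\x00" * 10
src = io.BytesIO(payload)
dst = io.BytesIO()  # stand-in for a file opened with mode 'wb'

copy_stream(src, dst)
copied = dst.getvalue()
print(len(copied), "bytes copied")
```

Since nothing is ever decoded, bytes such as 0xff pass through untouched.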
#2
3
Don't use Unicode transformations for JPG images.
Unicode is for text. What you are downloading is not text, it is something else.
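To make the point concrete, here is a minimal sketch that reproduces the error from the question without downloading anything. It assumes the image begins with the usual JPEG start-of-image bytes (0xff 0xd8); the variable names are made up for the example.

```python
# Typical first bytes of a JPEG file (assumed here for illustration).
jpeg_header = b"\xff\xd8\xff\xe0"

try:
    jpeg_header.decode("utf-8")
    decoded = True
    message = ""
except UnicodeDecodeError as exc:
    # 0xff can never start a valid UTF-8 sequence, so decoding fails at once.
    decoded = False
    message = str(exc)

print(decoded)
print(message)
```

The bytes are perfectly valid image data; they simply are not UTF-8 text, so they should be written to the file as-is.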
Try this:
from urllib import request
img = request.urlopen('http://www.py4inf.com/cover.jpg')
fhand = open('cover.jpg', 'wb')
size = 0
while True:
    info = img.read(100000)
    if len(info) < 1 : break
    size = size + len(info)
    # Write the raw bytes unchanged; the file is open in binary mode.
    fhand.write(info)
print (size,'bytes copied.')
fhand.close()
Or, more simply:
from urllib import request
request.urlretrieve('http://www.py4inf.com/cover.jpg', 'cover.jpg')