I'm trying to get a Python 3 program to do some manipulations with a text file filled with information. However, when trying to read the file I get the following error:
Traceback (most recent call last):
  File "SCRIPT LOCATION", line NUMBER, in <module>
    text = file.read()
  File "C:\Python31\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 2907500: character maps to <undefined>
If anyone could give me any help to try and get past this problem I would be most grateful.
2 Answers
#1
380
The file in question is not using the CP1252 encoding, which is the default Python picked here. It's using another encoding, and which one you will have to figure out yourself. Common ones are Latin-1 and UTF-8. Since 0x90 is not a printable character in Latin-1 (at most a C1 control code), UTF-8 (where 0x90 is a continuation byte) is more likely.
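The reasoning about byte 0x90 can be checked directly in the interpreter. This is a small sketch, not part of the original answer: `try_decode` is a hypothetical helper introduced only for illustration, and the two-byte sample is the UTF-8 encoding of the Cyrillic letter А (U+0410), chosen because its second byte happens to be 0x90.

```python
def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


lone_byte = b"\x90"
utf8_sample = b"\xd0\x90"  # UTF-8 bytes of U+0410; 0x90 is the continuation byte

# CP1252 leaves 0x90 unassigned, which is the error from the question.
cp1252_result = try_decode(lone_byte, "cp1252")

# Python's latin-1 codec maps every byte to a code point, so 0x90 becomes
# the invisible control character U+0090 rather than raising.
latin1_result = try_decode(lone_byte, "latin-1")

# On its own 0x90 is not valid UTF-8, but after a lead byte it is.
utf8_lone_result = try_decode(lone_byte, "utf-8")
utf8_pair_result = try_decode(utf8_sample, "utf-8")
```

Note that a successful Latin-1 decode proves nothing by itself, because that codec accepts any byte sequence.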
You specify the encoding when you open the file:
file = open(filename, encoding="utf8")
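A slightly fuller sketch of the same call, assuming the file really is UTF-8. The name `read_text` and the temporary demo file are illustrative only and do not come from the question; the `with` block just makes sure the file is closed again.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file using an explicit encoding instead of the platform default."""
    with open(path, encoding=encoding) as handle:
        return handle.read()


# Build a throwaway UTF-8 file whose bytes include 0x90 (illustrative data).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as raw:
    raw.write("\u0410\u0411\u0412\n".encode("utf-8"))

text = read_text(demo_path)
os.remove(demo_path)
```

If a handful of bytes still fail to decode, `open()` also accepts an `errors` argument (for example `errors="replace"`), which substitutes U+FFFD for undecodable bytes instead of raising. That is useful for inspecting the file, but it hides the problem rather than fixing it.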
#2
17
As an extension to @LennartRegebro's answer:
If you can't tell what encoding it is, the solution above does not work (it's not utf8), and you find yourself merely guessing, there are online tools that you can use to identify the encoding. They aren't perfect but usually work just fine. Once you have figured out the encoding, you should be able to use the solution above.
EDIT: (Copied from comment)
Sublime Text, a quite popular text editor, has a command to display the encoding if it has been set:
- Go to View -> Show Console (or Ctrl+`).
- Type view.encoding() into the field at the bottom and hope for the best (I was unable to get anything but Undefined, but maybe you will have better luck...).
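If no editor or online tool is at hand, the guessing can also be done in Python itself. The sketch below is an assumption-laden illustration, not a real detector: `detect_encoding` is a hypothetical helper, and the candidate list and its order are arbitrary choices. It simply returns the first candidate that decodes the bytes without error.

```python
def detect_encoding(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes `data` cleanly, else None.

    Strict encodings should come first: latin-1 accepts every byte sequence,
    so it only makes sense as a last resort.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Illustrative byte strings, not taken from the question's file.
utf8_guess = detect_encoding(b"\xd0\x90")      # valid UTF-8 sequence
cp1252_guess = detect_encoding(b"caf\xe9")     # invalid UTF-8, valid CP1252
fallback_guess = detect_encoding(b"\x90")      # only latin-1 accepts this
no_guess = detect_encoding(b"\x90", candidates=("utf-8", "cp1252"))
```

To apply it to a real file, read the bytes with `open(filename, "rb")` and pass them in. A clean decode only shows that the bytes are valid in that encoding, not that the text is what the author wrote, so the result is still worth checking by eye.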