Fixing the error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0

Date: 2023-01-04 20:44:08

Reference:

https://www.cnblogs.com/Alier/p/6794719.html

Code:

stopwords = pd.read_csv("stopwords.txt",index_col=False,quoting=3,sep=" ",names=['stopword'],encoding='UTF-8')

Error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0

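This is an encoding/decoding problem in Python. The error says that the 'utf-8' codec cannot decode the byte at position 0 (0xa1). In UTF-8, 0xa1 is a continuation byte and can never begin a character, so the file is not valid UTF-8; a Chinese stopword file that starts with 0xa1 was most likely saved in a GB-family encoding such as GBK or GB18030.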
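The failure can be reproduced without any file. The sketch below is only an illustration: the sample character "、" is my own choice (it is not taken from the original stopwords.txt), picked because its GB18030 encoding happens to start with 0xa1.

```python
# Illustrative sample only: "、" (U+3001) encodes to the bytes 0xa1 0xa2 in GB18030.
raw = "、".encode("gb18030")
print(raw)

# Decoding those bytes as UTF-8 fails, because 0xa1 cannot start a UTF-8 character.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc
    print(exc)

# Decoding with the encoding the bytes were written in succeeds.
decoded = raw.decode("gb18030")
print(decoded)
```

The bytes themselves are fine; they only need to be decoded with the same encoding that was used to write them.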

Solution:

stopwords = pd.read_csv("stopwords.txt",index_col=False,quoting=3,sep=" ",names=['stopword'],encoding='gb18030')

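In other words, pass the encoding explicitly when reading the data, here encoding='gb18030'. If your file was saved with a different encoding, try that one instead.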
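Trying encodings by hand can be automated. The following is a minimal sketch, not part of the original fix: the helper name pick_encoding and the candidate list are assumptions of mine. It tries UTF-8 first because UTF-8 rejects most non-UTF-8 input, whereas GB18030 will accept many byte sequences that were never written as GB18030.

```python
def pick_encoding(path, candidates=("utf-8", "gb18030")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper for illustration; a successful decode does not
    prove the guess is right, so inspect the result before relying on it.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the file, try the next one
    raise ValueError(f"none of {candidates} could decode {path!r}")
```

The returned name can then be passed on, for example as encoding=pick_encoding("stopwords.txt") in the pd.read_csv call above.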