Error when importing a CSV file with Pandas using encoding = 'UTF-8'
#-*- coding: utf-8 -*-
import pandas as pd

inputfile = 'data/huizong.csv'     # combined review file
outputfile = 'data/meidi_jd1.txt'  # output path for the extracted reviews

data = pd.read_csv(inputfile, encoding = 'utf-8')
# keep the 评论 (review) column for rows whose 品牌 (brand) is 美的 (Midea)
data = data[[u'评论']][data[u'品牌'] == u'美的']
data.to_csv(outputfile, index = False, header = False, encoding = 'utf-8')

Running it raises:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 0: invalid continuation byte
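Changing the argument to encoding = 'gb18030' makes the import work, which indicates the file was saved in a GB-family encoding (GB2312/GBK/GB18030) rather than UTF-8. This problem comes up often when loading Chinese text in Python: many tutorials run fine on Python 2.7, but the same code may fail when actually run under Python 3.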
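#-*- coding: utf-8 -*-
import pandas as pd

inputfile = 'data/huizong.csv'     # combined review file
outputfile = 'data/meidi_jd1.txt'  # output path for the extracted reviews

# read the source file as GB18030 instead of UTF-8
data = pd.read_csv(inputfile, encoding = 'gb18030')
# keep the 评论 (review) column for rows whose 品牌 (brand) is 美的 (Midea)
data = data[[u'评论']][data[u'品牌'] == u'美的']
# the output is still written as UTF-8
data.to_csv(outputfile, index = False, header = False, encoding = 'utf-8')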
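If you do not know in advance whether a file is UTF-8 or GB18030, one possible approach is to try each encoding in turn and use the first one that decodes cleanly. The sketch below is only an illustration: the helper name `detect_encoding` and the candidate list are assumptions, not part of Pandas, and the helper reads the whole file into memory, so it suits small files only.

```python
from pathlib import Path


def detect_encoding(path, candidates=("utf-8", "gb18030")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper for illustration. The order matters: UTF-8 is
    tried first because it is strict and rejects most GBK-encoded text.
    """
    raw = Path(path).read_bytes()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
        return enc
    raise ValueError("none of the candidate encodings could decode " + str(path))
```

The result can then be passed to Pandas, for example `pd.read_csv(inputfile, encoding = detect_encoding(inputfile))`. A successful decode does not prove that the encoding is right, so it is still worth checking the first few rows by eye.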