UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128) [duplicate]

This question already has an answer here:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128) 18 answers

I have this code:

    printinfo = title + "\t" + old_vendor_id + "\t" + apple_id + '\n'
    # Write file
    f.write (printinfo + '\n')

But I get this error when running it:

    f.write(printinfo + '\n')
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)

It's having trouble writing out this:

Identité secrète (Abduction) [VF]

Any ideas, please? I'm not sure how to fix this.

Cheers.

UPDATE: This is the bulk of my code, so you can see what I am doing:

    def runLookupEdit(self, event):
        newpath1 = pathindir + "/"
        errorFileOut = newpath1 + "REPORT.csv"
        f = open(errorFileOut, 'w')

        global old_vendor_id

        for old_vendor_id in vendorIdsIn.splitlines():
            writeErrorFile = 0
            from lxml import etree
            parser = etree.XMLParser(remove_blank_text=True) # makes pretty print work

            path1 = os.path.join(pathindir, old_vendor_id)
            path2 = path1 + ".itmsp"
            path3 = os.path.join(path2, 'metadata.xml')

            # Open and parse the xml file
            cantFindError = 0
            try:
                with open(path3): pass
            except IOError:
                cantFindError = 1
                errorMessage = old_vendor_id
                self.Error(errorMessage)
                break
            tree = etree.parse(path3, parser)
            root = tree.getroot()

            for element in tree.xpath('//video/title'):
                title = element.text
                while '\n' in title:
                    title = title.replace('\n', ' ')
                while '\t' in title:
                    title = title.replace('\t', ' ')
                while '  ' in title:
                    title = title.replace('  ', ' ')
                title = title.strip()
                element.text = title
            print title

            #########################################
            ######## REMOVE UNWANTED TAGS ########
            #########################################

            # Remove the comment tags
            comments = tree.xpath('//comment()')
            q = 1
            for c in comments:
                p = c.getparent()
                if q == 3:
                    apple_id = c.text
                p.remove(c)
                q = q+1

            apple_id = apple_id.split(':',1)[1]
            apple_id = apple_id.strip()
            printinfo = title + "\t" + old_vendor_id + "\t" + apple_id

            # Write file
            # f.write (printinfo + '\n')
            f.write(printinfo.encode('utf8') + '\n')
        f.close()

1 solution

#1

You need to encode Unicode explicitly before writing to a file, otherwise Python does it for you with the default ASCII codec.
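
To see where the error comes from, here is a minimal sketch that reproduces the failure by doing the ASCII encoding by hand. It assumes the title is the one quoted in the question; the variable names `failed_at` and `bad_char` are only for illustration.

```python
from __future__ import print_function

# The title from the question, written with escapes so the source stays ASCII.
title = u'Identit\xe9 secr\xe8te (Abduction) [VF]'

try:
    # Roughly what Python 2 attempts implicitly when a unicode object is
    # written to a file opened with the built-in open().
    title.encode('ascii')
except UnicodeEncodeError as exc:
    failed_at = exc.start        # index of the first character ASCII cannot represent
    bad_char = title[failed_at]  # the offending character itself
    print(failed_at, repr(bad_char))
```

The reported position lines up with the "position 7" in the traceback: that is the first "é" in the title.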

Pick an encoding and stick with it:

    f.write(printinfo.encode('utf8') + '\n')
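
As a self-contained sketch of the explicit-encode approach: the row contents (`vendor123`, `456`) and the temporary `report_path` are made up for illustration. Unlike the one-liner, it opens the file in binary mode and appends `b'\n'`; that is an assumption made here so the same code also runs on Python 3, where bytes and str cannot be concatenated.

```python
import os
import tempfile

# Made-up row in the same tab-separated shape as the question's printinfo.
printinfo = u'Identit\xe9 secr\xe8te (Abduction) [VF]\tvendor123\t456'

# Hypothetical output path inside a fresh temporary directory.
report_path = os.path.join(tempfile.mkdtemp(), 'REPORT.csv')

# Binary mode: the file receives exactly the bytes produced by encode().
with open(report_path, 'wb') as f:
    f.write(printinfo.encode('utf8') + b'\n')

# Read the raw bytes back to check what ended up on disk.
with open(report_path, 'rb') as f:
    raw = f.read()
```

Decoding `raw` with the same codec (`raw.decode('utf8')`) gives back the original text, which is the point of picking one encoding and sticking with it.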

or use io.open() to create a file object that'll encode for you as you write to the file:

    import io

    f = io.open(filename, 'w', encoding='utf8')
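
A slightly fuller sketch of the `io.open()` route, assuming a hypothetical temporary `filename` and made-up vendor and Apple IDs. The file object does the encoding, so the code only ever handles unicode text.

```python
import io
import os
import tempfile

# Hypothetical output path inside a fresh temporary directory.
filename = os.path.join(tempfile.mkdtemp(), 'REPORT.csv')

# Made-up rows: (title, old_vendor_id, apple_id), all unicode.
rows = [
    (u'Identit\xe9 secr\xe8te (Abduction) [VF]', u'vendor123', u'456'),
]

# The text-mode file encodes to UTF-8 on every write.
with io.open(filename, 'w', encoding='utf8') as f:
    for title, old_vendor_id, apple_id in rows:
        f.write(u'\t'.join((title, old_vendor_id, apple_id)) + u'\n')

# Reading with the same encoding decodes back to unicode text.
with io.open(filename, 'r', encoding='utf8') as f:
    round_tripped = f.read()
```

Note that on Python 2 a file opened this way accepts only unicode objects; passing a byte string to `f.write()` raises a `TypeError`, hence the `u'...'` literals.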

You may want to read:

The Python Unicode HOWTO

Pragmatic Unicode by Ned Batchelder

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky

before continuing.
