Python3: UnicodeEncodeError only when run from crontab

Time: 2023-01-04 20:34:32

First post, so please be kind. I have searched around a lot, but most of what I found is relevant to Python 2.

I have a Python3 script that builds a zip file from a file list. It fails with UnicodeEncodeError only when the script is run from crontab; it works flawlessly when run from an interactive console. I guess there must be something in the environment, but I just can't figure out what.

This is the code excerpt:

def zipFileList(self, rootfolder, filelist, zip_file, logger):
    count = 0

    logger.info("Generazione file zip {0}: da {1} files".format(zip_file, len(filelist)))
    zip = zipfile.ZipFile(zip_file, "w", compression=zipfile.ZIP_DEFLATED)

    for curfile in filelist:
        zip.write(os.path.join(rootfolder, curfile), curfile, zipfile.ZIP_DEFLATED)
        count = count + 1

    zip.close()
    logger.info("Scrittura terminata: {0} files".format(count))

And this is the log output for this code fragment:

2012-07-31 09:10:03,033: root - ERROR - Traceback (most recent call last):
  File "/usr/local/lib/python3.2/zipfile.py", line 365, in _encodeFilenameFlags
    return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode characters in position 56-57: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "XBE.py", line 45, in main
    pam.executeList(logger)
  File "/home/vte/vtebackup/vte41/scripts/ptActivityManager.py", line 62, in executeList
    self.executeActivity(act, logger)
  File "/home/vte/vtebackup/vte41/scripts/ptActivityManager.py", line 71, in executeActivity
    self.exAct_FileBackup(act, logger)
  File "/home/vte/vtebackup/vte41/scripts/ptActivityManager.py", line 112, in exAct_FileBackup
    ptfs.zipFileList(srcfolder, filelist, arcfilename, logger)
  File "/home/vte/vtebackup/vte41/scripts/ptFileManager.py", line 143, in zipFileList
    zip.write(os.path.join(rootfolder, curfile), curfile, zipfile.ZIP_DEFLATED)
  File "/usr/local/lib/python3.2/zipfile.py", line 1115, in write
    self.fp.write(zinfo.FileHeader())
  File "/usr/local/lib/python3.2/zipfile.py", line 355, in FileHeader
    filename, flag_bits = self._encodeFilenameFlags()
  File "/usr/local/lib/python3.2/zipfile.py", line 367, in _encodeFilenameFlags
    return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 56: surrogates not allowed

This is the crontab line:

10 9 * * * /home/vte/vtebackup/vte41/scripts/runbackup.sh >/dev/null 2>&1

And this is the content of runbackup.sh:

#! /bin/bash -l

cd /home/vte/vtebackup/vte41/scripts

/usr/local/bin/python3.2 XBE.py

The file on which the exception happens is always the same, but it doesn't seem to include any non-ASCII characters:

/var/vhosts/vte41/http_docs/vtecrm41/storage/2012/July/week4/169933_Puccini_Gabriele.tif

The OS is Ubuntu Linux 10.04 LTS and the Python version is 3.2 (installed side by side with other Python versions as an altinstall). All Python source files have this shebang

#!/usr/bin/env python3.2

as the very first line.

Can you help me find what's wrong and how to fix this problem?

3 answers

#1


15  

A team member found the resolution in a Python bug thread.

The issue was fixed by prepending a LANG directive to the script command:

* * * * * LANG=it_IT.UTF-8 /home/vte/vtebackup/vte41/scripts/runbackup.sh >/dev/null 2>&1

I hope this is useful for others, because I was scratching my head over this for a while :)

#2


4  

Check your locale. On the interactive console, run the command locale. Here is what I get:

LANG=
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

Python determines how to interpret filenames based on the locale, which comes from the LC_ALL, LC_CTYPE, or LANG environment variables, and I strongly suspect that these are unset or set to a different encoding in your cron environment.
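
To compare the two environments, a small diagnostic along these lines could be run once from the interactive shell and once from cron. This is only a sketch: the function name describe_encoding_environment is made up for illustration, and the values it reports will vary with the Python version and platform.

```python
import locale
import os
import sys


def describe_encoding_environment():
    """Collect the settings that influence how Python decodes file names."""
    # Locale-related environment variables; None means the variable is unset.
    info = {name: os.environ.get(name) for name in ("LANG", "LC_ALL", "LC_CTYPE")}
    # Encoding Python uses to convert between file names and str.
    info["filesystem_encoding"] = sys.getfilesystemencoding()
    # Encoding Python prefers for text data according to the locale.
    info["preferred_encoding"] = locale.getpreferredencoding()
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_encoding_environment().items()):
        print("{0}={1!r}".format(key, value))
```

If the cron run reports an ASCII encoding while the interactive run reports UTF-8, the environment is the likely culprit.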

If that's the case, your filenames will have been decoded to unicode using a different encoding, one that then results in filenames that cannot be encoded to UTF-8 or ASCII.
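
As a minimal sketch of that failure mode, assuming the script runs under an ASCII locale where Python decodes file names with the surrogateescape error handler: the file name below is hypothetical, and has_undecodable_bytes and find_undecodable are illustrative helpers, not part of the original script. The '\udcc3' in the traceback is what a raw byte 0xC3 turns into under that handler.

```python
# Hypothetical file name whose bytes on disk are UTF-8 ("e" with grave accent).
raw_name = b"caf\xc3\xa8.tif"

# What Python produces when the locale encoding is ASCII: every byte that
# cannot be decoded becomes a lone surrogate in the range U+DC80..U+DCFF.
as_seen_by_cron = raw_name.decode("ascii", errors="surrogateescape")

# What Python produces when the locale encoding is UTF-8.
as_seen_in_shell = raw_name.decode("utf-8")


def has_undecodable_bytes(name):
    """Return True if the name contains surrogates that UTF-8 refuses to encode."""
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def find_undecodable(names):
    """Return the names that would make zipfile fail when writing the header."""
    return [name for name in names if has_undecodable_bytes(name)]


if __name__ == "__main__":
    # ascii() keeps the output printable even on an ASCII-only stdout.
    print(ascii(as_seen_by_cron))
    print(ascii(find_undecodable([as_seen_by_cron, as_seen_in_shell])))
```

Running find_undecodable over the file list before zipping would point at the offending names, which may look harmless when displayed in a terminal.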

Simply set the LC_CTYPE variable in your cron definition, either on a line of its own preceding the time entry, or as part of the command to execute:

LC_CTYPE="en_US.UTF-8"
* * * * * yourscriptcommand.py
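
If you would rather have the script fail loudly than depend on the crontab being right, a guard like this could be called at startup. It is just a sketch: is_utf8 and check_filesystem_encoding are hypothetical names, and newer Python versions may pick UTF-8 on their own even when no locale is set, in which case the check simply passes.

```python
import codecs
import sys


def is_utf8(encoding_name):
    """Return True if the given codec name is an alias of UTF-8."""
    return codecs.lookup(encoding_name).name == "utf-8"


def check_filesystem_encoding():
    """Exit early if file names would not be decoded as UTF-8."""
    encoding = sys.getfilesystemencoding()
    if not is_utf8(encoding):
        sys.exit(
            "Filesystem encoding is {0!r}; set LANG or LC_CTYPE "
            "to a UTF-8 locale before running this script".format(encoding)
        )
```

Calling check_filesystem_encoding() at the top of the main function turns a confusing traceback deep inside zipfile into a one-line message in the cron log.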

As always with Python Unicode issues, the answer lies in the Unicode HOWTO, in the section on filenames.

#3


1  

For Chinese:

export LANG="zh_CN.utf-8"
export LC_CTYPE="zh_CN.utf-8"
export PYTHONIOENCODING="utf-8"

/export/zhangys/python3.5.2/bin/python3 diff_reporter.py > /home/admin/diff_script/cron_job.log 2>&1
