Extracting attachments from Outlook .msg files with Python's extract_msg module

Date: 2021-04-19 23:16:52

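I have a number of emails saved from Outlook as .msg files and now need to extract the attachments from them.

Outlook itself can save the attachments of a single message, but by default it does not support extracting attachments from many messages in bulk.

I considered several approaches. One of them is the extract_msg module for Python, which I record here.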

1. Install the extract_msg module with: pip install extract-msg. At the time of writing, the latest version was extract-msg 0.27.4,

released Sep 3, 2020. Project description: https://pypi.org/project/extract-msg

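2. After installation, the simplest usage is a single command on the command line, which unpacks the contents of the msg file into a subdirectory of the current directory (the directory name is derived from the message information).

# Creates a directory under the current directory, then unpacks the attachments and message.txt from the msg file into it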
python -m extract_msg qq_5201351.msg
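
Since the goal is bulk extraction, the command above can be repeated for every msg file in a folder. The following is only a minimal sketch: build_command and extract_all are hypothetical helper names, and it assumes nothing beyond the command-line form shown above.

```python
import subprocess
import sys
from pathlib import Path


def build_command(msg_path):
    """Build the same command as above, using the interpreter running this script."""
    return [sys.executable, "-m", "extract_msg", str(msg_path)]


def extract_all(folder="."):
    """Run the extractor once for each .msg file found directly in `folder`."""
    for msg_path in sorted(Path(folder).glob("*.msg")):
        # cwd=folder makes the output subdirectories appear inside `folder`.
        # check=False lets the loop continue if one message fails.
        result = subprocess.run(
            build_command(msg_path.resolve()), cwd=folder, check=False
        )
        if result.returncode != 0:
            print(f"extraction failed for {msg_path.name}", file=sys.stderr)


if __name__ == "__main__":
    extract_all(".")
```

Starting one process per file is slow for large folders, but it keeps a failure in one message from affecting the others.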

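3. In a .py file, the following extracts only the attachments. The directory that receives the attachments must exist before saving, so the code creates it first:

import os

import extract_msg

out_dir = "./qq_5201351_dir"
# The target directory must exist before attachments are saved into it
os.makedirs(out_dir, exist_ok=True)

msg = extract_msg.Message("qq_5201351.msg")

# msg.attachments is the list of attachments found in the message
for attachment in msg.attachments:
    attachment.save(customPath=out_dir)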
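
To cover the bulk case, the same calls can be wrapped in a loop over a folder. This is a sketch rather than a tested tool: target_dir_for, save_attachments and save_all are hypothetical names, the one-subdirectory-per-message layout is my own choice, and only Message, attachments, save(customPath=...) and close() are taken from extract_msg.

```python
from pathlib import Path


def target_dir_for(msg_path, out_root):
    """Map mail/report.msg to <out_root>/report so files from different messages do not collide."""
    return Path(out_root) / Path(msg_path).stem


def save_attachments(msg_path, out_root):
    """Save every attachment of one .msg file and return how many were saved."""
    import extract_msg  # imported here so the path helper works without the package

    msg = extract_msg.Message(str(msg_path))
    try:
        attachments = msg.attachments
        if not attachments:
            return 0
        out_dir = target_dir_for(msg_path, out_root)
        out_dir.mkdir(parents=True, exist_ok=True)
        for attachment in attachments:
            attachment.save(customPath=str(out_dir))
        return len(attachments)
    finally:
        # Release the underlying file handle even if saving fails.
        msg.close()


def save_all(folder, out_root):
    """Process every .msg file directly inside `folder`."""
    for msg_path in sorted(Path(folder).glob("*.msg")):
        count = save_attachments(msg_path, out_root)
        print(f"{msg_path.name}: {count} attachment(s)")


if __name__ == "__main__":
    save_all(".", "./attachments")
```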

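++++++ Unresolved problem >>>>:

1. The method above extracts the attachments, or the message body, from most msg files without trouble, but some of my msg files fail with the error below.

I have not yet found a solution. If you have found one, please leave a comment below. Thank you very much!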

Traceback (most recent call last):
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 422, in named
return self.__namedProperties
AttributeError: 'Message' object has no attribute '_MSGFile__namedProperties'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\QQ5201351\Desktop\mail\test\test.py", line 5, in <module>
msg = extract_msg.Message("Important_msg_from_qq5201351.msg")
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\message.py", line 28, in __init__
MessageBase.__init__(self, path, prefix, attachmentClass, filename, delayAttachments, overrideEncoding)
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\message_base.py", line 61, in __init__
self.named
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 424, in named
self.__namedProperties = Named(self)
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\named.py", line 63, in __init__
self.__properties.append(StringNamedProperty(entry, names[entry['id']], msg._getTypedData(streamID)) if entry['pkind'] == constants.STRING_NAMED else NumericalNamedProperty(entry, msg._getTypedData(streamID)))
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 177, in _getTypedData
found, result = self._getTypedStream('__substg1.0_' + id, prefix, _type)
File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 246, in _getTypedStream
raise NotImplementedError('The stream specified is of type {}. We don\'t currently understand exactly how this type works. If it is mandatory that you have the contents of this stream, please create an issue labled "NotImplementedError: _getTypedStream {}".'.format(_type, _type))
NotImplementedError: The stream specified is of type 1014. We don't currently understand exactly how this type works. If it is mandatory that you have the contents of this stream, please create an issue labled "NotImplementedError: _getTypedStream 1014".
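
This does not fix the error, but while it remains unresolved, a bulk run can at least skip the affected files and report them at the end. The sketch below assumes the failure always surfaces as NotImplementedError, as in the traceback above. run_skipping_unsupported and open_message are hypothetical names.

```python
def run_skipping_unsupported(paths, handler):
    """Call handler(path) for each path, collecting failures instead of stopping."""
    succeeded, failed = [], []
    for path in paths:
        try:
            handler(path)
        except NotImplementedError as exc:
            # The exception type raised by extract_msg in the traceback above.
            failed.append((path, str(exc)))
        else:
            succeeded.append(path)
    return succeeded, failed


def open_message(path):
    """Example handler: open the message, then close it again."""
    import extract_msg  # imported lazily, only needed when this handler runs

    extract_msg.Message(path).close()
```

For example, ok, bad = run_skipping_unsupported(["a.msg", "b.msg"], open_message) leaves in bad the files that still need another tool or a manual save from Outlook.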

Please respect the work of others. If you repost this article, be sure to cite the source: https://www.cnblogs.com/5201351/p/13695389.html