Python 3 Notes: Sending and Receiving Email

Time: 2022-10-05 08:31:54

Reference for these notes: Python教程 (Python tutorial)

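I. How an email is sent:

Sender -> MUA (mail user agent) -> MTA (mail transfer agent) -> MDA (mail delivery agent, i.e. the server that holds the mailbox) <- MUA <- Recipient
II. Sending mail with the SMTP protocol
Python supports SMTP, and the email package provides APIs for building messages, so Python can send plain-text mail, HTML mail and mail with attachments. The two main modules involved are email and smtplib. Before looking at them in detail, here is the general recipe: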

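The usual recipe for sending a mail with Python is:
1) Import the message class: from email.mime.text import MIMEText
2) Define the sender address, the password and the recipient address
3) Define the SMTP server address
4) Import smtplib
5) Connect to the server on port 25 (the default SMTP port)
6) Log in to that server and send the mail
7) Quit the SMTP session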

The steps in detail:
1) Define the sender address, the password and the recipient address:
from_addr = input('From: ')
password = input('Password: ')
to_addr = input('To: ')

2) Define the SMTP server address, connect to the SMTP server on port 25 and log the sender in. Sina's SMTP server is used as the example here:
smtp_server = input('SMTP server: ')

import smtplib
server = smtplib.SMTP(smtp_server, 25)  # 25 is the default SMTP port
server.set_debuglevel(1)  # only prints debugging output; set it to 0 to print nothing
server.login(from_addr, password)  # log in to the SMTP server

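3) Define the mail content and send it:
from email.mime.text import MIMEText
msg = MIMEText('hello world!', 'plain', 'utf-8')
# There can be several recipients, so they are passed as a list; msg is a Message object, so it is converted to a string
server.sendmail(from_addr, [to_addr], msg.as_string())
# Always quit the server session
server.quit()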
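
The same three steps can also be wrapped in a function, so that the session is closed even when login or sending raises an exception. This is only a minimal sketch: send_plain_text is a hypothetical helper name, not part of smtplib. It relies on smtplib.SMTP being usable as a context manager, which sends QUIT when the with-block ends.

```python
import smtplib
from email.mime.text import MIMEText


def send_plain_text(smtp_server, from_addr, password, to_addr, body, port=25):
    """Send a UTF-8 plain-text mail (hypothetical helper, same steps as above)."""
    msg = MIMEText(body, 'plain', 'utf-8')
    # Leaving the with-block ends the session, also when an exception is raised.
    with smtplib.SMTP(smtp_server, port) as server:
        server.login(from_addr, password)
        server.sendmail(from_addr, [to_addr], msg.as_string())
```

A call would look like send_plain_text(smtp_server, from_addr, password, to_addr, 'hello world!'), with the same values that were read with input() above.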

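A mail sent this way carries no From, To or Subject headers, so the recipient does not see a proper sender address or any other header information. To add them, build the message like this: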
from email import encoders
from email.header import Header
from email.mime.text import MIMEText
from email.utils import parseaddr, formataddr

import smtplib

# For encoding reasons the display name is always encoded as UTF-8; the address part is always a plain mailbox address, so it is left as it is
def __format_addr(s):
	name, addr = parseaddr(s)
	return formataddr((Header(name, 'utf-8').encode(), addr))

# Basic settings
from_addr = input('From:')
password = input('Password:')
to_addr = input('To: ')
smtp_server = input('SMTP server: ')

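# Define a plain-text mail
msg = MIMEText('hello world!', 'plain', 'utf-8')
# Set the sender, the recipient and the subject.
# The display name carries no colon: parseaddr reads "name:" as the start of an address group and drops the name.
msg['From'] = __format_addr('Your Daddy <%s>' % from_addr)
msg['To'] = __format_addr('To son <%s>' % to_addr)
msg['Subject'] = Header('A how are you from SMTP......', 'utf-8').encode()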

print(msg)
# server = smtplib.SMTP(smtp_server, 25)
# server.set_debuglevel(1)
# server.login(from_addr, password)
# server.sendmail(from_addr, [to_addr], msg.as_string())
# server.quit()
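
To see what the address helper actually produces, it helps to encode a non-ASCII display name and then decode it again the way a receiving client would. The sketch below is illustrative only: format_addr repeats the helper above under a plain name, decode_name is a hypothetical helper, and the address is made up.

```python
from email.header import Header, decode_header
from email.utils import formataddr, parseaddr


def format_addr(s):
    """Same idea as __format_addr above: encode only the display name."""
    name, addr = parseaddr(s)
    return formataddr((Header(name, 'utf-8').encode(), addr))


def decode_name(encoded):
    """Reverse step, as a receiving client would do it (hypothetical helper)."""
    name, addr = parseaddr(encoded)
    raw, charset = decode_header(name)[0]
    if isinstance(raw, bytes):
        raw = raw.decode(charset or 'ascii')
    return raw, addr


encoded = format_addr('张三 <zhangsan@example.com>')
print(encoded)               # the name travels as an ASCII-only encoded word
print(decode_name(encoded))  # the original name and address come back
```

The header value that goes over the wire contains ASCII characters only, which is the reason for encoding the name in the first place.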

A word about the MIMEText class. The explanation below is copied straight from the documentation as shown in Dash:

class email.mime.text.MIMEText(_text, _subtype='plain', _charset=None)
Module: email.mime.text

A subclass of MIMENonMultipart, the MIMEText class is used to create MIME objects of major type text. _text is the string for the payload. _subtype is the minor type and defaults to plain. _charset is the character set of the text and is passed as an argument to the MIMENonMultipart constructor; it defaults to us-ascii if the string contains only ascii code points, and utf-8 otherwise. The _charset parameter accepts either a string or a Charset instance.

Unless the _charset argument is explicitly set to None, the MIMEText object created will have both a Content-Type header with a charset parameter, and a Content-Transfer-Encoding header. This means that a subsequent set_payload call will not result in an encoded payload, even if a charset is passed in the set_payload command. You can “reset” this behavior by deleting the Content-Transfer-Encoding header, after which a set_payload call will automatically encode the new payload (and add a new Content-Transfer-Encoding header).

Changed in version 3.5: _charset also accepts Charset instances.



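The parameters: _text is the mail body to send; _subtype defaults to plain text and can be changed to, for example, html; _charset defaults to None and can be set to, for example, utf-8. For portability, passing utf-8 explicitly is recommended.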
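
The effect of the three parameters can be checked without sending anything. A small sketch (the sample texts are made up for illustration) that builds one message with the defaults and one HTML message in UTF-8, then prints the headers MIMEText derived from the arguments:

```python
from email.mime.text import MIMEText

plain = MIMEText('hello world!')                   # defaults: plain subtype, charset chosen automatically
html = MIMEText('<h1>你好</h1>', 'html', 'utf-8')  # HTML body, explicit UTF-8

for part in (plain, html):
    print(part.get_content_type(), part.get_content_charset(),
          part['Content-Transfer-Encoding'])

# The original text comes back after undoing the transfer encoding.
print(html.get_payload(decode=True).decode('utf-8'))
```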


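The next question is attachments, which ordinary mail usually supports, and Python 3 supports them too. Adding an attachment changes the mail from a single content part into several parts, so the Message object can no longer be created with MIMEText; MIMEMultipart() is used instead. Each further piece of content is then added with the attach method. An example: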
from email import encoders
from email.header import Header
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import parseaddr, formataddr

import smtplib
# Encode the display name as UTF-8
def __format_addr(s):
    name, addr = parseaddr(s)
    return formataddr((Header(name, 'utf-8').encode(), addr))
# Basic settings
from_addr = input('From: ')
password = input('Password: ')
to_addr = input('To: ')
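# Use MIMEMultipart as the message container
msg = MIMEMultipart()
# Set the headers. The display name carries no colon: parseaddr reads "name:" as the start of an address group and drops the name.
msg['From'] = __format_addr('Your Daddy <%s>' % from_addr)
msg['To'] = __format_addr('Son <%s>' % to_addr)
msg['Subject'] = Header('A how are you from......', 'utf-8').encode()
# Add the text body to the mail
msg.attach(MIMEText('Send with file......', 'plain', 'utf-8'))
# Open an image file and add it to the mail
with open('/Users/zhouming/Desktop/tupian.png', 'rb') as f:
    mime = MIMEBase('image', 'png', filename='tupian.png')
    # Add the header information
    mime.add_header('Content-Disposition', 'attachment', filename='tupian.png')
    mime.add_header('Content-ID', '<0>')
    mime.add_header('X-Attachment-Id', '0')
    # Read the file content; the charset stays None because this is binary image data
    mime.set_payload(f.read())
    # Encode the image with Base64
    encoders.encode_base64(mime)
    msg.attach(mime)

# Sending works the same way as before
smtp_server = input('SMTP server: ')
server = smtplib.SMTP(smtp_server, 25)
server.login(from_addr, password)
server.sendmail(from_addr, [to_addr], msg.as_string())
server.quit()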
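
The attachment steps do not depend on a file on disk, so they can be tried with in-memory bytes. The sketch below assumes a hypothetical helper attach_bytes and fake image data; it only shows how the parts end up inside the multipart container and does not send anything.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def attach_bytes(msg, data, filename, maintype='application', subtype='octet-stream'):
    """Attach raw bytes as a Base64-encoded part (hypothetical helper)."""
    part = MIMEBase(maintype, subtype)
    part.set_payload(data)
    encoders.encode_base64(part)
    part.add_header('Content-Disposition', 'attachment', filename=filename)
    msg.attach(part)
    return part


msg = MIMEMultipart()
msg.attach(MIMEText('Send with file......', 'plain', 'utf-8'))
# Placeholder bytes stand in for the content of a real image file.
attach_bytes(msg, b'\x89PNG fake image bytes', 'tupian.png', 'image', 'png')

# walk() visits the container first and then each part in order.
for part in msg.walk():
    print(part.get_content_type(), part.get_filename())
```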

For the add_header and set_payload methods used above, the documentation says:

add_header(_name, _value, **_params)
Extended header setting. This method is similar to __setitem__() except that additional header parameters can be provided as keyword arguments. _name is the header field to add and _value is the primary value for the header.

For each item in the keyword argument dictionary _params, the key is taken as the parameter name, with underscores converted to dashes (since dashes are illegal in Python identifiers). Normally, the parameter will be added as key="value" unless the value is None, in which case only the key will be added. If the value contains non-ASCII characters, it can be specified as a three tuple in the format (CHARSET, LANGUAGE, VALUE), where CHARSET is a string naming the charset to be used to encode the value, LANGUAGE can usually be set to None or the empty string (see RFC 2231 for other possibilities), and VALUE is the string value containing non-ASCII code points. If a three tuple is not passed and the value contains non-ASCII characters, it is automatically encoded in RFC 2231 format using a CHARSET of utf-8 and a LANGUAGE of None.

Here’s an example:
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')

This will add a header that looks like
Content-Disposition: attachment; filename="bud.gif"

An example with non-ASCII characters:
msg.add_header('Content-Disposition', 'attachment',filename=('iso-8859-1', '', 'Fußballer.ppt'))

Which produces

Content-Disposition: attachment; filename*="iso-8859-1''Fu%DFballer.ppt"


set_payload(payload, charset=None)

Set the entire message object’s payload to payload. It is the client’s responsibility to ensure the payload invariants. Optional charset sets the message’s default character set; see set_charset() for details.



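Images can also be embedded in the mail body. Linking to an external image URL from an HTML mail is possible, but most mail providers automatically block externally linked images because they cannot tell whether the links point to malicious sites. To embed the image instead, first add the image as an attachment as shown above, then change the MIMEText subtype from plain to html and reference the attachment through its Content-ID with src="cid:0", for example: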
msg.attach(MIMEText('<html><body><h1>Hello</h1>' +
    '<p><img src="cid:0"></p>' +
    '</body></html>', 'html', 'utf-8'))

That is all a working example needs.

Some older mail clients cannot display HTML, so consider including a plain-text version of the body as well.
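
One common way to ship both versions is a multipart/alternative container. The sketch below assumes that approach and made-up body texts. Alternatives are listed in increasing order of preference, so the plain-text fallback is attached first and the HTML version last.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart('alternative')
# Fallback first, preferred version last.
msg.attach(MIMEText('Hello', 'plain', 'utf-8'))
msg.attach(MIMEText('<html><body><h1>Hello</h1></body></html>', 'html', 'utf-8'))

print(msg.get_content_type())
print([part.get_content_type() for part in msg.get_payload()])
```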

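Encrypted SMTP
On the standard port 25 the connection to the server is in plain text, which is a security risk. To send mail securely, encrypt the SMTP session: set up an SSL/TLS-secured connection first and then send the mail over it with SMTP as usual. The example below sends through Gmail's SMTP server (Gmail uses port 587). Calling the starttls() method right after connecting is all it takes to secure the connection: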
smtp_server = 'smtp.gmail.com'
smtp_port = 587
server = smtplib.SMTP(smtp_server, smtp_port)
server.starttls()
# The rest of the code is exactly the same as before:
server.set_debuglevel(1)
...
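
A slightly fuller sketch of the two usual ways to encrypt the session follows. The function names are hypothetical, and port 465 for an SSL-from-the-start connection is an assumption based on common provider setups, so check the provider's documentation. ssl.create_default_context() makes the client verify the server certificate.

```python
import smtplib
import ssl


def send_over_starttls(smtp_server, from_addr, password, to_addr, text, port=587):
    """Connect in plain text, upgrade with STARTTLS, then log in and send."""
    context = ssl.create_default_context()  # verifies the server certificate
    with smtplib.SMTP(smtp_server, port) as server:
        server.starttls(context=context)    # must happen before login
        server.login(from_addr, password)
        server.sendmail(from_addr, [to_addr], text)


def send_over_ssl(smtp_server, from_addr, password, to_addr, text, port=465):
    """Alternative: the connection is encrypted from the very first byte."""
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(smtp_server, port, context=context) as server:
        server.login(from_addr, password)
        server.sendmail(from_addr, [to_addr], text)
```

In both variants the password is only sent after the connection has been encrypted, which is the point of calling starttls() before login().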

III. Receiving mail with POP3

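Python 3 ships with the poplib module. Receiving is essentially the reverse of sending: one side packs a message, the other parses it.
There are really only two main steps:
1) Use poplib to download the raw text of the mail to the local machine
2) Use email to parse the raw text and rebuild the message object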
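
The second step can be tried without any server. In the sketch below a locally built message stands in for the downloaded mail (an assumption made for the demo): it is split into byte lines, which is the shape in which poplib hands back a retrieved mail, and then rebuilt with the email package.

```python
from email import message_from_bytes
from email.mime.text import MIMEText

# Stand-in for the list of byte lines returned by a POP3 download.
original = MIMEText('hello from POP3', 'plain', 'utf-8')
original['Subject'] = 'demo'
lines = original.as_bytes().splitlines()

# Step 2: join the lines and let the email package rebuild the object.
msg = message_from_bytes(b'\r\n'.join(lines))
print(msg['Subject'])
print(msg.get_payload(decode=True).decode(msg.get_content_charset()))
```

Parsing the bytes directly avoids having to guess an encoding for the whole raw message before the headers have been read.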

A complete example:
import poplib
from email.parser import Parser
from email.header import decode_header
from email.utils import parseaddr

email = input('Email: ')
password = input('Password: ')
pop3_server = input('POP3 server: ')  # for example pop.sina.com
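# Work out the charset of a message part. get_charset() only returns a value when a
# charset was set on the object explicitly, so for a parsed mail the charset parameter
# of the Content-Type header is used as a fallback.
def guess_charset(msg):
    charset = msg.get_charset()
    if charset is None:
        content_type = msg.get('Content-Type', '').lower()
        pos = content_type.find('charset=')
        if pos >= 0:
            # keep only the charset value: drop any following parameter and the quotes
            charset = content_type[pos + 8:].split(';')[0].strip().strip('"\'')
    return charset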
# Decode an encoded header value; only the first fragment returned by decode_header is used
def decode_str(s):
	value, charset = decode_header(s)[0]
	if charset:
		value = value.decode(charset)
	return value
# Print the message recursively, one indentation level per nesting level
def print_info(msg, indent=0):
    if indent == 0:
        for header in ['From', 'To', 'Subject']:
            value = msg.get(header, '')
            if value:
                if header == 'Subject':
                    value = decode_str(value)
                else:
                    hdr, addr = parseaddr(value)
                    name = decode_str(hdr)
                    value = '%s <%s>' % (name, addr)
            print('%s%s: %s' % ('  ' * indent, header, value))
    if msg.is_multipart():
        parts = msg.get_payload()
        for n, part in enumerate(parts):
            print('%spart %s' % ('  ' * indent, n))
            print('%s--------------------' % ('  ' * indent))
            print_info(part, indent + 1)
    else:
        content_type = msg.get_content_type()
        if content_type == 'text/plain' or content_type == 'text/html':
            content = msg.get_payload(decode=True)
            # get_payload(decode=True) returns bytes; fall back to UTF-8 when no charset is declared
            charset = guess_charset(msg) or 'utf-8'
            content = content.decode(charset, errors='replace')
            print('%sText: %s' % ('  ' * indent, content + '...'))
        else:
            print('%sAttachment: %s' % ('  ' * indent, content_type))

# Download the raw mail
server = poplib.POP3(pop3_server)
server.set_debuglevel(0)
print(server.getwelcome().decode('utf-8'))
server.user(email)
server.pass_(password)
# Print the number of messages and the space they occupy
print('Message: %s, Size: %s' % server.stat())
resp, mails, octets = server.list()
print(mails)

# Parse the newest mail (POP3 message numbers start at 1, so the last one is len(mails))
index = len(mails)
resp, lines, octets = server.retr(index)
msg_content = b'\r\n'.join(lines).decode('utf-8')
msg = Parser().parsestr(msg_content)
print_info(msg)


server.quit()
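
One limitation of decode_str above is that it only looks at element [0] of what decode_header returns, so a header that mixes encoded and plain fragments is cut short. A possible alternative, sketched here with a hypothetical helper name and a made-up subject, is to let make_header from email.header join all fragments:

```python
from email.header import Header, decode_header, make_header


def decode_full_header(value):
    """Decode every fragment of a header value (hypothetical helper)."""
    return str(make_header(decode_header(value)))


# A subject made of an encoded UTF-8 word followed by plain ASCII text.
raw_subject = Header('你好', 'utf-8').encode() + ' from POP3'
print(raw_subject)
print(decode_full_header(raw_subject))
```

Swapping this in for decode_str inside print_info should not need any other change, since both take the raw header string and return a str.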