I. Coroutines

1. Theory

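A coroutine, also called a pseudo-thread, is a lightweight thread that lives entirely in user space.

A coroutine has its own register context and stack. When the scheduler switches away from a coroutine, it saves that register context and stack elsewhere; when it switches back, it restores them. As a result, a coroutine retains the state of its previous invocation (that is, a particular combination of all its local state), and each re-entry resumes from that state. Put another way, it re-enters the logical flow at the point where it last left off.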
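The "resume where it left off" behavior can be seen with a plain Python generator, which is the simplest coroutine-like construct in the standard library. This is only a sketch: the name `counter` and its values are made up for illustration, and a generator keeps its local variables in a frame object rather than on a separate machine stack.

```python
def counter(start):
    """Generator-based coroutine: each resume continues after the last yield."""
    total = start
    while True:
        step = yield total          # pause here; the local variable total is preserved
        total += step if step is not None else 1

gen = counter(10)
first = next(gen)       # runs until the first yield
second = next(gen)      # resumes after the yield; total is still remembered
third = gen.send(5)     # resumes and passes a value into the coroutine
print(first, second, third)
```

Each call to `next()` or `send()` re-enters the function at the `yield` where it last stopped, with `total` intact.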

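Advantages:

No thread context-switch overhead
No overhead for locking and synchronization around atomic operations
- An atomic operation is one that the thread scheduler cannot interrupt: once it starts, it runs to completion with no context switch (to another thread) in between, so it needs no synchronization. An atomic operation may consist of one step or several, but the steps cannot be reordered, and the operation cannot be cut short so that only part of it runs. Being treated as an indivisible whole is the essence of atomicity.
Easy switching of control flow, which simplifies the programming model
High concurrency, high scalability, and low cost: a single CPU can support tens of thousands of coroutines without trouble, which makes them well suited to highly concurrent workloads.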

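Disadvantages:

Cannot use multiple cores: coroutines all run inside a single thread, so on their own they cannot use more than one core of a multi-core CPU. To run on multiple CPUs, coroutines must be combined with multiple processes. Most everyday applications do not need this, unless they are CPU-bound.
A blocking operation (such as blocking IO) blocks the entire program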

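A coroutine implementation satisfies these conditions:

Concurrency must be achieved within a single thread
Modifying shared data requires no locks
The user program itself stores the context stacks of its multiple control flows
When one coroutine hits an IO operation, control switches automatically to another coroutine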
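The conditions above can be sketched with a tiny round-robin scheduler built on generators. It is an illustration under stated assumptions, not how gevent works internally: `worker` and `run_all` are hypothetical names, and a bare `yield` stands in for "this task is about to wait on IO".

```python
from collections import deque

def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")   # shared data is modified without any lock
        yield                       # give up control (stands in for an IO wait)

def run_all(tasks):
    """Run every task in one thread, switching whenever a task yields."""
    ready = deque(tasks)            # the user program keeps the saved contexts
    while ready:
        task = ready.popleft()
        try:
            next(task)              # resume the task until its next yield
            ready.append(task)      # not finished: queue it again
        except StopIteration:
            pass                    # finished: drop it

log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Because only one task runs at a time and switches happen only at `yield`, appending to the shared `log` list needs no locking.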

2. Code example

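Gevent is a third-party library that makes it easy to write concurrent code in a synchronous or asynchronous style. The main pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. All Greenlets run inside the operating-system process of the main program, but they are scheduled cooperatively.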

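 import gevent

 def func1():
     print('\033[31;1mLi Chuang is working with Haitao...\033[0m')
     gevent.sleep(2)  # hand control to the other greenlets for 2 seconds
     print('\033[31;1mLi Chuang goes back to working with Haitao...\033[0m')

 def func2():
     print('\033[32;1mLi Chuang switched to working with Hailong...\033[0m')
     gevent.sleep(1)
     print('\033[32;1mLi Chuang is done with Haitao and back with Hailong...\033[0m')

 def func3():
     print("33333")
     gevent.sleep(1)
     print("4444")

 gevent.joinall([
     gevent.spawn(func1),
     gevent.spawn(func2),
     gevent.spawn(func3),
 ])

Output:

Li Chuang is working with Haitao...

Li Chuang switched to working with Hailong...

33333

Li Chuang is done with Haitao and back with Hailong...

4444

Li Chuang goes back to working with Haitao...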

3. Performance difference between synchronous and asynchronous execution

import gevent

def task(pid):

    """

    Some non-deterministic task

    """

    gevent.sleep(0.5)

    print('Task %s done' % pid)

def synchronous():

    for i in range(1,10):

        task(i)

def asynchronous():

    threads = [gevent.spawn(task, i) for i in range(10)]

    gevent.joinall(threads)

print('Synchronous:')

synchronous()

print('Asynchronous:')

asynchronous()
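To put numbers on the difference, the same contrast can be timed. The sketch below swaps gevent for the standard library's asyncio so that it runs without third-party packages; that substitution, the 0.05 s delay, and the names `sequential`, `concurrent`, and `timed` are all assumptions made for illustration.

```python
import asyncio
import time

async def task(pid, results):
    await asyncio.sleep(0.05)       # stands in for a short IO wait
    results.append(pid)

async def sequential(n):
    results = []
    for i in range(n):
        await task(i, results)      # each task finishes before the next starts
    return results

async def concurrent(n):
    results = []
    # start all tasks at once; their waits overlap
    await asyncio.gather(*(task(i, results) for i in range(n)))
    return results

def timed(coro_fn, n):
    start = time.perf_counter()
    results = asyncio.run(coro_fn(n))
    return results, time.perf_counter() - start

seq_results, seq_time = timed(sequential, 5)
con_results, con_time = timed(concurrent, 5)
print("sequential: %.2fs, concurrent: %.2fs" % (seq_time, con_time))
```

The sequential run takes roughly the sum of the waits, while the concurrent run takes roughly one wait, which is the same effect the gevent version shows.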

4. Switching tasks automatically on blocking IO (web crawler example)

 import gevent

 from gevent import monkey

 monkey.patch_all()

 from  urllib.request import urlopen

 import time

 def pa_web_page(url):

     print("GET url",url)

     req = urlopen(url)

     data =req.read()

     print(data)

     print('%d bytes received from %s.' % (len(data), url))

 t_start = time.time()

 pa_web_page("http://www.autohome.com.cn/beijing/")

 pa_web_page("http://www.xiaohuar.com/")

 print("time cost:",time.time()-t_start)

 t2_start = time.time()

 gevent.joinall([

         #gevent.spawn(pa_web_page, 'https://www.python.org/'),

         gevent.spawn(pa_web_page, 'http://www.autohome.com.cn/beijing/'),

         gevent.spawn(pa_web_page, 'http://www.xiaohuar.com/'),

         #gevent.spawn(pa_web_page, 'https://github.com/'),

 ])

 print("time cost t2:",time.time()-t2_start)

II. Event-Driven Programming and Asynchronous IO

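Event-driven programming is a programming paradigm in which the flow of execution is determined by external events. Its hallmark is an event loop that uses callbacks to trigger the appropriate handling when an external event occurs. The two other common paradigms are (single-threaded) synchronous programming and multithreaded programming.

In the single-threaded synchronous model, tasks run one after another. If a task blocks on I/O, all the other tasks must wait until it finishes before they can run in turn. This explicit ordering and serialized behavior is easy to reason about. But if the tasks do not depend on one another and still have to wait for each other, the program runs slower than it needs to.

In the multithreaded version, the tasks (3 of them in this example) each run in a separate thread. These threads are managed by the operating system; they may run in parallel on a multiprocessor system or be interleaved on a single-processor system. This lets other threads keep running while one thread is blocked on some resource. It is more efficient than a synchronous program that does the same job, but the programmer must write code to protect shared resources from simultaneous access by multiple threads. Multithreaded programs are harder to reason about, because they have to handle thread safety through synchronization mechanisms such as locks, reentrant functions, thread-local storage, or other means, and getting this wrong leads to subtle and painful bugs.

In the event-driven version, the 3 tasks are interleaved but still run under a single thread of control. When a task performs I/O or another expensive operation, it registers a callback with the event loop and continues when the I/O completes. The callback describes how to handle a given event. The event loop polls for events and, as they arrive, dispatches them to the callbacks waiting for them. This lets the program make as much progress as possible without extra threads. Event-driven programs are easier to reason about than multithreaded ones, because the programmer does not have to worry about thread safety.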

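The event-driven model is usually a good choice when:

The program has many tasks, and...
The tasks are highly independent (so they do not need to communicate with or wait for one another), and...
Some tasks block while waiting for events to arrive.

It is also a good choice when the application needs to share mutable data between tasks, because no synchronization is required.

Network applications usually have exactly these characteristics, which makes them a good fit for the event-driven programming model.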
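The event loop and callback mechanism described in this section can be sketched in a few lines. This is a toy model, not a real I/O loop: the `EventLoop` class, its `on`/`emit`/`run` methods, and the event names are hypothetical, and "I/O completion" is simulated by emitting a second event.

```python
from collections import deque

class EventLoop:
    def __init__(self):
        self.handlers = {}          # event name -> list of callbacks
        self.pending = deque()      # events waiting to be dispatched

    def on(self, name, callback):
        """Register a callback for an event name."""
        self.handlers.setdefault(name, []).append(callback)

    def emit(self, name, payload=None):
        """Queue an event; it is handled later by run()."""
        self.pending.append((name, payload))

    def run(self):
        """Dispatch queued events to their callbacks until none remain."""
        while self.pending:
            name, payload = self.pending.popleft()
            for callback in self.handlers.get(name, []):
                callback(payload)

loop = EventLoop()
seen = []

def on_request(payload):
    seen.append("request " + payload)
    loop.emit("io_done", payload.upper())   # pretend the I/O finished

loop.on("request", on_request)
loop.on("io_done", lambda payload: seen.append("done " + payload))
loop.emit("request", "a")
loop.emit("request", "b")
loop.run()
print(seen)
```

Both requests are accepted before either "I/O" completes, all in one thread, and the shared `seen` list is updated without locks.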

1. A select-based concurrent socket server

 #_*_coding:utf-8_*_

 __author__ = 'Alex Li'

 import select

 import socket

 import sys

 import queue

 server = socket.socket()

 server.setblocking(0)

 server_addr = ('localhost',10000)

 print('starting up on %s port %s' % server_addr)

 server.bind(server_addr)

 server.listen(5)

 inputs = [server, ] # the server socket is itself an fd, so it must be watched too

 outputs = []

 message_queues = {}

 while True:

     print("waiting for next event...")

     readable, writeable, exeptional = select.select(inputs,outputs,inputs) # blocks here until at least one fd is ready

     for s in readable: # each s is a socket
         if s is server: # the listening socket is ready, which means a new connection has arrived
             # accept the new connection
             conn, client_addr = s.accept()
             print("new connection from",client_addr)
             conn.setblocking(0)
             # To avoid blocking the whole program, do not read from the client here.
             # Add the connection to inputs so that select watches it on the next loop; once the
             # client sends data, select returns this connection in readable and it is read below.
             inputs.append(conn)
             message_queues[conn] = queue.Queue() # data from the client is queued here and sent back later
         else: # s is not the server, so it must be an established client connection
             # data has arrived from the client; receive it here
             data = s.recv(1024)
             if data:
                 print("received data from [%s]:" % s.getpeername()[0], data)
                 message_queues[s].put(data) # queue the data; it is sent back to the client shortly
                 if s not in outputs:
                     outputs.append(s) # do not reply right away, so other connections are not held up
             else: # empty data means the client has disconnected
                 print("client disconnected",s)
                 if s in outputs:
                     outputs.remove(s) # clean up the closed connection
                 inputs.remove(s)
                 del message_queues[s]
                 s.close() # release the socket

     for s in writeable:

         try :

             next_msg = message_queues[s].get_nowait()

         except queue.Empty:

             print("client [%s]" %s.getpeername()[0], "queue is empty..")

             outputs.remove(s)

         else:

             print("sending msg to [%s]"%s.getpeername()[0], next_msg)

             s.send(next_msg.upper())

     for s in exeptional:

         print("handling exception for ",s.getpeername())

         inputs.remove(s)

         if s in outputs:

             outputs.remove(s)

         s.close()

         del message_queues[s]

select socket client:

 import socket

 import sys

 messages = [ b'This is the message. ',

              b'It will be sent ',

              b'in parts.',

              ]

 server_address = ('localhost', 10000)

 # Create a TCP/IP socket

 socks = [ socket.socket(socket.AF_INET, socket.SOCK_STREAM),

           socket.socket(socket.AF_INET, socket.SOCK_STREAM),

           ]

 # Connect the socket to the port where the server is listening

 print('connecting to %s port %s' % server_address)

 for s in socks:

     s.connect(server_address)

 for message in messages:

     # Send messages on both sockets

     for s in socks:

         print('%s: sending "%s"' % (s.getsockname(), message) )

         s.send(message)

     # Read responses on both sockets

     for s in socks:

         data = s.recv(1024)

         print( '%s: received "%s"' % (s.getsockname(), data) )

         if not data:

             print('closing socket', s.getsockname(), file=sys.stderr)


2. The selectors module

 import selectors

 import socket

 def accept(sock, mask):

     conn, addr = sock.accept()  # Should be ready

     print('accepted', conn, 'from', addr)

     conn.setblocking(False)  # non-blocking (passing 0 works as well)

     sel.register(conn, selectors.EVENT_READ, read)

 def read(conn, mask):

     try:

         data = conn.recv(1000)  # Should be ready

         if data:

             print('echoing', repr(data), 'to', conn)

             conn.send(data)  # Hope it won't block

         else:

             print('closing', conn)

             sel.unregister(conn)

             conn.close()

     except ConnectionResetError as e:
         sel.unregister(conn)
         conn.close()  # also release the socket after a reset

 sock = socket.socket()

 sock.bind(('localhost', 10000))

 sock.listen(100)

 sock.setblocking(False)

 sel = selectors.DefaultSelector()  # create a selector instance
 sel.register(sock, selectors.EVENT_READ, accept)  # watch sock for read events; accept is called when a connection request arrives
 # comparable to select.select(inputs, outputs, ...)

 while True:

     events = sel.select()  # blocks until at least one event is ready; returns a list
     for key, mask in events:  # loop over the ready events
         callback = key.data  # the function registered for this fd: accept for the listening socket, read for client connections
         print("--->",key,mask)
         callback(key.fileobj, mask)  # fileobj is the socket that became ready
         # e.g. fileobj=<socket.socket fd=220, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 10000)>
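The register / select / callback cycle used by the server above can be tried without opening a network port. The sketch below is an assumption-laden miniature: it uses `socket.socketpair()` in place of a listening socket and a real client, and the names `demo_sel`, `on_readable`, and `received` are invented for the example.

```python
import selectors
import socket

demo_sel = selectors.DefaultSelector()
left, right = socket.socketpair()       # two connected sockets in one process
left.setblocking(False)
right.setblocking(False)
received = []

def on_readable(conn, mask):
    # callback stored as key.data, just like accept/read in the server
    received.append(conn.recv(1024))

demo_sel.register(right, selectors.EVENT_READ, on_readable)
left.send(b"ping")                      # makes `right` readable

for key, mask in demo_sel.select(timeout=1):
    key.data(key.fileobj, mask)         # dispatch to the registered callback

demo_sel.unregister(right)
demo_sel.close()
left.close()
right.close()
print(received)
```

The selector hands back the object passed as the third argument to `register()` in `key.data`, which is how the server decides between `accept` and `read`.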

III. RabbitMQ Queues

1. Installation

Install the Python RabbitMQ module

pip install pika

or

easy_install pika

or

From source

https://pypi.python.org/pypi/pika

2. The simplest message queue

Sender

 #!/usr/bin/env python

 import pika

 connection = pika.BlockingConnection(pika.ConnectionParameters(

                'localhost'))

 channel = connection.channel()

 # declare the queue
 channel.queue_declare(queue='hello')

 # In RabbitMQ a message can never be sent directly to the queue, it always needs to go through an exchange.

 channel.basic_publish(exchange='',

                       routing_key='hello',

                       body='Hello World!')

 print(" [x] Sent 'Hello World!'")

 connection.close()

Receiver

 #_*_coding:utf-8_*_

 __author__ = 'Alex Li'

 import pika

 connection = pika.BlockingConnection(pika.ConnectionParameters(

                'localhost'))

 channel = connection.channel()

 #You may ask why we declare the queue again ‒ we have already declared it in our previous code.

 # We could avoid that if we were sure that the queue already exists. For example if send.py program

 #was run before. But we're not yet sure which program to run first. In such cases it's a good

 # practice to repeat declaring the queue in both programs.

 channel.queue_declare(queue='hello')

 def callback(ch, method, properties, body):

     print(" [x] Received %r" % body)

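 # Note: this is the pika < 1.0 call signature; pika 1.x expects
 # basic_consume(queue='hello', on_message_callback=callback, auto_ack=True).
 channel.basic_consume(callback,
                       queue='hello',
                       no_ack=True)  # with no_ack=True a message counts as acknowledged as soon as it is delivered, so it is lost if the consumer crashes

 print(' [*] Waiting for messages. To exit press CTRL+C')
 channel.start_consuming()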

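3. Work Queues mode

In this mode RabbitMQ by default hands messages to the consumers one after another in turn (round-robin), much like load balancing.

Producer code (send):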

 #!/usr/bin/env python

 # -*- coding:utf-8 -*-

 # Author:Liumj

 import pika

 import time

 import sys

 connection = pika.BlockingConnection(pika.ConnectionParameters('127.0.0.1')) # open the socket connection
 channel = connection.channel() # open a channel
 channel.queue_declare(queue='hello') # declare a queue named hello
 message = ' '.join(sys.argv[1:]) or "Hello World! %s" % time.time()
 channel.basic_publish(exchange = '', # basic_publish sends the message
                       routing_key='hello', # queue name
                       body = message, # message body
                       properties=pika.BasicProperties(
                           delivery_mode=2 # mark the message as persistent
                       )
 )

 connection.close()

Consumer code (recv):

 #!/usr/bin/env python

 # -*- coding:utf-8 -*-

 # Author:Liumj

 import pika,time

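 connection = pika.BlockingConnection(pika.ConnectionParameters('127.0.0.1')) # open the connection
 channel = connection.channel() # open a channel
 channel.queue_declare(queue='hello') # can be omitted if the hello queue is known to exist

 def callback(ch,method,properties,body): # ch is the channel object, properties carries the message properties
     print("[x] Received %r" % body)
     #time.sleep(20)
     print("[x] Done")
     print("method.delivery_tag",method.delivery_tag)
     ch.basic_ack(delivery_tag=method.delivery_tag) # acknowledge manually once the work is done

 channel.basic_consume(callback,  # receive messages from hello and call callback for each one
                       queue='hello')  # no_ack stays at its default (False) because callback acks explicitly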

 print('[*] waiting for message. To exit press CTRL+C')

 channel.start_consuming()

Messages are handed out to the consumers automatically, one after another.

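4. Message persistence and fair dispatch

To guard against failures while messages are in flight, messages need to be made persistent so that they survive a restart of the service. In the code below this takes two settings: the queue is declared with durable=True and each message is published with delivery_mode=2.

Fair dispatch: machines differ in capacity and tasks differ in cost. Setting prefetch_count=1 tells RabbitMQ not to send a consumer a new message until it has finished processing (and acknowledged) the current one.

The complete code is as follows:

Producer (send):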

 #!/usr/bin/env python

 # -*- coding:utf-8 -*-

 # Author:Liumj


 import pika

 import sys

 connection = pika.BlockingConnection(pika.ConnectionParameters(
     host='127.0.0.1'))   # open the connection
 channel = connection.channel() # open a channel
 channel.queue_declare(queue='task_queue', durable=True) # declare the queue and make it durable
 message = ' '.join(sys.argv[1:]) or "Hello World!" # message body

 channel.basic_publish(exchange='',

                       routing_key='task_queue',

                       body=message,

                       properties=pika.BasicProperties(

                           delivery_mode=2,  # make message persistent

                       ))

 print(" [x] Sent %r" % message)

 connection.close()

Consumer (recv):

 #!/usr/bin/env python

 # -*- coding:utf-8 -*-

 # Author:Liumj

 import pika

 import time

 connection = pika.BlockingConnection(pika.ConnectionParameters(

     host='127.0.0.1'))

 channel = connection.channel()

 channel.queue_declare(queue='task_queue', durable=True)

 print(' [*] Waiting for messages. To exit press CTRL+C')

 def callback(ch, method, properties, body):

     print(" [x] Received %r" % body)

     time.sleep(body.count(b'.'))

     print(" [x] Done")

     ch.basic_ack(delivery_tag=method.delivery_tag)

 channel.basic_qos(prefetch_count=1) # fair dispatch: at most one unacknowledged message per consumer
 channel.basic_consume(callback,
                       queue='task_queue')  # receive messages from task_queue and call callback for each one

 channel.start_consuming()
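Why prefetch_count=1 helps can be seen in a small simulation that needs no broker. It is a simplified model, not RabbitMQ's dispatcher: task durations and the worker count are invented, round-robin assigns task i to worker i modulo the number of workers, and "fair" assigns each task to whichever worker becomes free first.

```python
import heapq

def round_robin_makespan(durations, workers):
    """Total time when task i always goes to worker i % workers."""
    busy = [0] * workers
    for i, duration in enumerate(durations):
        busy[i % workers] += duration
    return max(busy)

def fair_makespan(durations, workers):
    """Total time when each task goes to the worker that is free earliest."""
    free_at = [(0, w) for w in range(workers)]   # (time the worker is free, worker id)
    heapq.heapify(free_at)
    for duration in durations:
        t, w = heapq.heappop(free_at)            # earliest-free worker takes the task
        heapq.heappush(free_at, (t + duration, w))
    return max(t for t, _ in free_at)

durations = [5, 1, 5, 1, 5, 1]                   # alternating heavy and light tasks
rr = round_robin_makespan(durations, 2)
fair = fair_makespan(durations, 2)
print("round-robin:", rr, "fair:", fair)
```

With alternating heavy and light tasks, round-robin piles every heavy task onto the same worker, while the fair policy spreads them out.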

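5. Publish/Subscribe

This works like a broadcast: every queue that meets the exchange's condition receives the message. The exchange type determines the condition:

fanout: every queue bound to the exchange receives the message

direct: the routing key together with the exchange determines which queues receive the message. Each queue is bound with a keyword; the sender publishes the message to the exchange with a keyword, and the exchange delivers it to the queues whose binding keyword matches exactly.

topic: every queue whose binding key matches the message's routing key (as a pattern) receives the message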
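The topic rule is the least obvious of the three, so here is a small model of it. In a topic exchange, keys are dot-separated words, `*` in a binding key stands for exactly one word and `#` for zero or more words. The function `topic_matches` is a hypothetical helper written for this note, not part of pika or RabbitMQ.

```python
def topic_matches(binding_key, routing_key):
    """Return True if routing_key matches binding_key under topic-exchange rules."""
    def match(pattern, words):
        if not pattern:
            return not words                    # both exhausted: match
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' may absorb zero or more words
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        if head == "*" or head == words[0]:     # '*' absorbs exactly one word
            return match(rest, words[1:])
        return False
    return match(binding_key.split("."), routing_key.split("."))

print(topic_matches("*.info", "anonymous.info"))
print(topic_matches("kern.#", "kern.critical.disk"))
```

By comparison, a direct exchange only needs `binding_key == routing_key`, and a fanout exchange ignores the key entirely.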

publisher_fanout:

 import pika

 import sys

 #credentials = pika.PlainCredentials('alex', 'alex3714')

 connection = pika.BlockingConnection(pika.ConnectionParameters(

     host='127.0.0.1'))

 channel = connection.channel()

 channel.exchange_declare(exchange='logs', type='fanout')

 message = ' '.join(sys.argv[1:]) or "info: Hello World!"

 channel.basic_publish(exchange='logs',

                       routing_key='',

                       body=message)

 print(" [x] Sent %r" % message)

 connection.close()

subscriber_fanout:

 import pika

 #credentials = pika.PlainCredentials('alex', 'alex3714')

 connection = pika.BlockingConnection(pika.ConnectionParameters(

     host='127.0.0.1'))

 channel = connection.channel()

 channel.exchange_declare(exchange='logs',type='fanout')

 result = channel.queue_declare(exclusive=True)  # no queue name given, so RabbitMQ assigns a random one; exclusive=True deletes the queue once the consumer using it disconnects

 queue_name = result.method.queue

 channel.queue_bind(exchange='logs',queue=queue_name)

 print(' [*] Waiting for logs. To exit press CTRL+C')

 def callback(ch, method, properties, body):

     print(" [x] %r" % body)

 channel.basic_consume(callback,
                       queue=queue_name,
                       no_ack=True)  # the callback never acks, so consume without acknowledgements

 channel.start_consuming()

publisher_direct:

 import pika

 import sys

 connection = pika.BlockingConnection(pika.ConnectionParameters(

         host='127.0.0.1'))

 channel = connection.channel()

 channel.exchange_declare(exchange='direct_logs',

                          type='direct')

 severity = sys.argv[1] if len(sys.argv) > 1 else 'info'

 message = ' '.join(sys.argv[2:]) or 'Hello World!'

 channel.basic_publish(exchange='direct_logs',

                       routing_key=severity,

                       body=message)

 print(" [x] Sent %r:%r" % (severity, message))

 connection.close()

subscriber_direct:

 import pika

 import sys

 connection = pika.BlockingConnection(pika.ConnectionParameters(

         host='127.0.0.1'))

 channel = connection.channel()

 channel.exchange_declare(exchange='direct_logs',

                          type='direct')

 result = channel.queue_declare(exclusive=True)

 queue_name = result.method.queue

 severities = sys.argv[1:]

 if not severities:

     sys.stderr.write("Usage: %s [info] [warning] [error]\n" % sys.argv[0])

     sys.exit(1)

 for severity in severities:

     channel.queue_bind(exchange='direct_logs',

                        queue=queue_name,

                        routing_key=severity)

 print(' [*] Waiting for logs. To exit press CTRL+C')

 def callback(ch, method, properties, body):

     print(" [x] %r:%r" % (method.routing_key, body))

 channel.basic_consume(callback,

                       queue=queue_name,

                       no_ack=True)

 channel.start_consuming()

publisher_topic:

 import pika

 import sys

 connection = pika.BlockingConnection(pika.ConnectionParameters(

         host='127.0.0.1'))

 channel = connection.channel()

 channel.exchange_declare(exchange='topic_logs',

                          type='topic')

 routing_key = sys.argv[1] if len(sys.argv) > 1 else 'anonymous.info'

 message = ' '.join(sys.argv[2:]) or 'Hello World!'

 channel.basic_publish(exchange='topic_logs',

                       routing_key=routing_key,

                       body=message)

 print(" [x] Sent %r:%r" % (routing_key, message))

 connection.close()

subscriber_topic:

 import pika

 import sys

 connection = pika.BlockingConnection(pika.ConnectionParameters(

         host='127.0.0.1'))

 channel = connection.channel()

 channel.exchange_declare(exchange='topic_logs',

                          type='topic')

 result = channel.queue_declare(exclusive=True)

 queue_name = result.method.queue

 binding_keys = sys.argv[1:]

 if not binding_keys:

     sys.stderr.write("Usage: %s [binding_key]...\n" % sys.argv[0])

     sys.exit(1)

 for binding_key in binding_keys:

     channel.queue_bind(exchange='topic_logs',

                        queue=queue_name,

                        routing_key=binding_key)

 print(' [*] Waiting for logs. To exit press CTRL+C')

 def callback(ch, method, properties, body):

     print(" [x] %r:%r" % (method.routing_key, body))

 channel.basic_consume(callback,

                       queue=queue_name,

                       no_ack=True)

 channel.start_consuming()

6.RPC

RabbitMQ_RPC_send:

 import pika

 import uuid

 class SSHRpcClient(object):

     def __init__(self):

 #        credentials = pika.PlainCredentials('alex', 'alex3714')

         self.connection = pika.BlockingConnection(pika.ConnectionParameters(

                             host='127.0.0.1'))

         self.channel = self.connection.channel()

         result = self.channel.queue_declare(exclusive=True) # the server must send its result back to this queue
         self.callback_queue = result.method.queue
         self.channel.basic_consume(self.on_response,queue=self.callback_queue) # consume results from this queue

     def on_response(self, ch, method, props, body):

         if self.corr_id == props.correlation_id: # the correlation id identifies which request this reply belongs to

             self.response = body

             print(body)

     def call(self, n):

         self.response = None

         self.corr_id = str(uuid.uuid4()) # unique identifier for this request

         self.channel.basic_publish(exchange='',

                                    routing_key='rpc_queue3',

                                    properties=pika.BasicProperties(

                                        reply_to=self.callback_queue,

                                        correlation_id=self.corr_id,

                                    ),

                                    body=str(n))

         print("start waiting for cmd result ")

         #self.channel.start_consuming()

         count = 0

         while self.response is None: # loop until the command's result arrives
             print("loop ",count)
             count +=1
             self.connection.process_data_events() # check for new events without blocking
             # if there is no event, nothing happens; if there is one, on_response is invoked

         return self.response

 ssh_rpc = SSHRpcClient()

 print(" [x] sending cmd")

 response = ssh_rpc.call("ipconfig")

 print(" [.] Got result ")

 print(response.decode("gbk"))

RabbitMQ_RPC_recv:

 import pika

 import time

 import subprocess

 #credentials = pika.PlainCredentials('alex', 'alex3714')

 connection = pika.BlockingConnection(pika.ConnectionParameters(

     host='127.0.0.1'))

 channel = connection.channel()

 channel.queue_declare(queue='rpc_queue3')

 def SSHRPCServer(cmd):
     # run the received command in a shell and return its output
     print("recv cmd:",cmd)
     cmd_obj = subprocess.Popen(cmd.decode(),shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
     stdout, stderr = cmd_obj.communicate()  # read both pipes to avoid a deadlock on large output
     return stdout or stderr

 def on_request(ch, method, props, body):
     print(" [.] cmd(%s)" % body)
     response = SSHRPCServer(body)
     ch.basic_publish(exchange='',
                      routing_key=props.reply_to,
                      properties=pika.BasicProperties(correlation_id= \
                                                          props.correlation_id),
                      body=response)
     ch.basic_ack(delivery_tag=method.delivery_tag)  # acknowledge the request once the reply is sent

 channel.basic_consume(on_request, queue='rpc_queue3')

 print(" [x] Awaiting RPC requests")

 channel.start_consuming()
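The role of reply_to and correlation_id in the RPC pattern can be shown without a broker. The sketch below replaces RabbitMQ queues with in-memory `queue.Queue` objects, which is purely an assumption for illustration; the names `client_send`, `server_handle_one`, `client_wait`, and the queue name `client-1` are invented, and the "command" is simply upper-cased instead of executed.

```python
import queue
import uuid

request_queue = queue.Queue()                   # stands in for rpc_queue3
reply_queues = {"client-1": queue.Queue()}      # reply_to name -> callback queue

def client_send(command):
    """Publish a request that carries reply_to and a fresh correlation id."""
    corr_id = str(uuid.uuid4())
    request_queue.put({"body": command,
                       "reply_to": "client-1",
                       "correlation_id": corr_id})
    return corr_id

def server_handle_one():
    """Take one request and publish the reply to the queue named in reply_to."""
    msg = request_queue.get_nowait()
    reply_queues[msg["reply_to"]].put({"body": msg["body"].upper(),
                                       "correlation_id": msg["correlation_id"]})

def client_wait(corr_id):
    """Read replies until one carries the expected correlation id."""
    while True:
        reply = reply_queues["client-1"].get_nowait()
        if reply["correlation_id"] == corr_id:
            return reply["body"]
        # otherwise it is a stale reply to some other request: ignore it

# a leftover reply from an earlier request is already sitting in the queue
reply_queues["client-1"].put({"body": "old", "correlation_id": "not-ours"})
cid = client_send("ipconfig")
server_handle_one()
result = client_wait(cid)
print(result)
```

The stale reply is skipped because its correlation id does not match, which is the same check `on_response` performs in the client above.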


Python Network Programming Study Notes, Day 11