基础概念介绍

进程

通常表示计算机中正在运行的程序实例。

关键点：

进程包含代码段、数据段和堆栈
多个进程可以同时运行，相互隔离
每个进程独占计算机CPU和内存等资源，是获取资源的基本单位

线程

通常表示内核级的线程。计算机中最小可执行单元

关键点：

计算机中最小可调度、可执行单元
不独立拥有系统资源，共享进程内资源
可以充分利用多核进行并发、并行执行
创建、调度、销毁都需要CPU参与，实现用户态和内核态切换

协程

通常也被称为用户级线程，只存在于用户空间

Golang高并发的“灵魂”——GMP模型 关键点：

与线程是N：1（多对一）的关系
创建、调度、销毁都是在用户空间完成的，无需CPU参与，更轻量
都映射到同一个线程，无法实现并行，如果一个协程阻塞，则整组协程全部阻塞

Goroutine

经过Golang优化后的“协程”

Golang高并发的“灵魂”——GMP模型 关键点：

与线程是M：N（多对多）的关系
创建、调度、销毁都是在用户空间完成，对内核透明，更轻量
映射至多个线程，可实现并行
经过调度器（processor）的调度，能实现动态绑定和灵活调度
栈空间可动态扩缩容

GM模型

在介绍GMP模型之前，先看看go1.1版本之前采用的调度模型——GM模型

GM = Goroutine + Machine

G指的就是Goroutine协程，M指的就是Machine内核级线程 GM调度模型见下图 Golang高并发的“灵魂”——GMP模型

从上图中可以看到，每个线程M在获取Goroutine时，都需要从全局队列中获取，这时候就必须要先获取锁，然后才能获取Goroutine，而如果被M获取的Goroutine在运行过程中产生了新的Goroutine，则新的Goroutine也会被放回到全局队列中。通过描述，我们很容易发现GM模型有这么几个缺点：

调度：获取和返回Goroutine都需要先获取全局队列的锁，这就会在获取锁这里形成激烈地竞争，严重影响性能
资源利用率：当Goroutine1结合M1产生新的Goroutine2时，由于M1要继续运行Goroutine1，那么就需要把Goroutine2交还到全局队列，然后由其他的M获取后运行，而Goroutine1和Goroutine2是有相关性的，上下文都保存在M1中，这样会导致上下文切换或者保存原有状态是产生不必要的系统开销。
缓存资源：每个M在获取到Goroutine时，都需要缓存Goroutine的状态、加载上下文，这就要求M必须要配有一定的存储。而这就会产生严重的问题，我们知道M获取的Goroutine如果产生阻塞，M就会被悬挂起来，当M不够用时，系统会产生新的M，而每产生一个新的M就要对应地分配存储资源，这就增加了存储系统的开销，极端一点可能会导致存储资源消耗殆尽（一般系统中M存在的数量上限是10000），这显然是不合理的。

鉴于GM模型的上述几个缺点，在go1.1之后的版本，引入了P（processor）形成GMP模型来解决GM模型的几个缺点问题

GMP = Goroutine + Machine + Processor

Golang高并发的“灵魂”——GMP模型

解释一下上图的调度过程

当我们使用go func()创建一个G时，首先会被加入到一个P的本地队列中，以供P进行调度。
当P获取到一个G之后，会从空闲M队列中获取一个M（如果没有则会新建一个M），此时P同时获取到G和M，G只能跟M绑定运行，这样就可以正式运行G任务了。
在运行G的过程中，如果产生了新的G，则会将新产生的G放入到P的本地队列中，如果P的本地队列已满，则会放入全局队列中。
当一个G运行结束后（未发生系统调用），P会从本地队列中获取下一个G来运行（具体选取哪一个G则会根据不同的调度策略选取不同的G），如此往复，直到P的本地队列为空。此时P会查找全局队列，如果全局队列中有多余可运行的G，则会获取G来运行；而如果全局队列中没有多余的G可供获取，P则查找其他P的本地队列，如果有多余的G，则会偷取一半的G，放入自己的本地队列*后续使用
而如果跟P绑定的M在运行G的过程中发生了系统调用（如I/O），P会跟绑定的GM解绑，解绑后GM会去处理系统调用，而P会重新获取一个G和一个M运行。当原来的GM处理完成后，会重新获取原来的P，如果原来的P已被占用，则会从空闲P队列中获取一个P，如果还没有，则G会被添加到全局队列中，M会被添加到空闲M队列中。

func findrunnable() (gp *g, inheritTime bool) {
	_g_ := getg()
 
	// The conditions here and in handoffp must agree: if
	// findrunnable would return a G to run, handoffp must start
	// an M.
 
top:
	_p_ := _g_.m.p.ptr()
	if sched.gcwaiting != 0 {
		gcstopm()
		goto top
	}
	if _p_.runSafePointFn != 0 {
		runSafePointFn()
	}
 
	now, pollUntil, _ := checkTimers(_p_, 0)
 
	if fingwait && fingwake {
		if gp := wakefing(); gp != nil {
			ready(gp, 0, true)
		}
	}
	if *cgo_yield != nil {
		asmcgocall(*cgo_yield, nil)
	}
 
	// local runq                  先查询本地队列
	if gp, inheritTime := runqget(_p_); gp != nil {
		return gp, inheritTime
	}
 
	// global runq                 再查询全局队列
	if sched.runqsize != 0 {
		lock(&sched.lock)
		gp := globrunqget(_p_, 0)
		unlock(&sched.lock)
		if gp != nil {
			return gp, false
		}
	}
 
	// Poll network.
	// This netpoll is only an optimization before we resort to stealing.
	// We can safely skip it if there are no waiters or a thread is blocked
	// in netpoll already. If there is any kind of logical race with that
	// blocked thread (e.g. it has already returned from netpoll, but does
	// not set lastpoll yet), this thread will do blocking netpoll below
	// anyway.
	if netpollinited() && atomic.Load(&netpollWaiters) > 0 && atomic.Load64(&sched.lastpoll) != 0 {
		if list := netpoll(0); !list.empty() { // non-blocking
			gp := list.pop()
			injectglist(&list)
			casgstatus(gp, _Gwaiting, _Grunnable)
			if trace.enabled {
				traceGoUnpark(gp, 0)
			}
			return gp, false
		}
	}
 
	// Steal work from other P's.         最后从其他P里偷取
	procs := uint32(gomaxprocs)
	ranTimer := false
	// If number of spinning M's >= number of busy P's, block.
	// This is necessary to prevent excessive CPU consumption
	// when GOMAXPROCS>>1 but the program parallelism is low.
	if !_g_.m.spinning && 2*atomic.Load(&sched.nmspinning) >= procs-atomic.Load(&sched.npidle) {
		goto stop
	}
	if !_g_.m.spinning {
		_g_.m.spinning = true
		atomic.Xadd(&sched.nmspinning, 1)
	}
	for i := 0; i < 4; i++ {
		for enum := stealOrder.start(fastrand()); !enum.done(); enum.next() {
			if sched.gcwaiting != 0 {
				goto top
			}
			stealRunNextG := i > 2 // first look for ready queues with more than 1 g
			p2 := allp[enum.position()]
			if _p_ == p2 {
				continue
			}
			if gp := runqsteal(_p_, p2, stealRunNextG); gp != nil {
				return gp, false
			}
 
			// Consider stealing timers from p2.
			// This call to checkTimers is the only place where
			// we hold a lock on a different P's timers.
			// Lock contention can be a problem here, so
			// initially avoid grabbing the lock if p2 is running
			// and is not marked for preemption. If p2 is running
			// and not being preempted we assume it will handle its
			// own timers.
			// If we're still looking for work after checking all
			// the P's, then go ahead and steal from an active P.
			if i > 2 || (i > 1 && shouldStealTimers(p2)) {
				tnow, w, ran := checkTimers(p2, now)
				now = tnow
				if w != 0 && (pollUntil == 0 || w < pollUntil) {
					pollUntil = w
				}
				if ran {
					// Running the timers may have
					// made an arbitrary number of G's
					// ready and added them to this P's
					// local run queue. That invalidates
					// the assumption of runqsteal
					// that is always has room to add
					// stolen G's. So check now if there
					// is a local G to run.
					if gp, inheritTime := runqget(_p_); gp != nil {
						return gp, inheritTime
					}
					ranTimer = true
				}
			}
		}
	}
	if ranTimer {
		// Running a timer may have made some goroutine ready.
		goto top
	}

P的数量和M的数量的确定

P的数量：由启动时环境变量GOMAXPROCS来决定的，一般设置GOMAXPROCS也不会超过系统的核数 M的数量：go程序启动时，会设置M的最大数量，默认10000。但是内核很难创建出如此多的线程，因此默认情况下M的最大数量取决于内核

感谢你看到了最后，欢迎关注 晴天码字 晴天会持续努力，输出更多有趣且实用的主题 Golang高并发的“灵魂”——GMP模型

秒客网