Exploring the Go Source: From the Program Entry Point to the GMP Model

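In most programming languages the main function is the entry point of the user program, and Go is no exception. But is main.main the entry point of the whole program? It is not: a Go program depends on the runtime, which must be initialized during startup before the user's main function runs. So where is main.main called from? This article starts at the real entry point of a Go program and works its way to Go's GMP model.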

Note: this article uses Go SDK version go1.20.

1 The entry point of a Go program

1 First, write a simple Go program and compile it. Linux is used here:

package main

import "fmt"

func main() {
	fmt.Println("hello,world")
}

Compile it. The flags -N -l disable compiler optimizations and inlining:

go build -gcflags "-N -l" main.go

2 Next, debug the program with gdb:

First, load the script file that adds Go debugging support by running source /usr/local/go/src/runtime/runtime-gdb.py inside gdb:

➜  RemoteWorking git:(master) ✗ gdb
(gdb) source /usr/local/go/src/runtime/runtime-gdb.py

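3 Debug the program:

Load the binary with gdb main, then run info files to find the entry point. Here the entry point is at address 0x45c020, which is the function _rt0_amd64_linux in src/runtime/rt0_linux_amd64.s: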

TEXT _rt0_amd64_linux(SB),NOSPLIT,$-8
	JMP	_rt0_amd64(SB)

It jumps to _rt0_amd64, which is defined in asm_amd64.s:

TEXT _rt0_amd64(SB),NOSPLIT,$-8
	MOVQ	0(SP), DI	// argc
	LEAQ	8(SP), SI	// argv
	JMP	runtime·rt0_go(SB)

_rt0_amd64 in turn jumps to rt0_go:

TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0
	...
	MOVL	24(SP), AX		// copy argc
	MOVL	AX, 0(SP)
	MOVQ	32(SP), AX		// copy argv
	MOVQ	AX, 8(SP)
	CALL	runtime·args(SB)
	CALL	runtime·osinit(SB)
	CALL	runtime·schedinit(SB)

	// create a new goroutine to start program
	MOVQ	$runtime·mainPC(SB), AX		// entry
	PUSHQ	AX
	CALL	runtime·newproc(SB)
	POPQ	AX

	// start this M
	CALL	runtime·mstart(SB)

	CALL	runtime·abort(SB)	// mstart should never return
	RET
...

// mainPC is a function value for runtime.main, to be passed to newproc.
// The reference to runtime.main is made via ABIInternal, since the
// actual function (not the ABI0 wrapper) is needed by newproc.
DATA	runtime·mainPC+0(SB)/8,$runtime·main<ABIInternal>(SB)
GLOBL	runtime·mainPC(SB),RODATA,$8

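rt0_go calls runtime.osinit and runtime.schedinit to perform initialization.

It then passes runtime.mainPC, which refers to runtime.main, to runtime.newproc, and newproc creates a goroutine for it.

A go func() statement is likewise compiled into a call to newproc. The goroutine created here for runtime.main is the main goroutine.

Finally, runtime.mstart starts the current M, which enters schedule. schedule uses findRunnable to find a runnable goroutine and execute to run it; execute calls gogo to switch to that g, and the main goroutine starts running. sysmon, covered in section 6, is started once runtime.main runs.

[Figure: startup flow from rt0_go to the main goroutine]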

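2 The GMP model

The GMP model is the scheduling system for goroutines in Go. Scheduling here means placing goroutines onto threads for execution, while the operating system's scheduler is responsible for placing threads onto CPUs.

2.1 The GM model

Early versions of Go used a GM model, in which G stands for a goroutine and M for a thread that runs goroutines. m threads run n goroutines, and every runnable goroutine sits in one global queue, globrunq. Because several M's take goroutines from globrunq, each access has to hold a lock.

A goroutine created by another goroutine is also put on the global queue, which again requires the lock. This hurts locality as well, because a goroutine and the goroutine it creates will most likely not run on the same thread.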

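2.2 The improved GMP model

The GM model has two problems: 1. all threads fetch goroutines from the global queue, so lock contention is heavy; 2. locality is poor.

To address them, Go introduced the GMP model. G still stands for a goroutine, M stands for machine, that is, a worker thread, and P stands for processor, which holds the resources needed to run Go code.

The explanation in the runtime source: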

// Goroutine scheduler
// The scheduler's job is to distribute ready-to-run goroutines over worker threads.
//
// The main concepts are:
// G - goroutine.
// M - worker thread, or machine.
// P - processor, a resource that is required to execute Go code.
//     M must have an associated P to execute Go code, however it can be
//     blocked or in a syscall w/o an associated P.

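Each p has a local run queue, runq, which holds up to 256 goroutines. A newly created goroutine goes into the runq of the p that created it; when that runq is full, goroutines are moved to globrunq. An M takes goroutines from the runq of its p, which needs no lock. Only when the local runq is empty does it turn to globrunq or to the runq of another p.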
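To make the overflow rule concrete, here is a small single-threaded sketch. It is a toy model, not the runtime's code: the types localQueue and globalQueue and the helper fill are names I made up, goroutines are represented by plain integers, and the real local queue uses atomic operations instead of the simple indices shown here. Like the runtime, the toy moves half of the local queue plus the new goroutine to the global queue when the ring is full.

```go
package main

import (
	"fmt"
	"sync"
)

const localCap = 256 // same capacity as p.runq

// globalQueue stands in for sched.runq: it is shared, so it needs a lock.
type globalQueue struct {
	mu    sync.Mutex
	items []int
}

// localQueue stands in for a p's runq: a fixed-size ring buffer.
type localQueue struct {
	head, tail uint32
	buf        [localCap]int
}

func (q *localQueue) size() int { return int(q.tail - q.head) }

// put enqueues g locally. When the ring is full it moves half of the
// local queue plus g to the global queue.
func (q *localQueue) put(g int, global *globalQueue) {
	if q.tail-q.head < localCap {
		q.buf[q.tail%localCap] = g
		q.tail++
		return
	}
	n := (q.tail - q.head) / 2
	batch := make([]int, 0, n+1)
	for i := uint32(0); i < n; i++ {
		batch = append(batch, q.buf[(q.head+i)%localCap])
	}
	q.head += n
	batch = append(batch, g)

	global.mu.Lock()
	global.items = append(global.items, batch...)
	global.mu.Unlock()
}

// fill enqueues n goroutine ids and reports both queue lengths.
func fill(n int) (local, global int) {
	var q localQueue
	var gq globalQueue
	for g := 0; g < n; g++ {
		q.put(g, &gq)
	}
	return q.size(), len(gq.items)
}

func main() {
	local, global := fill(257)
	fmt.Println(local, global) // 128 129
}
```

The 257th put finds the ring full, so 128 old entries and the new one go to the global queue, and only that one step needs the lock.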

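Number of P's: by default it equals the number of CPU cores, and it bounds how many goroutines can run in parallel. It can be changed with runtime.GOMAXPROCS.

Number of M's: not fixed; the runtime creates M's on demand, up to the limit stored in sched.maxmcount.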
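The number of P's can be inspected and changed from user code. The sketch below uses only runtime.NumCPU and runtime.GOMAXPROCS; the helper withProcs is a name introduced here for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// withProcs sets the number of P's to n, reads it back, and restores the
// previous value. It returns the old value and the value seen while n was set.
func withProcs(n int) (old, during int) {
	old = runtime.GOMAXPROCS(n)    // sets the value and returns the previous one
	during = runtime.GOMAXPROCS(0) // an argument below 1 only queries
	runtime.GOMAXPROCS(old)        // restore
	return old, during
}

func main() {
	fmt.Println("CPU cores:", runtime.NumCPU())
	old, during := withProcs(2)
	fmt.Println("P count before:", old, "while changed:", during)
}
```

Changing GOMAXPROCS stops the world, so it is normally done once at startup, if at all.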

2.3 Related data structures

2.3.1 runtime.g

A goroutine is represented in the runtime by a g struct:

type g struct {
	stack       stack   // offset known to runtime/cgo
	stackguard0 uintptr // offset known to liblink

	...
	m         *m      // current m; offset known to arm liblink
	sched     gobuf
	...
	atomicstatus atomic.Uint32
	
	goid         uint64
	
	preempt       bool // preemption signal, duplicates stackguard0 = stackpreempt
}

type stack struct {
	lo uintptr
	hi uintptr
}

type gobuf struct {
	sp   uintptr
	pc   uintptr
	g    guintptr
	...
}

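Some fields of less interest are omitted. The meaning of the remaining fields:

Field	Meaning
stack	The bounds of the goroutine's stack, from lo to hi
stackguard0	The stack limit compared against the stack pointer in function prologues; setting it to stackPreempt requests preemption
m	The M currently running this g
sched	The saved execution context of the goroutine
atomicstatus	The state of the goroutine
goid	The id of the goroutine
preempt	Preemption flag; duplicates stackguard0 = stackpreempt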

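A goroutine is a stackful coroutine, and the stack field describes its stack. The initial stack size is 2 KB; the stack is allocated from the heap and can grow dynamically.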
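Stack growth is easy to observe from user code. In the sketch below, recurse and deepCall are illustrative names of my own; the recursion needs far more than 2 KB of stack, yet it completes on a freshly created goroutine because the runtime grows the stack on demand.

```go
package main

import "fmt"

// recurse uses some stack in every frame and returns n.
func recurse(n int) int {
	if n == 0 {
		return 0
	}
	var pad [128]byte
	pad[n%128] = byte(n)
	// pad[(n+1)%128] was never written, so the last term is always 0.
	return recurse(n-1) + 1 + int(pad[(n+1)%128])
}

// deepCall runs recurse on a new goroutine, which starts with a small stack.
func deepCall(depth int) int {
	result := make(chan int)
	go func() { result <- recurse(depth) }()
	return <-result
}

func main() {
	fmt.Println(deepCall(100000)) // 100000
}
```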

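sched stores the goroutine's execution context and is tied to the low-level implementation of goroutine switching: sp is the stack pointer, pc is the program counter, and g points back to the g that owns this gobuf.

States of a g:

The atomicstatus field holds the goroutine's state. A goroutine can be in one of several states:

State	Meaning
_Gidle	The goroutine was just allocated and has not been initialized yet.
_Grunnable	The goroutine is waiting to run. It is on a P's local runq or on globrunq, is not executing user code, and does not own its stack.
_Grunning	The goroutine may be executing user code and has an M and a P assigned. It is not on any runq and owns its stack.
_Gsyscall	The goroutine is executing a system call and is not executing user code. It owns its stack and has an M assigned.
_Gwaiting	The goroutine is blocked: it is neither on a runq nor running. It must be recorded somewhere, for example on the wait queue of a chan or a mutex.
_Gdead	The goroutine is unused. It may be on a free list or may be just being initialized.
_Gcopystack	The goroutine's stack is being moved. It is not executing user code and is not on a runq.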

2.3.2 runtime.m

M in GMP stands for a worker thread, represented in the runtime by the m struct:

type m struct {
	g0      *g     // goroutine with scheduling stack
	gsignal       *g                // signal-handling g	
	curg          *g       // current running goroutine	
	p             puintptr // attached p for executing go code (nil if not executing go code)	
	id            int64	
	preemptoff    string // if != "", keep curg running on this m
	locks         int32
	spinning      bool // m is out of work and is actively looking for work
	mOS
}

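Some fields of less interest are omitted. The meaning of the remaining fields:

Field	Meaning
g0	The goroutine that owns the scheduling stack
gsignal	The g that handles signals
curg	The goroutine currently running on this M
p	The P attached for executing Go code (nil if not executing Go code)
id	The id of the M
preemptoff	If not empty, keep curg running on this M
locks	Lock count of this M; schedule throws if it is nonzero
spinning	The M is out of work and is actively looking for work
mOS	Operating-system-specific data of the thread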

2.3.3 runtime.p

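P in GMP stands for processor. It holds a set of resources used to run goroutines, such as the local runq, a heap memory cache, a stack memory cache, and a goroutine id cache. In the runtime it is represented by the p struct: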

type p struct {
	id          int32
	status      uint32 // one of pidle/prunning/...

	schedtick   uint32     // incremented on every scheduler call
	syscalltick uint32     // incremented on every system call
	sysmontick  sysmontick // last tick observed by sysmon
	m           muintptr   // back-link to associated m (nil if idle)
    
	// Cache of goroutine ids, amortizes accesses to runtime·sched.goidgen.
	goidcache    uint64
	goidcacheend uint64

	// Queue of runnable goroutines. Accessed without lock.
	runqhead uint32
	runqtail uint32
	runq     [256]guintptr
	
	runnext guintptr

	// Available G's (status == Gdead)
	gFree struct {
		gList
		n int32
	}

	// preempt is set to indicate that this P should be enter the
	// scheduler ASAP (regardless of what G is running on it).
	preempt bool
}

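Some fields of less interest are omitted. The meaning of the remaining fields:

Field	Meaning
id	The id of the P
status	The state of the P
schedtick	Incremented on every scheduler call
syscalltick	Incremented on every system call
sysmontick	The last tick observed by sysmon
m	Back-link to the associated M (nil if idle)
goidcache, goidcacheend	A cache of goroutine ids, so that ids need not be taken from sched.goidgen every time
runqhead, runqtail, runq	The local queue of runnable goroutines, a ring buffer of 256 entries accessed without a lock
runnext	The goroutine to run next, ahead of those in runq
gFree	A list of dead G's available for reuse
preempt	Set when this P should enter the scheduler as soon as possible

States of a p:

State	Meaning
_Pidle	The P is idle and is not being used to run user code or the scheduler. It is on the idle list and its local runq is empty.
_Prunning	The P is attached to an M and is being used to run user code or the scheduler.
_Psyscall	The P is not running user code. It has affinity to an M that is in a system call but is not owned by it, and it may be stolen by another M. This is similar to _Pidle but uses lightweight transitions and maintains M affinity.
_Pgcstop	The P is halted for STW.
_Pdead	The P is disabled. GOMAXPROCS can shrink, which leaves surplus P's disabled. If GOMAXPROCS grows again, the disabled P's are reused.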

2.3.4 runtime.schedt

One more scheduling-related data structure deserves attention: runtime.schedt, which holds the scheduler's global data. Only one instance of schedt exists:

var (
	allm       *m        // all m's, linked into a list
	gomaxprocs int32     // corresponds to GOMAXPROCS
	ncpu       int32     // number of CPU cores
	
	sched      schedt   // the scheduler's global state

	allpLock mutex     // lock protecting allp
	allp []*p          // all p's
)

The schedt struct is shown below. The global runq lives in schedt:

type schedt struct {
	goidgen   atomic.Uint64    
	
	midle        muintptr // idle m's waiting for work
	nmidle       int32    // number of idle m's waiting for work
	mnext        int64    // number of m's that have been created and next M ID
	maxmcount    int32    // maximum number of m's allowed (or die)
	nmsys        int32    // number of system m's not counted for deadlock
	nmfreed      int64    // cumulative number of freed m's

	ngsys atomic.Int32 // number of system goroutines

	pidle        puintptr // idle p's
	npidle       atomic.Int32
	nmspinning   atomic.Int32  // See "Worker thread parking/unparking" comment in proc.go.

	// Global runnable queue.
	runq     gQueue
	runqsize int32

	// Global cache of dead G's.
	gFree struct {
		lock    mutex
		stack   gList // Gs with stacks
		noStack gList // Gs without stacks
		n       int32
	}
}

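The meaning of the fields:

Field	Meaning
goidgen	Generator for goroutine ids
midle	Idle M's waiting for work
nmidle	Number of idle M's waiting for work
mnext	Number of M's created so far, and the next M id
maxmcount	Maximum number of M's allowed
nmsys	Number of system M's not counted for deadlock detection
nmfreed	Cumulative number of freed M's
ngsys	Number of system goroutines
pidle	Idle P's
npidle	Number of idle P's
nmspinning	Number of spinning M's
runq, runqsize	The global run queue and its length
gFree	Global cache of dead G's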

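2.4 g0 and m0

Every worker thread M has a g0, whose main job is to run the scheduler. When the scheduler needs to run, execution switches to the g0 stack, and the scheduler then looks for a ready goroutine and switches to it.

Two functions switch to the g0 stack to run code: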

func mcall(fn func(*g))

func systemstack(fn func())

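mcall: switches from the calling goroutine's stack to the g0 stack and runs fn there. mcall may only be called from a g other than g0 and gsignal.
systemstack: runs fn on the system stack and then switches back.

m0 is the first thread of the process, that is, the thread on which the main goroutine first runs.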

3 Creating and exiting a G

In a program we usually create a goroutine like this:

go func()

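This statement is compiled into a call to runtime.newproc, and newproc creates the new goroutine.

We can compile the code to assembly to see how go func() is executed.

Compile the sample code below to assembly: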

go build -gcflags -S main.go

package main

import (
	"fmt"
	"time"
)

func print() {
	fmt.Println("hello, GMP")
	time.Sleep(time.Second)
}

func main() {
	go print()

	select {}
}

The assembly is as follows:

"".main STEXT size=50 args=0x0 locals=0x10 funcid=0x0 align=0x0
        0x0000 00000 (/root/RemoteWorking/main.go:13)   TEXT    "".main(SB), ABIInternal, $16-0
        0x0000 00000 (/root/RemoteWorking/main.go:13)   CMPQ    SP, 16(R14)
        0x0004 00004 (/root/RemoteWorking/main.go:13)   PCDATA  $0, $-2
        0x0004 00004 (/root/RemoteWorking/main.go:13)   JLS     43
        0x0006 00006 (/root/RemoteWorking/main.go:13)   PCDATA  $0, $-1
        0x0006 00006 (/root/RemoteWorking/main.go:13)   SUBQ    $16, SP
        0x000a 00010 (/root/RemoteWorking/main.go:13)   MOVQ    BP, 8(SP)
        0x000f 00015 (/root/RemoteWorking/main.go:13)   LEAQ    8(SP), BP
        0x0014 00020 (/root/RemoteWorking/main.go:13)   FUNCDATA        $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0014 00020 (/root/RemoteWorking/main.go:13)   FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0014 00020 (/root/RemoteWorking/main.go:14)   LEAQ    "".print·f(SB), AX
        0x001b 00027 (/root/RemoteWorking/main.go:14)   PCDATA  $1, $0
        0x001b 00027 (/root/RemoteWorking/main.go:14)   NOP
        0x0020 00032 (/root/RemoteWorking/main.go:14)   CALL    runtime.newproc(SB)

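The address of the function value print·f is loaded into AX, and then runtime.newproc is called.

The code of runtime.newproc is as follows: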

func newproc(fn *funcval) {
	gp := getg()          // get the current g
	pc := getcallerpc() 
	systemstack(func() {
		newg := newproc1(fn, gp, pc)     // create a new goroutine
 
		pp := getg().m.p.ptr()           // get the p attached to the m running the current g
		runqput(pp, newg, true)          // put the new goroutine on the run queue

		if mainStarted {
			wakep()                     // try to wake another p to run goroutines
		}
	})
}

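Because its third argument is true here, runqput places the new goroutine in p's runnext slot. The goroutine that previously occupied runnext is moved to the tail of the local runq, so a newly created goroutine tends to run before the goroutines already queued.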

In short, newproc creates a new g and puts it on the local runq of the p attached to the m that is running the current g.
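The effect of the runnext slot can often be seen from user code. The following sketch (startOrder is an illustrative name of my own) starts several goroutines on a single P and records the order in which they run. On many Go versions the goroutine created last runs first, because it still sits in runnext when the creating goroutine blocks, but this ordering is an implementation detail and is not guaranteed.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// startOrder starts n goroutines on a single P and records the order in
// which they actually run.
func startOrder(n int) []int {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		order []int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			mu.Lock()
			order = append(order, id)
			mu.Unlock()
		}(i)
	}
	wg.Wait() // blocks the creator, so the queued goroutines get to run
	return order
}

func main() {
	fmt.Println(startOrder(10))
}
```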

The code of newproc1 is as follows:

func newproc1(fn *funcval, callergp *g, callerpc uintptr) *g {
	...
	mp := acquirem() // disable preemption
	pp := mp.p.ptr()    // get the p attached to the current m
	newg := gfget(pp)   // try to reuse a g from p's free list
	if newg == nil {
		newg = malg(_StackMin)    // nothing to reuse, so allocate a new g; _StackMin is 2048, that is, 2 KB
		casgstatus(newg, _Gidle, _Gdead)
		allgadd(newg) // publishes with a g->status of Gdead so GC scanner doesn't look at uninitialized stack.
	}
	...

	// pc is first set to goexit; gostartcallfn below gives it special treatment
	newg.sched.pc = abi.FuncPCABI0(goexit) + sys.PCQuantum // +PCQuantum so that previous instruction is in same function
	newg.sched.g = guintptr(unsafe.Pointer(newg))
	gostartcallfn(&newg.sched, fn)
	newg.gopc = callerpc
	newg.ancestors = saveAncestors(callergp)
	newg.startpc = fn.fn
	...
	casgstatus(newg, _Gdead, _Grunnable)    // change the state of the g
	
	newg.goid = pp.goidcache  // assign a goid
	pp.goidcache++
	
	releasem(mp)

	return newg
}

func gostartcallfn(gobuf *gobuf, fv *funcval) {
	var fn unsafe.Pointer
	if fv != nil {
		fn = unsafe.Pointer(fv.fn)
	} else {
		fn = unsafe.Pointer(abi.FuncPCABIInternal(nilfunc))
	}
	gostartcall(gobuf, fn, unsafe.Pointer(fv))
}

func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) {
	sp := buf.sp
	sp -= goarch.PtrSize
	*(*uintptr)(unsafe.Pointer(sp)) = buf.pc   // push goexit onto the goroutine's stack as the return address
	buf.sp = sp
	buf.pc = uintptr(fn)
	buf.ctxt = ctxt
}

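newproc1 obtains a g, sets its fields, assigns it an id, and changes its state to _Grunnable. Note in particular that gostartcall pushes the address of goexit onto the goroutine's stack as a return address, so when the goroutine's function returns, control lands in goexit.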

When debugging a Go program in GoLand, runtime.goexit is visible at the bottom of the call stack, as if runtime.goexit had called runtime.main, which in turn called main.main.
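The same call stack can be printed without a debugger. This sketch uses the public runtime.Callers and runtime.CallersFrames APIs; callStack is an illustrative helper of my own. When run from main, the output typically ends with main.main, runtime.main, and runtime.goexit, which matches the frame layout set up by gostartcall.

```go
package main

import (
	"fmt"
	"runtime"
)

// callStack returns the function names on the current goroutine's stack,
// innermost first.
func callStack() []string {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(1, pcs) // skip the frame of runtime.Callers itself
	frames := runtime.CallersFrames(pcs[:n])

	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

func main() {
	for _, name := range callStack() {
		fmt.Println(name)
	}
}
```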

runtime.goexit itself is a piece of assembly:

TEXT runtime·goexit(SB),NOSPLIT|TOPFRAME,$0-0
	BYTE	$0x90	// NOP
	CALL	runtime·goexit1(SB)	// does not return
	// traceback from goexit1 must hit code range of goexit
	BYTE	$0x90	// NOP

It calls runtime.goexit1:

func goexit1() {
	mcall(goexit0)
}

func goexit0(gp *g) {
	// reset the state of the g
	mp := getg().m
	pp := mp.p.ptr()

	casgstatus(gp, _Grunning, _Gdead)
	gcController.addScannableStack(pp, -int64(gp.stack.hi-gp.stack.lo))
	if isSystemGoroutine(gp, false) {
		sched.ngsys.Add(-1)
	}
	gp.m = nil
	locked := gp.lockedm != 0
	gp.lockedm = 0
	mp.lockedg = 0
	gp.preemptStop = false
	gp.paniconfault = false
	gp._defer = nil // should be true already but just in case.
	gp._panic = nil // non-nil for Goexit during panic. points at stack-allocated data.
	gp.writebuf = nil
	gp.waitreason = waitReasonZero
	gp.param = nil
	gp.labels = nil
	gp.timer = nil

	dropg()    // detach the current g from the m
  
	gfput(pp, gp)  // put the g on p's gFree list
	
	schedule()   // start a new round of scheduling
}

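The path from runtime.goexit through runtime.goexit1 ends in runtime.goexit0, which resets the g's state, detaches the g from the m, and puts it on p's gFree list for later reuse. It then calls schedule, which is the entry point of scheduling, so the loop is closed.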
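User code can take the same exit path explicitly through the public function runtime.Goexit, which runs the deferred calls of the goroutine and then calls goexit1. A minimal sketch, where exitEarly is an illustrative name:

```go
package main

import (
	"fmt"
	"runtime"
)

// exitEarly runs a goroutine that terminates itself with runtime.Goexit
// and reports which parts of it ran.
func exitEarly() (deferredRan, afterRan bool) {
	done := make(chan struct{})
	go func() {
		defer close(done)                     // runs last
		defer func() { deferredRan = true }() // runs first
		runtime.Goexit()                      // runs deferred calls, then ends this goroutine
		afterRan = true                       // never reached
	}()
	<-done
	return deferredRan, afterRan
}

func main() {
	deferredRan, afterRan := exitEarly()
	fmt.Println(deferredRan, afterRan) // true false
}
```

Only the calling goroutine ends; the rest of the program keeps running, and the g becomes available for reuse as described above.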


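4 The scheduling loop

Go's scheduler keeps placing goroutines on threads. A new round of scheduling, which picks another goroutine to run, starts whenever a goroutine finishes, blocks, yields voluntarily, or uses up its time slice. The overall flow is:

mstart -> schedule -> execute -> user code, and from there back to schedule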

4.1 runtime.schedule

schedule is the entry point of a scheduling round. It always runs on the g0 stack.

The code is as follows:

func schedule() {
	mp := getg().m
	
	// scheduling while the thread holds locks is not allowed, since it would cause internal runtime errors
	if mp.locks != 0 {       
		throw("schedule: holding locks")
	}
	
	// if this M is locked to a G, the M cannot be used to run any other G
	if mp.lockedg != 0 {
		stoplockedm()
		execute(mp.lockedg.ptr(), false) // Never returns.
	}

	// scheduling during a cgo call is not allowed, because cgo is using the g0 stack
	if mp.incgo {
		throw("schedule: in cgo")
	}

top:
	pp := mp.p.ptr()        // get the p attached to the current m
	pp.preempt = false

	if mp.spinning && (pp.runnext != 0 || pp.runqhead != pp.runqtail) {
		throw("schedule: spinning with local work")
	}
	
	// find a runnable g, blocking until one is available
	gp, inheritTime, tryWakeP := findRunnable() // blocks until work is available

	// the spinning thread has found work, so clear its spinning state
	if mp.spinning {
		resetspinning()
	}

	...

	// If about to schedule a not-normal goroutine (a GCworker or tracereader),
	// wake a P if there is one.
	if tryWakeP {
		wakep()
	}
	if gp.lockedm != 0 {
		// Hands off own p to the locked m,
		// then blocks waiting for a new p.
		startlockedm(gp)
		goto top
	}
	
	// call execute to run the g
	execute(gp, inheritTime)
}

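schedule first checks whether the current M is locked to a G, in which case it cannot run any other G, and whether a cgo call is in progress; scheduling is not allowed during cgo because cgo is using the g0 stack.

It then calls findRunnable to get a runnable g and execute to run that g.

4.2 runtime.findRunnable

findRunnable looks for a goroutine in the _Grunnable state to run.

If tracing is enabled, it first tries to schedule the trace reader: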

func findRunnable() (gp *g, inheritTime, tryWakeP bool) {
	mp := getg().m

	// The conditions here and in handoffp must agree: if
	// findrunnable would return a G to run, handoffp must start
	// an M.
	...
	// Try to schedule the trace reader.
	if trace.enabled || trace.shutdown {
		gp := traceReader()
		if gp != nil {
			casgstatus(gp, _Gwaiting, _Grunnable)
			traceGoUnpark(gp, 0)
			return gp, false, true
		}
	}
	...
}

If there is no trace reader to run and the GC is in its mark phase, it tries to schedule a GC worker:

// Try to schedule a GC worker.
	if gcBlackenEnabled != 0 {
		gp, tnow := gcController.findRunnableGCWorker(pp, now)    // try to get a GC worker
		if gp != nil {
			return gp, false, true
		}
		now = tnow
	}

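To keep goroutines on the global queue from starving behind goroutines on the local queue, every 61st scheduling round, counted by p.schedtick, takes one goroutine from the global queue: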

	// Check the global runnable queue once in a while to ensure fairness.
	// Otherwise two goroutines can completely occupy the local runqueue
	// by constantly respawning each other.
	if pp.schedtick%61 == 0 && sched.runqsize > 0 {
		lock(&sched.lock)
		gp := globrunqget(pp, 1)   // on every 61st scheduling round, take one goroutine from the global queue
		unlock(&sched.lock)
		if gp != nil {
			return gp, false, false
		}
	}

Next it tries to get a goroutine from p's local runq:

// local runq
	if gp, inheritTime := runqget(pp); gp != nil {       // take from p's local runq
		return gp, inheritTime, false
	}

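If p's local runq has no goroutine, it turns to the global runq. globrunqget returns one goroutine to run and moves a batch of others from the global runq into p's local runq: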

// global runq
	if sched.runqsize != 0 {
		lock(&sched.lock)
		gp := globrunqget(pp, 0)
		unlock(&sched.lock)
		if gp != nil {
			return gp, false, false
		}
	}

If there is still no goroutine, it asks the netpoller for goroutines whose network I/O is ready:

// if the netpoller is initialized and has waiters, call netpoll to find goroutines whose network I/O is ready
if netpollinited() && netpollWaiters.Load() > 0 && sched.lastpoll.Load() != 0 {
		if list := netpoll(0); !list.empty() { // non-blocking
			gp := list.pop()        // netpoll returns a list of goroutines; take the first one
			injectglist(&list)      // put the remaining goroutines on run queues
			casgstatus(gp, _Gwaiting, _Grunnable)
			if trace.enabled {
				traceGoUnpark(gp, 0)
			}
			return gp, false, false
		}
	}

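If no goroutine has been found so far, the M tries to steal goroutines from the runq of another P. To limit CPU use, only a bounded number of M's may spin looking for work at the same time. stealWork walks over the P's in allp to pick a victim P: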

if mp.spinning || 2*sched.nmspinning.Load() < gomaxprocs-sched.npidle.Load() {
		if !mp.spinning {
			mp.becomeSpinning()
		}

		gp, inheritTime, tnow, w, newWork := stealWork(now)      // steal work from other P's
		if gp != nil {
			// Successfully stole.
			return gp, inheritTime, false
		}
		if newWork {
			// There may be new timer or GC work; restart to
			// discover.
			goto top
		}

		now = tnow
		if w != 0 && (pollUntil == 0 || w < pollUntil) {
			// Earlier timer to wait for.
			pollUntil = w
		}
	}
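The lookup order described in this section can be condensed into a toy model. Everything in the sketch below is hypothetical (toyP, toySched, pop, and this findRunnable are my own simplified stand-ins, with goroutines represented by strings); it leaves out the trace reader, GC workers, the netpoller, locking, and spinning, and keeps only the order: fairness check, local runq, global runq, then stealing about half of another P's queue.

```go
package main

import "fmt"

type toyP struct {
	schedtick uint32
	runq      []string // local run queue
}

type toySched struct {
	runq []string // global run queue (the real one is guarded by sched.lock)
}

// pop removes and returns the head of q.
func pop(q *[]string) (string, bool) {
	if len(*q) == 0 {
		return "", false
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g, true
}

// findRunnable mirrors the lookup order only. It returns the goroutine
// and the place it was found.
func findRunnable(p *toyP, s *toySched, others []*toyP) (g, from string) {
	if p.schedtick%61 == 0 {
		if g, ok := pop(&s.runq); ok {
			return g, "global (fairness check)"
		}
	}
	if g, ok := pop(&p.runq); ok {
		return g, "local"
	}
	if g, ok := pop(&s.runq); ok {
		return g, "global"
	}
	for _, victim := range others {
		n := len(victim.runq) - len(victim.runq)/2 // take about half
		if n == 0 {
			continue
		}
		stolen := append([]string(nil), victim.runq[:n]...)
		victim.runq = victim.runq[n:]
		p.runq = append(p.runq, stolen[1:]...)
		return stolen[0], "stolen"
	}
	return "", "none"
}

func main() {
	s := &toySched{runq: []string{"g-global"}}
	p := &toyP{schedtick: 1, runq: []string{"g-local"}}
	fmt.Println(findRunnable(p, s, nil)) // g-local local

	p.schedtick = 61
	p.runq = []string{"g-local"}
	fmt.Println(findRunnable(p, s, nil)) // g-global global (fairness check)
}
```

With schedtick at 61 the global queue wins even though the local queue is not empty, which is exactly the starvation guard shown earlier.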

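4.3 runtime.execute and runtime.gogo

runtime.execute binds the g to the current M, marks it _Grunning, and calls gogo to switch to the goroutine. gogo is written in assembly; it restores the context saved in the goroutine's gobuf and resumes the goroutine.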

func execute(gp *g, inheritTime bool) {
	mp := getg().m

	...

	mp.curg = gp  // set the g currently running on this m
	gp.m = mp     // attach the g to the current m
	casgstatus(gp, _Grunnable, _Grunning)   // switch the g's state to _Grunning
	gp.waitsince = 0
	gp.preempt = false
	gp.stackguard0 = gp.stack.lo + _StackGuard
	if !inheritTime {
		mp.p.ptr().schedtick++
	}

	...
    
	gogo(&gp.sched)       // call gogo to switch to the goroutine
}

4.4 runtime.gopark and runtime.goready

gopark parks the current goroutine and puts it in the _Gwaiting state:

func gopark(unlockf func(*g, unsafe.Pointer) bool, lock unsafe.Pointer, reason waitReason, traceEv byte, traceskip int) {
	...
	// can't do anything that might move the G between Ms here.
	mcall(park_m)
}

func park_m(gp *g) {
	mp := getg().m

	if trace.enabled {
		traceGoPark(mp.waittraceev, mp.waittraceskip)
	}

	// change the g's state to _Gwaiting
	casgstatus(gp, _Grunning, _Gwaiting)
	dropg()

	if fn := mp.waitunlockf; fn != nil {
		ok := fn(gp, mp.waitlock)
		mp.waitunlockf = nil
		mp.waitlock = nil
		if !ok {
			if trace.enabled {
				traceGoUnpark(gp, 2)
			}
			casgstatus(gp, _Gwaiting, _Grunnable)
			execute(gp, true) // Schedule it back, never returns.
		}
	}
	// start a new round of scheduling
	schedule()
}

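A goroutine that has to block calls gopark. For example, when a goroutine blocks on a chan operation, the goroutine is placed on the chan's wait queue and the chan code calls gopark.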
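A parked goroutine can be observed with runtime.Stack, which prints the wait reason next to each goroutine. In this sketch (blockedOnChan is an illustrative name of my own) a goroutine blocks on a receive, and the dump is searched for the "chan receive" wait reason. The polling loop is only there to give the goroutine time to park.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"time"
)

// blockedOnChan starts a goroutine that blocks receiving from ch and
// reports whether a stack dump shows a goroutine waiting in "chan receive".
func blockedOnChan() bool {
	ch := make(chan int)
	done := make(chan struct{})
	go func() {
		<-ch // nothing has been sent yet, so the goroutine parks here
		close(done)
	}()

	seen := false
	buf := make([]byte, 1<<16)
	for i := 0; i < 100 && !seen; i++ {
		time.Sleep(time.Millisecond)  // give the goroutine time to park
		n := runtime.Stack(buf, true) // true: include all goroutines
		seen = strings.Contains(string(buf[:n]), "chan receive")
	}

	ch <- 1 // the send makes the receiver runnable again
	<-done
	return seen
}

func main() {
	fmt.Println(blockedOnChan())
}
```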

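goready wakes a parked goroutine. For example, the chan code calls goready once the waiting goroutine can proceed. goready uses systemstack to switch to the g0 stack and call ready; ready changes the goroutine's state to _Grunnable and puts it on a runq, where it waits to be scheduled.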

func goready(gp *g, traceskip int) {
	systemstack(func() {
		ready(gp, traceskip, true)
	})
}

func ready(gp *g, traceskip int, next bool) {
	status := readgstatus(gp)

	// Mark runnable.
	mp := acquirem() // disable preemption because it can be holding p in a local var
	if status&^_Gscan != _Gwaiting {
		dumpgstatus(gp)
		throw("bad g->status in ready")
	}

	// switch the g's state to _Grunnable
	casgstatus(gp, _Gwaiting, _Grunnable)
	// put the g on the runq
	runqput(mp.p.ptr(), gp, next)
	// try to wake another p
	wakep()
	releasem(mp)
}

4.5 Work stealing and the handoff mechanism

work stealing

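When a thread has no work available and finds none in the global queue either, it does not go to sleep or get destroyed right away. Instead it tries to steal part of the work from another P.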

handoff

When a goroutine is in a system call, the whole thread may block. To keep all CPU cores busy, the P detaches from that M and finds or creates another M to run its work.

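5 Preemptive scheduling

Go schedules goroutines preemptively. In an operating system, preemptive scheduling is built on time slices and the clock interrupt: when a slice expires, the interrupt fires, the current context is saved, and the context of the next task is restored.

To keep one goroutine from monopolizing the CPU, a goroutine may run for about 10ms before it becomes a candidate for preemption. Go's scheduler runs in user space, however, where there is no clock interrupt, so Go has to implement preemption by other means.

Besides a goroutine yielding voluntarily through Gosched, preemption in the Go scheduler is implemented as a hook: the Go compiler inserts stack-check code at the start of functions, and the same check is used to trigger preemptive scheduling.

When the check fails, runtime.morestack (or runtime.morestack_noctxt) is called, and it in turn calls runtime.newstack.

The preemption-related code is as follows: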

func newstack() {
    ...
	// load stackguard0
	stackguard0 := atomic.Loaduintptr(&gp.stackguard0)
	
	// check whether stackguard0 has been marked for preemption
	preempt := stackguard0 == stackPreempt
	
	if preempt {
		...

		// trigger preemption
		gopreempt_m(gp) // never return
	}
}

func gopreempt_m(gp *g) {
	...
	goschedImpl(gp)
}

func goschedImpl(gp *g) {
	status := readgstatus(gp)
	if status&^_Gscan != _Grunning {
		dumpgstatus(gp)
		throw("bad g status")
	}
	casgstatus(gp, _Grunning, _Grunnable)
	dropg()
	lock(&sched.lock)
	globrunqput(gp)
	unlock(&sched.lock)
	
	// start a new round of scheduling
	schedule()
}

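newstack checks whether stackguard0 equals the sentinel value stackPreempt (0xfffffade), which is larger than any real stack pointer.

When a goroutine should be preempted, its stackguard0 is set to stackPreempt. The next time that goroutine calls a function, the stack check at the start of the function fails, morestack is entered, and the goroutine gives up the CPU through schedule.

A goroutine that never calls a function therefore cannot be preempted this way and can occupy the CPU indefinitely.

5.1 Asynchronous preemption

Up to go1.13, preemption depended entirely on this check. Consider running the following program with go1.13: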

package main

import "fmt"

func fn(i int) {
	for {
		i++
		fmt.Println(i)
	}
}

func main() {
	go fn(0)

	for {
	}
}

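After the program starts it keeps printing numbers, but on my machine it stops at a little over 150,000 and the whole program hangs.

fmt.Println allocates memory and eventually triggers a GC. The GC needs an STW (Stop The World) pause, which requires every running goroutine to stop, but the main goroutine makes no function calls and therefore cannot be preempted. The result is a deadlock.

go1.14 introduced an asynchronous preemption mechanism that can preempt a goroutine even when it makes no function calls.

It is built on operating system signals: the runtime sends a signal to the thread running the goroutine to be preempted, and the signal handler sighandler then preempts that goroutine.

The signal used is SIGURG.

The relevant part of sighandler: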

func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) {
	...
	// if the signal is sigPreempt and asynchronous preemption is enabled
	if sig == sigPreempt && debug.asyncpreemptoff == 0 && !delayedSignal {
		// preempt
		doSigPreempt(gp, c)
		
	}
	...
}

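doSigPreempt checks whether the goroutine is at an asynchronous safe point. If it is, doSigPreempt injects a call to asyncPreempt, so that the goroutine runs asyncPreempt when the signal handler returns: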

func doSigPreempt(gp *g, ctxt *sigctxt) {
	...
	if wantAsyncPreempt(gp) {
		if ok, newpc := isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()); ok {
			// Adjust the PC and inject a call to asyncPreempt.
			ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc)
		}
	}

	...
}

asyncPreempt calls asyncPreempt2, which eventually reaches schedule:

func asyncPreempt2() {
	...
		mcall(gopreempt_m)
	...
}

func gopreempt_m(gp *g) {
	...
	goschedImpl(gp)
}

func goschedImpl(gp *g) {
	...

	schedule()
}
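The effect of asynchronous preemption can be checked with a variation of the earlier program. In the sketch below (spinThenStop is an illustrative name of my own) a goroutine spins in a loop without function calls on a single P. I am assuming here that the atomic load is compiled inline on the target platform, and that the platform supports signal-based preemption. With go1.14 or later the calling goroutine still gets to run and stops the spinner; with asynchronous preemption disabled, the call may never return.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spinThenStop starts a goroutine that spins without function calls on a
// single P, then stops it from the calling goroutine. It returns how long
// the whole exchange took.
func spinThenStop() time.Duration {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	go func() {
		for atomic.LoadInt32(&stop) == 0 {
			// tight loop: no function call, so no stack check runs here
		}
		close(done)
	}()

	start := time.Now()
	time.Sleep(20 * time.Millisecond) // lets the spinner take over the only P
	atomic.StoreInt32(&stop, 1)       // only reached once the spinner is preempted
	<-done
	return time.Since(start)
}

func main() {
	fmt.Println("spinner stopped after", spinThenStop())
}
```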

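6 The system monitor thread sysmon

sysmon (system monitor) is a monitoring thread of the runtime.

It is started in runtime.main.

sysmon runs on its own thread and does not need a P.

The main tasks of sysmon are:

1. Deadlock detection
2. Polling the network
3. Retaking P's that are stuck in system calls
4. Preempting long-running G's
5. Triggering GC periodically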

func sysmon() {
	lock(&sched.lock)
	sched.nmsys++
	// 1. deadlock detection
	checkdead()
	unlock(&sched.lock)

	...

	for {
		// sleep for a while
		if idle == 0 { // start with 20us sleep...
			delay = 20
		} else if idle > 50 { // start doubling the sleep after 1ms...
			delay *= 2
		}
		if delay > 10*1000 { // up to 10ms
			delay = 10 * 1000
		}
		usleep(delay)

		...
		
		// poll network if not polled for more than 10ms
		// 2. poll the network if the last poll was more than 10ms ago
		lastpoll := sched.lastpoll.Load()
		if netpollinited() && lastpoll != 0 && lastpoll+10*1000*1000 < now {
			sched.lastpoll.CompareAndSwap(lastpoll, now)
			list := netpoll(0) // non-blocking - returns list of goroutines
			if !list.empty() {
				incidlelocked(-1)
				injectglist(&list)
				incidlelocked(1)
			}
		}
		...
        
		// retake P's blocked in syscalls
		// and preempt long running G's
		// 3. retake P's blocked in system calls
		// 4. preempt G's that have been running for a long time
		if retake(now) != 0 {
			idle = 0
		} else {
			idle++
		}
        
		// check if we need to force a GC
		// 5. trigger GC periodically
		if t := (gcTrigger{kind: gcTriggerTime, now: now}); t.test() && forcegc.idle.Load() {
			lock(&forcegc.lock)
			forcegc.idle.Store(false)
			var list gList
			list.push(forcegc.g)
			injectglist(&list)
			unlock(&forcegc.lock)
		}
		if debug.schedtrace > 0 && lasttrace+int64(debug.schedtrace)*1000000 <= now {
			lasttrace = now
			schedtrace(debug.scheddetail > 0)
		}
		unlock(&sched.sysmonlock)
	}
}
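The sleep interval at the top of the loop is worth a closer look. The sketch below copies that computation into a standalone function so it can be run in isolation; nextDelay is my own name, and idle and delay play the same roles as the local variables in sysmon, where idle is reset to 0 whenever retake did some work.

```go
package main

import "fmt"

// nextDelay reproduces the sleep computation at the top of sysmon's loop.
// idle counts consecutive rounds in which retake found nothing to do;
// delay is the previous sleep in microseconds.
func nextDelay(idle int, delay uint32) uint32 {
	if idle == 0 { // start with 20us sleep
		delay = 20
	} else if idle > 50 { // start doubling the sleep after 1ms
		delay *= 2
	}
	if delay > 10*1000 { // up to 10ms
		delay = 10 * 1000
	}
	return delay
}

func main() {
	delay := uint32(0)
	for idle := 0; idle <= 60; idle++ {
		delay = nextDelay(idle, delay)
	}
	fmt.Println(delay) // 10000
}
```

As long as there is work, sysmon wakes every 20us. After 50 idle rounds the interval doubles each round until it reaches the 10ms cap, which keeps an idle program from burning CPU in the monitor thread.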