Scheduling series
Scheduling series: goroutine
The previous article introduced the goroutine. Its essence fits in one sentence: a goroutine is a user-space task. Strictly speaking, it is not accurate to say that a goroutine "runs", because a task can only be executed. So who executes a goroutine? The m does.
In the GMP architecture, m represents the active ability to execute, and one m corresponds to one thread. Note that m only corresponds to an OS thread: threads are managed by the operating system, but in user space we can still exert a certain degree of control over them through synchronization mechanisms.
By analogy with a task system, a goroutine corresponds to a task and an m corresponds to a worker. A task system creates a certain number of workers; each worker fetches a task, executes it, and repeats. In a simple task system, the two objects worker and task are entirely sufficient, with all tasks sitting in one global queue (or some other data structure). Go's scheduler did in fact start out with a GM architecture. But Go's scheduling system is clearly not a simple task system, so Go added an intermediate layer, P, between G and M. P represents the permission to execute and the resources for execution; it will be covered in the next article.
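To make the worker/task analogy concrete, here is a minimal sketch of a simple task system with one global queue, roughly the shape of a GM-style design. The names task and runWorkers are made up for this illustration and are not runtime identifiers; the workers here are themselves goroutines, which only stand in for the role m plays.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a unit of work; it plays the role of a goroutine (G).
type task func() int

// runWorkers starts n workers (the analogue of M) that all pull tasks from
// one shared queue, and returns the sum of the task results.
func runWorkers(n int, tasks []task) int {
	queue := make(chan task) // the single global queue
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Fetch a task, execute it, repeat until the queue is closed.
			for t := range queue {
				r := t()
				mu.Lock()
				total += r
				mu.Unlock()
			}
		}()
	}
	for _, t := range tasks {
		queue <- t
	}
	close(queue)
	wg.Wait()
	return total
}

func main() {
	tasks := make([]task, 0, 10)
	for i := 1; i <= 10; i++ {
		v := i
		tasks = append(tasks, func() int { return v })
	}
	fmt.Println(runWorkers(3, tasks)) // sum of 1..10
}
```

Every worker contends on the same queue in this sketch, which is the kind of bottleneck that a per-P layer can relieve.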
Contents
- State diagram of m
- Operations on m
- newm
- mstart
- mexit
- startm
- stopm
- The m object
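State diagram of m
Before going into the details, let's again start with an overall state diagram.
Note that m differs from g here: g has an explicit status field that records its state, while m has no such field. Even though m has no status field and no enumerable state values, we can still abstract a set of states and describe the transitions between them.
First, the meaning of each state.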
- running.
  The m is running. An m in the running state is executing some goroutine or calling findrunnable to look for a runnable goroutine. Note that while m is running, its g may be in either the running state or the syscall state.
- spinning.
  The m is spinning; m has a spinning field that records whether it is in this state. There is currently no goroutine to execute in the system, but m does not park immediately and instead keeps trying to find runnable work. Spinning exists to reduce thread switching, which is relatively expensive.
- idle.
  The m is idle. It sits in the global queue (schedt.midle), and its thread is blocked on a condition, waiting to be woken. Normally m tries spinning before switching to idle. However, Go limits the maximum number of spinning m's; if too many are already spinning, m switches directly to idle.
A newly created m starts in either the running or the spinning state. The author is not yet sure of all the cases that lead to spinning, but the startm code later in this article shows one: when the caller passes spinning=true, the new m is given mspinning as its start function.
When a running m cannot find a runnable goroutine, it switches to spinning, and after spinning for a while it becomes idle. In another case, when m returns from a system call and cannot acquire a p, it switches to spinning.
As noted above, the number of spinning m's is limited; once that limit is reached, a running m switches directly to idle. When a new m is needed, the runtime first tries to take one from the schedt.midle queue and only creates one through newm if none is available.
That is roughly how m moves between states. The details follow.
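The transitions described above can be written down as a small state machine. This is only a sketch of the article's abstraction: mState, onNoWork and onWake are hypothetical names, and the runtime stores no such state value on m.

```go
package main

import "fmt"

// mState is a hypothetical enum for illustration; the runtime's m struct
// has no status field.
type mState int

const (
	running mState = iota
	spinning
	idle
)

func (s mState) String() string {
	return [...]string{"running", "spinning", "idle"}[s]
}

// onNoWork returns the next state of an m that cannot find a runnable
// goroutine. atSpinLimit reports whether the cap on spinning m's is reached.
func onNoWork(cur mState, atSpinLimit bool) mState {
	if cur == running && !atSpinLimit {
		return spinning
	}
	// A spinning m that still finds nothing, or a running m that is not
	// allowed to spin, is parked.
	return idle
}

// onWake returns the state of an idle m after it is woken; the waker decides
// whether it resumes in the spinning state.
func onWake(wantSpinning bool) mState {
	if wantSpinning {
		return spinning
	}
	return running
}

func main() {
	s := running
	s = onNoWork(s, false)
	fmt.Println(s) // spinning
	s = onNoWork(s, false)
	fmt.Println(s) // idle
	s = onWake(false)
	fmt.Println(s) // running
	fmt.Println(onNoWork(running, true)) // idle
}
```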
Operations on m
The operations on m mainly involve newm, mstart, mexit, startm and a few other functions, which are introduced one by one below.
newm
newm is the entry point for creating an m (and, as far as I can tell, the only one). newm creates the m object and associates it with an OS thread to run; fn is the function passed in for it to run. In some cases (not explored here) the OS thread cannot be created directly and the work goes through newmHandoff instead; that part is omitted from the code below.
// src/runtime/proc.go, line 2096
func newm(fn func(), _p_ *p, id int64) {
	// allocm adds a new M to allm, but they do not start until created by
	// the OS in newm1 or the template thread.
	//
	// doAllThreadsSyscall requires that every M in allm will eventually
	// start and be signal-able, even with a STW.
	//
	// Disable preemption here until we start the thread to ensure that
	// newm is not preempted between allocm and starting the new thread,
	// ensuring that anything added to allm is guaranteed to eventually
	// start.
	acquirem()

	mp := allocm(_p_, fn, id)
	mp.nextp.set(_p_)
	mp.sigmask = initSigmask
	if gp := getg(); gp != nil && gp.m != nil && (gp.m.lockedExt != 0 || gp.m.incgo) && GOOS != "plan9" {
		...
	}
	newm1(mp)
	releasem(getg().m)
}
newm begins by calling acquirem to prevent preemption and calls releasem at the end to allow it again. Both acquirem and releasem do their job by operating on the locks field of m.
//go:nosplit
func acquirem() *m {
	_g_ := getg()
	_g_.m.locks++
	return _g_.m
}

//go:nosplit
func releasem(mp *m) {
	_g_ := getg()
	mp.locks--
	if mp.locks == 0 && _g_.preempt {
		// restore the preemption request in case we've cleared it in newstack
		_g_.stackguard0 = stackPreempt
	}
}
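The idea behind acquirem and releasem, a nesting counter that defers preemption while it is non-zero, can be sketched in isolation. The worker type and its fields below are hypothetical and single-threaded; the real runtime re-arms the request through stackguard0 rather than acting on it immediately, as this sketch does.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for the runtime's m.
type worker struct {
	locks          int  // nesting depth of acquire calls
	preemptPending bool // a preemption request arrived while locks > 0
	preempted      int  // how many preemptions actually took effect
}

// acquire disables preemption for this worker; calls may nest.
func (w *worker) acquire() { w.locks++ }

// release undoes one acquire. When the count drops to zero, a preemption
// request that was deferred in the meantime is honored.
func (w *worker) release() {
	w.locks--
	if w.locks == 0 && w.preemptPending {
		w.preemptPending = false
		w.preempted++
	}
}

// requestPreempt asks the worker to yield. While locks are held the request
// is only recorded, not acted upon.
func (w *worker) requestPreempt() {
	if w.locks > 0 {
		w.preemptPending = true
		return
	}
	w.preempted++
}

func main() {
	w := &worker{}
	w.acquire()
	w.acquire()
	w.requestPreempt()
	w.release()
	fmt.Println(w.preempted) // still 0: one acquire is outstanding
	w.release()
	fmt.Println(w.preempted) // 1: the deferred request took effect
}
```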
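Next, allocm is called to create the m object and perform some initialization, mainly allocating memory for g0 and gsignal. g0 was mentioned in the previous article on g: every m has one bound to it, and it mainly runs system work; tasks such as goroutine scheduling execute on g0. gsignal is the stack allocated for signal handling. The m is then added to the global list (allm). The allocm code is not reproduced here; interested readers can look it up themselves.
The m created by allocm is then run by calling newm1. Ignoring the cgo part, newm1 calls newosproc to run the m.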
func newm1(mp *m) {
	if iscgo {
		...
	}
	execLock.rlock() // Prevent process clone.
	newosproc(mp)
	execLock.runlock()
}
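newosproc calls some genuinely low-level routines; after the preparation work (omitted), it calls pthread_create to create the OS thread. The entry point of the OS thread is mstart_stub, which leads to mstart, and the newly created m is passed in as the argument. This is where the OS thread becomes associated with the m. The excerpt below is the pthread-based variant of newosproc; other platforms create the thread in other ways (Linux, for example, uses clone).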
// glue code to call mstart from pthread_create.
func mstart_stub()
// May run with m.p==nil, so write barriers are not allowed.
//
//go:nowritebarrierrec
func newosproc(mp *m) {
	// Preparation work omitted.
	...
	// Finally, create the thread. It starts at mstart_stub, which does some low-level
	// setup and then calls mstart.
	var oset sigset
	sigprocmask(_SIG_SETMASK, &sigset_all, &oset)
	err = pthread_create(&attr, abi.FuncPCABI0(mstart_stub), unsafe.Pointer(mp))
	sigprocmask(_SIG_SETMASK, &oset, nil)
	if err != 0 {
		write(2, unsafe.Pointer(&failthreadcreate[0]), int32(len(failthreadcreate)))
		exit(1)
	}
}
mstart
newm is the entry point for creating an m, and mstart is the entry point for its execution. mstart is implemented in assembly and calls mstart0.
// mstart is the entry-point for new Ms.
// It is written in assembly, uses ABI0, is marked TOPFRAME, and calls mstart0.
func mstart()
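mstart0 initializes the stack-related fields, namely the stackguard0 field mentioned in the goroutine article. Here getg() should return the g0 of the corresponding m. mstart0 then calls mstart1 and finally mexit. Note that mstart1 does not return in the normal way (explained in detail below), so there is no need to worry that mexit runs right away.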
func mstart0() {
	_g_ := getg()

	osStack := _g_.stack.lo == 0
	if osStack {
		...
	}
	// Initialize stack guard so that we can start calling regular
	// Go code.
	_g_.stackguard0 = _g_.stack.lo + _StackGuard
	// This is the g0, so we can also call go:systemstack
	// functions, which check stackguard1.
	_g_.stackguard1 = _g_.stackguard0
	mstart1()

	// Exit this thread.
	if mStackIsSystemAllocated() {
		// Windows, Solaris, illumos, Darwin, AIX and Plan 9 always system-allocate
		// the stack, but put it in _g_.stack before mstart,
		// so the logic above hasn't set osStack yet.
		osStack = true
	}
	mexit(osStack)
}
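mstart1 is guaranteed not to be inlined. This makes it possible to record the execution state (pc and sp) at the point where mstart0 calls mstart1 and save it in g0.sched. A later gogo(&g0.sched) then returns to that point in mstart0 and continues from there, and what follows is mexit. This guarantees that mexit is executed when the m exits.
mstart1 first calls fn and then schedule. As mentioned in the article on g, schedule never returns, which is also the reason mstart1 does not return, as noted above. From this point on, m is truly in the loop of finding a ready g and executing it, and with that it keeps moving among the running, spinning and idle states of the state diagram.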
// The go:noinline is to guarantee the getcallerpc/getcallersp below are safe,
// so that we can set up g0.sched to return to the call of mstart1 above.
//
//go:noinline
func mstart1() {
	_g_ := getg()

	if _g_ != _g_.m.g0 {
		throw("bad runtime·mstart")
	}

	// Set up m.g0.sched as a label returning to just
	// after the mstart1 call in mstart0 above, for use by goexit0 and mcall.
	// We're never coming back to mstart1 after we call schedule,
	// so other calls can reuse the current frame.
	// And goexit0 does a gogo that needs to return from mstart1
	// and let mstart0 exit the thread.
	_g_.sched.g = guintptr(unsafe.Pointer(_g_))
	_g_.sched.pc = getcallerpc()
	_g_.sched.sp = getcallersp()

	asminit()
	minit()

	// Install signal handlers; after minit so that minit can
	// prepare the thread to be able to handle the signals.
	if _g_.m == &m0 {
		mstartm0()
	}

	if fn := _g_.m.mstartfn; fn != nil {
		fn()
	}

	if _g_.m != &m0 {
		acquirep(_g_.m.nextp.ptr())
		_g_.m.nextp = 0
	}
	schedule()
}
mexit
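mexit mainly releases resources: it frees the allocated stack memory, removes the m from the global list, and releases and hands off the p it holds, and then it exits the OS thread. I won't go into more detail here, and the code is not reproduced either; it is located at src/runtime/proc.go, line 1471.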
startm
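newm is the only entry point for creating an m, but in practice most code that needs an m calls startm. The only difference between startm and newm is that startm first looks in the global idle list and calls newm to create an m only if none is found there. If it does find one, it takes that idle m and wakes it up.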
//go:nowritebarrierrec
func startm(_p_ *p, spinning bool) {
	mp := acquirem()
	lock(&sched.lock)
	if _p_ == nil {
		_p_, _ = pidleget(0)
		if _p_ == nil {
			unlock(&sched.lock)
			if spinning {
				// The caller incremented nmspinning, but there are no idle Ps,
				// so it's okay to just undo the increment and give up.
				if int32(atomic.Xadd(&sched.nmspinning, -1)) < 0 {
					throw("startm: negative nmspinning")
				}
			}
			releasem(mp)
			return
		}
	}
	nmp := mget()
	if nmp == nil {
		// No M is available, we must drop sched.lock and call newm.
		// However, we already own a P to assign to the M.
		//
		// Once sched.lock is released, another G (e.g., in a syscall),
		// could find no idle P while checkdead finds a runnable G but
		// no running M's because this new M hasn't started yet, thus
		// throwing in an apparent deadlock.
		//
		// Avoid this situation by pre-allocating the ID for the new M,
		// thus marking it as 'running' before we drop sched.lock. This
		// new M will eventually run the scheduler to execute any
		// queued G's.
		id := mReserveID()
		unlock(&sched.lock)

		var fn func()
		if spinning {
			// The caller incremented nmspinning, so set m.spinning in the new M.
			fn = mspinning
		}
		newm(fn, _p_, id)
		// Ownership transfer of _p_ committed by start in newm.
		// Preemption is now safe.
		releasem(mp)
		return
	}
	unlock(&sched.lock)
	if nmp.spinning {
		throw("startm: m is spinning")
	}
	if nmp.nextp != 0 {
		throw("startm: m has p")
	}
	if spinning && !runqempty(_p_) {
		throw("startm: p has runnable gs")
	}
	// The caller incremented nmspinning, so set m.spinning in the new M.
	nmp.spinning = spinning
	nmp.nextp.set(_p_)
	notewakeup(&nmp.park)
	// Ownership transfer of _p_ committed by wakeup. Preemption is now
	// safe.
	releasem(mp)
}
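The "reuse an idle worker, otherwise create one" shape of startm can be sketched outside the runtime. The pool and worker types, and the parked channel used to make the demo deterministic, are all assumptions of this sketch; workers here are goroutines rather than OS threads, and no P is involved.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for m: a long-lived loop that can be parked.
type worker struct {
	wake chan func() // an idle worker blocks receiving from this channel
}

// pool plays the role of the scheduler state guarded by sched.lock.
type pool struct {
	mu      sync.Mutex
	idle    []*worker     // analogue of schedt.midle
	created int           // how many workers were ever created
	parked  chan struct{} // signalled each time a worker goes idle (demo only)
}

func newPool() *pool { return &pool{parked: make(chan struct{}, 16)} }

// start mirrors the shape of startm: take an idle worker if there is one,
// otherwise create a new one (the newm path).
func (p *pool) start(job func()) {
	p.mu.Lock()
	if n := len(p.idle); n > 0 {
		w := p.idle[n-1]
		p.idle = p.idle[:n-1]
		p.mu.Unlock()
		w.wake <- job // wake the parked worker and hand it the job
		return
	}
	p.created++
	p.mu.Unlock()
	w := &worker{wake: make(chan func())}
	go p.loop(w, job)
}

// loop runs a job, puts the worker on the idle list, and waits to be woken.
func (p *pool) loop(w *worker, job func()) {
	for {
		job()
		p.mu.Lock()
		p.idle = append(p.idle, w)
		p.mu.Unlock()
		p.parked <- struct{}{}
		job = <-w.wake
	}
}

func main() {
	p := newPool()
	p.start(func() {}) // no idle worker yet: one is created
	<-p.parked         // wait until that worker is idle again
	p.start(func() {}) // the idle worker is reused
	<-p.parked
	fmt.Println(p.created) // 1
}
```

Creating a worker only when the idle list is empty keeps the number of live workers close to what the load actually needs.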
stopm
stopm parks an m, and its body is fairly simple. It puts the m on the global idle list and then calls mPark. mPark is a blocking operation: it blocks on the note (m.park) until woken, after which the m acquires a P and continues executing.
// Stops execution of the current m until new work is available.
// Returns with acquired P.
func stopm() {
	_g_ := getg()

	if _g_.m.locks != 0 {
		throw("stopm holding locks")
	}
	if _g_.m.p != 0 {
		throw("stopm holding p")
	}
	if _g_.m.spinning {
		throw("stopm spinning")
	}

	lock(&sched.lock)
	mput(_g_.m)
	unlock(&sched.lock)
	mPark()
	acquirep(_g_.m.nextp.ptr())
	_g_.m.nextp = 0
}
// mPark causes a thread to park itself, returning once woken.
//
//go:nosplit
func mPark() {
	gp := getg()
	notesleep(&gp.m.park)
	noteclear(&gp.m.park)
}
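The park/wake handshake between stopm and startm can be imitated with a one-shot signal. The note type below is a hypothetical channel-based imitation of the sleep, wakeup and clear steps, not the runtime's implementation; the next field stands in for m.nextp and carries a string instead of a P.

```go
package main

import (
	"fmt"
	"sync"
)

// note is a hypothetical one-shot wakeup signal, loosely modelled on how the
// code above uses notesleep, notewakeup and noteclear.
type note struct {
	mu sync.Mutex
	ch chan struct{}
}

func newNote() *note { return &note{ch: make(chan struct{})} }

func (n *note) current() chan struct{} {
	n.mu.Lock()
	defer n.mu.Unlock()
	return n.ch
}

// sleep blocks until wakeup has been called; it returns immediately if the
// wakeup already happened.
func (n *note) sleep() { <-n.current() }

// wakeup releases the sleeper. In this sketch a second wakeup without an
// intervening clear panics, because the channel is already closed.
func (n *note) wakeup() { close(n.current()) }

// clear re-arms the note so that it can be slept on again.
func (n *note) clear() {
	n.mu.Lock()
	n.ch = make(chan struct{})
	n.mu.Unlock()
}

// worker is a hypothetical stand-in for m.
type worker struct {
	park *note
	next string // stands in for m.nextp
}

// stop parks the worker until it is woken, then takes what was handed over.
func (w *worker) stop() string {
	w.park.sleep()
	w.park.clear()
	p := w.next
	w.next = ""
	return p
}

func main() {
	w := &worker{park: newNote()}
	got := make(chan string)
	go func() { got <- w.stop() }()

	// The waker sets next before waking, as startm sets nextp before notewakeup.
	w.next = "P0"
	w.park.wakeup()
	fmt.Println(<-got) // P0
}
```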
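The m object
The full code of the struct behind m is not reproduced here; only some of its fields are picked out for introduction. Fields that come up later will be added as needed.
Closing remarks
This article, once again, focuses only on m itself. For the same reason as before, it is hard to cover everything while leaving G and P aside. Even so, after reading it you should have a fundamental understanding of m. An m is a worker that is associated with an OS thread. The number of active m's is kept within a certain range to avoid the unnecessary cost of excessive switching. Under different conditions, m moves among the running, spinning and idle states. Different queues and some synchronization mechanisms are used to manage m in user space. There may be one more article with additional material on M before the introduction to P begins.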