Background:
For a process that cannot keep all CPU cores busy, on-CPU analysis alone tells you very little, because the program may spend most of its time blocked.
Example program:
Using CentOS 8 and Go 1.23.3, test the following program:
pprof_netio.go
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
	//"time"
)

func main() {
	// Expose the pprof endpoints on port 9091.
	go func() {
		_ = http.ListenAndServe("0.0.0.0:9091", nil)
	}()

	// Concurrency limit: the buffered channel acts as a semaphore.
	var ConChan = make(chan bool, 100)
	for {
		ConChan <- true
		go func() {
			defer func() {
				<-ConChan
			}()
			doNetIO()
		}()
	}
}

func doNetIO() {
	//fmt.Printf("doNetIO start: %s\n", time.Now().Format(time.DateTime))
	for i := 0; i < 10; i++ {
		resp, err := http.Get("http://127.0.0.1:8080/echo_delay")
		if err != nil {
			fmt.Printf("i:%d err: %v\n", i, err)
			return
		}
		// Close the body so the connection and its file descriptor are released.
		resp.Body.Close()
	}
	//fmt.Printf("doNetIO end: %s\n", time.Now().Format(time.DateTime))
}
The test requests go to nginx, which is configured as follows:
agent-8080.conf
server {
    listen 8080 reuseport;
    index index.html index.htm index.php;
    root /usr/share/nginx/html;

    access_log /var/log/nginx/access-8080.log main;
    error_log /var/log/nginx/access-8080.log error;

    location ~ /echo_delay {
        limit_rate 30;
        return 200 '{"code":"0","message":"ok","data":"012345678901234567890123456789"}';
    }
    location ~ /*.mp3 {
        root /usr/share/nginx/html;
        limit_rate 10k;
    }
    location ~ /* {
        return 200 '{}';
    }
}
Build and run the program:
go build pprof_netio.go
./pprof_netio
Checking with top shows that CPU utilization is very low.
Use the pprof profile endpoint to view on-CPU time:
go tool pprof -http=192.168.36.5:9000 http://127.0.0.1:9091/debug/pprof/profile
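The default sampling window is 30s, yet on-CPU time is only 690ms. More precisely, only 69 samples were captured in those 30s; with one sample every 10ms, pprof extrapolates 690ms of on-CPU time. Either way, CPU utilization is very low.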
Use perf to view off-CPU time:
List the scheduler events that perf supports:
On CentOS 8, install the dependencies:
yum install kernel-debug kernel-debug-devel --nogpgcheck
echo 1 > /proc/sys/kernel/sched_schedstats
Script that generates an off-CPU flame graph with perf:
perf-offcpu.sh
#!/bin/sh
if [ "$1" = "" ]; then
    echo "usage: $0 prog_name"
    exit 1
fi
pid=`ps aux | grep $1 | grep -v 'grep' | grep -v 'perf-offcpu' | awk '{print $2}'`
echo prog_name:$1
echo pid:$pid
perf record -e sched:sched_stat_sleep -e sched:sched_switch \
    -e sched:sched_stat_iowait -e sched:sched_process_exit \
    -e sched:sched_stat_blocked -e sched:sched_stat_wait \
    -g -o perf.data.raw -p $pid -- sleep 30
perf inject -v -s -i perf.data.raw -o perf.data
perf script -F comm,pid,tid,cpu,time,period,event,ip,sym,dso,trace | awk '
    NF > 4 { exec = $1; period_ms = int($5 / 1000000) }
    NF > 1 && NF <= 4 && period_ms > 0 { print $2 }
    NF < 2 && period_ms > 0 { printf "%s\n%d\n\n", exec, period_ms }' | \
    stackcollapse.pl | \
    flamegraph.pl --countname=ms --title="Off-CPU Time Flame Graph" --colors=io > offcpu.svg
Run the sampling:
sh perf-offcpu.sh 'pprof_netio'
Off-CPU flame graph from perf:
It shows that 65% of the blocked time is spent waiting on network connection setup, sends, and reads.
Use bcc/tools/offcputime to view off-CPU time:
Install bcc-tools on CentOS 8:
yum install bcc-tools --nogpgcheck
Script that generates an off-CPU flame graph with bcc:
bcc-offcputime.sh
#!/bin/sh
if [ "$1" = "" ]; then
    echo "usage: $0 prog_name"
    exit 1
fi
pid=`ps aux | grep $1 | grep -v 'grep' | grep -v 'bcc-offcputime' | awk '{print $2}'`
echo prog_name:$1
echo pid:$pid
/usr/share/bcc/tools/offcputime -df -p $pid 30 > out.stacks
flamegraph.pl --colors=io --title="bcc Off-CPU Time Flame Graph" --countname=us < out.stacks > offcpu-bcc.svg
Run the sampling:
sh bcc-offcputime.sh 'pprof_netio'
Off-CPU flame graph from bcc:
It shows that 67% of the blocked time is spent waiting on network connection setup, sends, and reads.
Use fgprof to analyze off-CPU time in a Go program by instrumenting the code:
Modify the code to add fgprof support:
pprof_netio.go
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
	//"time"

	"github.com/felixge/fgprof"
)

func main() {
	// fgprof support: register its handler next to the pprof endpoints.
	http.DefaultServeMux.Handle("/debug/fgprof", fgprof.Handler())
	go func() {
		_ = http.ListenAndServe("0.0.0.0:9091", nil)
	}()

	// Concurrency limit: the buffered channel acts as a semaphore.
	var ConChan = make(chan bool, 100)
	for {
		ConChan <- true
		go func() {
			defer func() {
				<-ConChan
			}()
			doNetIO()
		}()
	}
}

func doNetIO() {
	//fmt.Printf("doNetIO start: %s\n", time.Now().Format(time.DateTime))
	for i := 0; i < 10; i++ {
		resp, err := http.Get("http://127.0.0.1:8080/echo_delay")
		if err != nil {
			fmt.Printf("i:%d err: %v\n", i, err)
			return
		}
		// Close the body so the connection and its file descriptor are released.
		resp.Body.Close()
	}
	//fmt.Printf("doNetIO end: %s\n", time.Now().Format(time.DateTime))
}
Sample with fgprof:
go tool pprof --http=192.168.36.5:9000 'http://localhost:9091/debug/fgprof?seconds=30'
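Off-CPU flame graph from fgprof:
The graph is enough to roughly locate the blocking in network reads and writes, but the sampling scope and frequency seem to fall short of perf and bcc, and according to what I have read it does not support sampling cgo programs.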
References:
Off-CPU Flame Graphs
Linux perf_events Off-CPU Time Flame Graph
fgprof package - github.com/felixge/fgprof - Go Packages