pprof and trace Study Notes - EW帮帮网

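This post comes from my ongoing study of Go. It contains two demos: one uses pprof to investigate a goroutine blocking problem, and the other is a minimal trace demo.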

pprof

Demo: analyze the problem, then fix it

package main

import (
	"flag"
	"fmt"
	"os"
	"runtime/pprof"
	"time"
)

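func logicCode() {
	var c chan int // nil channel: it is never initialized with make
	for {
		select {
		case <-c: // a receive from a nil channel can never succeed
			fmt.Printf("recv from chan,value:\n")
		default:
			// With the sleep commented out, this empty branch makes the loop spin.
			//time.Sleep(time.Millisecond * 500)
		}
	}
}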

func main() {
	var isCPUProf bool
	var isMemPprof bool

	flag.BoolVar(&isCPUProf, "cpu", false, "turn cpu pprof on")
	flag.BoolVar(&isMemPprof, "mem", false, "turn mem pprof on")
	flag.Parse()

	if isCPUProf {
		file, err := os.Create("./cpu.prof")
		if err != nil {
			fmt.Printf("create cpu profile err:%v\n", err)
			return
		}
		if err := pprof.StartCPUProfile(file); err != nil {
			fmt.Printf("start cpu profile err:%v\n", err)
			file.Close()
			return
		}
		defer func() {
			pprof.StopCPUProfile()
			file.Close()
		}()
	}
	for i := 0; i < 6; i++ {
		go logicCode()
	}
	time.Sleep(20 * time.Second)
	if isMemPprof {
		f2, err := os.Create("./mem.prof")
		if err != nil {
			fmt.Printf("create mem profile err:%v\n", err)
			return
		}
		if err := pprof.WriteHeapProfile(f2); err != nil {
			fmt.Printf("write mem profile err:%v\n", err)
		}
		f2.Close()
	}
}

Run it from the command line:

go run main.go -cpu=true
go tool pprof .\cpu.prof

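Then type top 3 at the pprof prompt to list the three entries that consume the most CPU time. logicCode accounts for a disproportionate share of the run time. Looking at the code explains why: c is a nil channel, so the receive can never succeed, and with the sleep commented out the select always falls through to the empty default branch, so each of the six goroutines spins in a busy loop. After simplifying this code (for example, by restoring the sleep in the default branch), the profile looks much better.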

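The columns in the top output mean the following. For the CPU profile in this demo the values are time; for a heap profile such as mem.prof the same columns report allocated memory.
flat: time spent (or memory allocated) in the function itself, excluding the functions it calls
flat%: flat as a percentage of the profile total
sum%: running total of flat% for this row and all rows above it
cum: time spent (or memory allocated) in the function plus everything it calls
cum%: cum as a percentage of the profile total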

trace

demo

package main

import (
	"fmt"
	"os"
	"runtime/trace"
)

func main() {
	f, err := os.Create("trace.out")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer f.Close()
	if err := trace.Start(f); err != nil {
		fmt.Println(err)
		return
	}
	defer trace.Stop()
	// program logic
	for i := 0; i < 10; i++ {
		fmt.Println(Add(i, i+1))
	}
}

func Add(i, j int) int {
	return i + j
}

Run the command:

go tool trace trace.out

The tool starts a local web server and opens the trace page in the browser. The page offers the following views.

2.1 View trace

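View trace shows the complete trace file on a timeline, so you can follow the program's execution path and process. It typically includes goroutine lifecycles, events such as memory allocation and garbage collection, and scheduling, waiting, and blocking states.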

2.2 Goroutine analysis

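This view shows the activity and state of every goroutine. Go's concurrency model is built on goroutines (lightweight threads), and with goroutine analysis you can see what each goroutine is doing, check whether goroutines are blocked, waiting, or being created in excessive numbers, and observe how they are scheduled, started, and terminated.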

2.3 Network blocking profile

This view analyzes blocking caused by network operations, such as HTTP requests or database connections. It shows where goroutines were delayed or stuck waiting on the network.

2.4 Synchronization blocking profile

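Synchronization blocking profile analyzes blocking caused by synchronization operations such as locks, semaphores, and condition variables. These operations usually exist to protect shared resources and prevent data races.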

2.5 Syscall blocking profile

Syscall blocking profile analyzes blocking that occurs while the program is in system calls, such as file operations, network I/O, and memory management.

2.6 Scheduler latency profile

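Scheduler latency profile analyzes scheduler delays in the Go program, that is, how long goroutines that are ready to run wait before the scheduler actually runs them.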

2.7 User-defined tasks

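User-defined tasks are custom tasks that the programmer defines through the runtime/trace package. A task is created with NewTask and ended with its End method, and messages can be attached to it with Log and Logf.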

2.8 User-defined regions

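User-defined regions are code regions marked by the programmer, usually with the WithRegion function (or StartRegion and End). Like a custom task, a region marks a specific block of code so that its execution time and performance can be analyzed.
User-defined regions let developers analyze execution at a fine granularity and focus on the parts of the program that take the most time.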

2.9 Minimum mutator utilization

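Minimum mutator utilization (MMU) is a metric related to garbage collection (GC) in Go programs. The mutator is the part of the program that runs user code, and it may be paused or slowed down while the GC runs. For a given time window size, the MMU is the smallest fraction of CPU time that was available to the mutator in any window of that size during the run. A low MMU means the GC took too much CPU time in some window, which hurts application performance.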
