pprof in practice: a memory leak

Scenario

Several services run on a single server. After running for a while (less than a day), one process was found to be using more than 50% of the server's memory, causing problems for the whole server.

Troubleshooting

First, enable pprof profiling in the code:

runtime.SetBlockProfileRate(1) // record every blocking event for the block profile
go func() {
	log.Println(http.ListenAndServe(":9080", nil))
}()
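
For completeness, here is a minimal sketch of how the snippet above fits into a full program. The handlers under /debug/pprof/ are registered by importing net/http/pprof for its side effects; everything else in main is a placeholder:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
	"runtime"
)

func main() {
	// SetBlockProfileRate(1) records every blocking event; a larger value
	// samples fewer events and costs less in production.
	runtime.SetBlockProfileRate(1)

	// Serve the pprof endpoints on their own port so they stay separate
	// from the application's regular HTTP traffic.
	go func() {
		log.Println(http.ListenAndServe(":9080", nil))
	}()

	select {} // placeholder for the actual service logic
}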

Here the profiles are viewed through the web interface; the listener above is on port 9080.
Open a browser and visit http://ip:9080/debug/pprof/ which shows the index page:

/debug/pprof/

Types of profiles available:
Count	Profile
4320	allocs
265	block
0	cmdline
110	goroutine
4320	heap
0	mutex
0	profile
48	threadcreate
0	trace
full	goroutine stack dump
Profile Descriptions:

allocs: A sampling of all past memory allocations
block: Stack traces that led to blocking on synchronization primitives
cmdline: The command line invocation of the current program
goroutine: Stack traces of all current goroutines
heap: A sampling of memory allocations of live objects. You can specify the gc GET parameter to run GC before taking the heap sample.
mutex: Stack traces of holders of contended mutexes
profile: CPU profile. You can specify the duration in the seconds GET parameter. After you get the profile file, use the go tool pprof command to investigate the profile.
threadcreate: Stack traces that led to the creation of new OS threads
trace: A trace of execution of the current program. You can specify the duration in the seconds GET parameter. After you get the trace file, use the go tool trace command to investigate the trace.
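
Besides the browser, the same endpoints can be pulled into an interactive session with go tool pprof; for example (the host and port follow the listener set up above):

	go tool pprof http://ip:9080/debug/pprof/heap
	go tool pprof http://ip:9080/debug/pprof/goroutine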

When it comes to memory leaks, the first instinct is to check the heap profile. Viewing it gives output like the following:

heap profile: 718: 76326576 [829493: 1114177328] @ heap/1048576
1: 75759616 [1: 75759616] @ 0x40f2f9 0x410dc9 0x4143aa 0x685cee 0x686045 0x6ad612 0x6ad345 0xab8dca 0xab8da3 0x9cf4fc 0x99a1e9 0x9cf6bb 0x667b4c 0x99bd85 0x9b9eb3 0x7dbf03 0x7d742d 0x471801
#	0x685ced	github.com/gogf/gf/container/gmap.(*StrAnyMap).doSetWithLockCheck+0x14d	C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/container/gmap/gmap_hash_str_any_map.go:220
#	0x686044	github.com/gogf/gf/container/gmap.(*StrAnyMap).GetOrSetFuncLock+0xa4	C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/container/gmap/gmap_hash_str_any_map.go:254
#	0x6ad611	github.com/gogf/gf/os/gmlock.(*Locker).getOrNewMutex+0x51		C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/os/gmlock/gmlock_locker.go:130
#	0x6ad344	github.com/gogf/gf/os/gmlock.(*Locker).Lock+0x44			C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/os/gmlock/gmlock_locker.go:33
#	0xab8dc9	github.com/gogf/gf/os/gmlock.Lock+0x3e9					C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/os/gmlock/gmlock.go:19
#	0xab8da2	avrtc/app/system/admin/internal/api.(*wssApi).Init.func1+0x3c2		C:/Users/DELL/Desktop/signal/app/system/admin/internal/api/websocket.go:200
#	0x9cf4fb	github.com/gogf/gf/net/ghttp.(*middleware).Next.func1.8+0x3b		C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/net/ghttp/ghttp_request_middleware.go:111
#	0x99a1e8	github.com/gogf/gf/net/ghttp.niceCallFunc+0x48				C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/net/ghttp/ghttp_func.go:46
#	0x9cf6ba	github.com/gogf/gf/net/ghttp.(*middleware).Next.func1+0x13a		C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/net/ghttp/ghttp_request_middleware.go:110
#	0x667b4b	github.com/gogf/gf/util/gutil.TryCatch+0x6b				C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/util/gutil/gutil.go:46
#	0x99bd84	github.com/gogf/gf/net/ghttp.(*middleware).Next+0x164			C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/net/ghttp/ghttp_request_middleware.go:47
#	0x9b9eb2	github.com/gogf/gf/net/ghttp.(*Server).ServeHTTP+0x652			C:/Users/DELL/go/pkg/mod/github.com/gogf/gf@v1.15.4/net/ghttp/ghttp_server_handler.go:122
#	0x7dbf02	net/http.serverHandler.ServeHTTP+0xa2					C:/Program Files/Go/src/net/http/server.go:2887
#	0x7d742c	net/http.(*conn).serve+0x8cc						C:/Program Files/Go/src/net/http/server.go:1952

In this text format, the first pair of numbers in a record is the count and bytes of live (in-use) objects for that call stack, and the bracketed pair is the cumulative count and bytes ever allocated there. The record above does account for roughly 75 MB of live memory, but almost every frame in it is at the framework or runtime level, which is of little help in locating the leak. To explain why, I will borrow an expert's summary of how pprof treats the heap:

  • Memory profiling records heap allocations and their call stacks, not the complete memory of the process; presumably that is why go pprof calls it heap rather than memory.
  • Stack allocations are released when the call returns, so they do not appear in the memory profile.
  • Memory profiling is based on sampling. By default roughly one sample is recorded per 512 KB of heap allocation (controlled by runtime.MemProfileRate), not one per allocation.
  • Because memory profiling is sampled, and because it tracks allocated memory rather than memory actually in use (for example, memory that has been allocated and appears to be used but is not, such as the leaked portion), it cannot be used to measure the program's overall memory usage; a sketch of reading overall usage separately follows this list.
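
As a side note (not part of the original troubleshooting), the overall usage that the sampled profile cannot show can be read straight from the runtime. A minimal sketch using runtime.ReadMemStats:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// HeapAlloc is the live heap in bytes; Sys is the total memory the Go
	// runtime has obtained from the OS; NumGC is the number of completed GCs.
	fmt.Printf("HeapAlloc=%d MiB, Sys=%d MiB, NumGC=%d\n",
		m.HeapAlloc/1024/1024, m.Sys/1024/1024, m.NumGC)
}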

In real-world scenarios, memory leaks are more often caused by mismanaged goroutines, so the next step is to look at the goroutine profile:

goroutine profile: total 13647
11274 @ 0x43abc5 0x4068cf 0x40650b 0xab87b5 0x471801
#	0xab87b4	avrtc/app/system/admin/internal/api.notifyFxn+0x54	C:/Users/DELL/Desktop/signal/app/system/admin/internal/api/websocket.go:156

To read this output: the total at the top is the overall number of goroutines, and the number at the start of each entry is how many goroutines share that call stack. The culprit is immediately visible:

websocket.go:156, that is, 11274 goroutines are parked at line 156 of that file. Even counting only the minimum stack of about 2 KB per goroutine, that is more than 20 MB in stacks alone, plus whatever memory each goroutine keeps reachable, which is not a number that can be ignored.

Result

With the line number located, the rest is easy. Here is the offending function:

func notifyFxn(ws *ghttp.WebSocket, c chan string, lock string, done *bool) {
	for !*done {
		// websocket.go:156: this receive blocks forever when nothing is
		// ever sent on c, so the goroutine can never observe *done and exit.
		notify := <-c
		glog.Line().Debugf("Notify %s\n", notify)
		gmlock.Lock(lock)
		ws.WriteMessage(2, []byte(notify))
		gmlock.Unlock(lock)
	}
}

Line 156 is the channel receive notify := <-c: when nothing is ever sent on c, the goroutine blocks there forever and never exits. That completes the diagnosis.
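
As for the remedy, one common pattern (a sketch only; the done channel and the notifyLoop name are illustrative and not from the original code) is to receive inside a select so the goroutine can always exit, even when no notification ever arrives:

import (
	"github.com/gogf/gf/net/ghttp"
	"github.com/gogf/gf/os/glog"
	"github.com/gogf/gf/os/gmlock"
)

// notifyLoop is a hypothetical reworking of notifyFxn: a done channel replaces
// the *bool flag, so the goroutine stops waiting as soon as the channel is
// closed instead of blocking forever on the receive.
func notifyLoop(ws *ghttp.WebSocket, c chan string, lock string, done chan struct{}) {
	for {
		select {
		case notify := <-c:
			glog.Line().Debugf("Notify %s\n", notify)
			gmlock.Lock(lock)
			ws.WriteMessage(2, []byte(notify))
			gmlock.Unlock(lock)
		case <-done:
			return // shutdown requested: release the goroutine
		}
	}
}

The caller would then close(done) when the websocket connection ends, so every goroutine started this way is guaranteed to terminate.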

This is a small hands-on pprof case; more real-world scenarios will be added later.
