Debugging memory leaks with jemalloc

When a program's memory use keeps rising unexpectedly, a memory leak may be the cause. For example, a pointer to an object created with new is reassigned without calling delete, so the previously allocated block is orphaned: the application can no longer access or release it. In C++, STL containers have a clear() method, and their elements are cleaned up according to the RAII principle. However, containers, and strings as well, may keep accumulating data, allocating new memory blocks to hold the growing contents.

I came across a discussion in the cppzh group about using jemalloc to debug memory usage. It can dump memory usage clearly, so I tried it.

Install

# Needed by jeprof to generate PDF output
yum -y install graphviz ghostscript

wget https://github.com/jemalloc/jemalloc/archive/5.1.0.tar.gz
tar zxvf 5.1.0.tar.gz
cd jemalloc-5.1.0/
./autogen.sh
./configure --prefix=/usr/local/jemalloc-5.1.0 --enable-prof
make -j
make install

Test case: leak check at program exit

# run
MALLOC_CONF=prof_leak:true,lg_prof_sample:0,prof_final:true LD_PRELOAD=/usr/local/jemalloc-5.1.0/lib/libjemalloc.so.2  ./a.out

# View memory usage
/usr/local/jemalloc-5.1.0/bin/jeprof a.out jeprof.34447.0.f.heap
(jeprof) top

Test case: long-running program

A long-running program, such as a server, usually never exits, so a report at exit is of no use. For this case jemalloc can dump a heap profile each time allocation activity grows by a specified number of bytes.

The following example simulates a long-running program. It exercises a sequence container (vector), an associative container (map), a string, and plain new. Each pass performs 1000 operations and then sleeps for 100 microseconds, standing in for a server's workload.

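#include <cstdlib>   // rand()
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <chrono>
#include <thread>

int main() {

    // None of these are ever cleared, so they grow without bound.
    std::vector<int> vec;
    std::map<int, int> mp;
    std::string s;
    for (;;) {
        for (int i = 0; i < 1000; ++i) {
            vec.push_back(i);   // sequence container
            mp[rand()] = i;     // associative container
            s += "xxxx";        // growing string
            new char[4];        // deliberate leak: the pointer is discarded
        }
        // Short pause between passes.
        std::this_thread::sleep_for(std::chrono::microseconds(100));
    }

    return 0;
}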

Compile:

g++ test.cc -o a.out

Set the MALLOC_CONF environment variable to prof:true,lg_prof_interval:26 so that jemalloc enables profiling and dumps a profile for every 2^26 bytes (64 MiB) of allocation activity, and use the LD_PRELOAD environment variable to load jemalloc in place of the default allocator.

export MALLOC_CONF="prof:true,lg_prof_interval:26"
LD_PRELOAD=/usr/local/jemalloc-5.1.0/lib/libjemalloc.so.2  ./a.out

[root@pwh c++]# ls -l -t
total 212
-rw-r--r-- 1 root root  5208 Dec 19 14:31 jeprof.17988.17.i17.heap
-rw-r--r-- 1 root root  5206 Dec 19 14:31 jeprof.17988.16.i16.heap
-rw-r--r-- 1 root root  5204 Dec 19 14:31 jeprof.17988.15.i15.heap
-rw-r--r-- 1 root root  5204 Dec 19 14:31 jeprof.17988.14.i14.heap
-rw-r--r-- 1 root root  5204 Dec 19 14:31 jeprof.17988.13.i13.heap
-rw-r--r-- 1 root root  5204 Dec 19 14:31 jeprof.17988.12.i12.heap
-rw-r--r-- 1 root root  5204 Dec 19 14:31 jeprof.17988.11.i11.heap
-rw-r--r-- 1 root root  5200 Dec 19 14:31 jeprof.17988.10.i10.heap
-rw-r--r-- 1 root root  5200 Dec 19 14:31 jeprof.17988.9.i9.heap
-rw-r--r-- 1 root root  5200 Dec 19 14:31 jeprof.17988.8.i8.heap
-rw-r--r-- 1 root root  5198 Dec 19 14:31 jeprof.17988.7.i7.heap
-rw-r--r-- 1 root root  5198 Dec 19 14:31 jeprof.17988.6.i6.heap
...

Result analysis

Because a dump is written at every interval, each file is a snapshot of the heap at that point. Use --base to specify the heap file to use as the baseline; jeprof subtracts it from the profile being analyzed, so only the growth since the baseline is shown.

$ /usr/local/jemalloc-5.1.0/bin/jeprof a.out --base=jeprof.17988.0.i0.heap  jeprof.17988.17.i17.heap
Using local file a.out.
Argument "MSWin32" isn't numeric in numeric eq (==) at /usr/local/jemalloc-5.1.0/bin/jeprof line 5123.
Argument "linux" isn't numeric in numeric eq (==) at /usr/local/jemalloc-5.1.0/bin/jeprof line 5123.
Using local file jeprof.17988.17.i17.heap.
Welcome to jeprof!  For help, type 'help'.
(jeprof) top
Total: 1002.5 MB
   754.5  75.3%  75.3%    754.5  75.3% __gnu_cxx::new_allocator::allocate@4031fc
   124.0  12.4%  87.6%    124.0  12.4% __gnu_cxx::new_allocator::allocate@402fac
   124.0  12.4% 100.0%    124.0  12.4% std::__cxx11::basic_string::_M_mutate
     0.0   0.0% 100.0%   1002.5 100.0% __libc_start_main
     0.0   0.0% 100.0%   1002.5 100.0% _start
     0.0   0.0% 100.0%   1002.5 100.0% main
     0.0   0.0% 100.0%    754.5  75.3% std::_Rb_tree::_M_create_node
     0.0   0.0% 100.0%    754.5  75.3% std::_Rb_tree::_M_emplace_hint_unique
     0.0   0.0% 100.0%    754.5  75.3% std::_Rb_tree::_M_get_node
     0.0   0.0% 100.0%    124.0  12.4% std::_Vector_base::_M_allocate

# Export as pdf
/usr/local/jemalloc-5.1.0/bin/jeprof --pdf a.out  --base=jeprof.17988.0.i0.heap jeprof.17988.17.i17.heap   > a.pdf

Memory usage statistics

For another pair of snapshots exported to PDF, the total allocated is 718.5MB. Of that, 514.5MB is allocated in map's operator[], 60MB by string and 60MB by vector. The call stack of the plain new char[4] ends in main(), so main() itself accounts for 84MB. These figures add up to the reported total (718.5MB).

Added by kooks on Thu, 27 Jan 2022 16:41:46 +0200