gperftools in Practice

Date: 2022-10-31 12:33:40
Tags: MALLOC gperf MiB Bytes practice heap gperftools tools

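Memory debugging

Goals

  • Capture the call stacks of the process's memory allocations and show each stack's share of memory as a flame graph;
  • Obtain the real in-use memory figure, that is, excluding the memory cached by tcmalloc/ptmalloc.

Principle

Google's tcmalloc replaces glibc's ptmalloc and adds instrumentation hooks inside the allocation APIs.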
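The idea of putting a hook inside the allocation API can be illustrated with a small stand-alone sketch. This is not how tcmalloc is implemented: `AllocationTracker` and its string "site" labels are hypothetical, and the real heap profiler records a stack trace for each allocation rather than a label. The sketch only shows the bookkeeping that such a hook makes possible.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical helper (not part of gperftools): wraps malloc/free and keeps
// the number of live bytes per call site, which is the kind of data a heap
// profiler needs in order to attribute in-use memory to code locations.
class AllocationTracker {
 public:
  // Allocates `bytes` and charges them to `site` (a stand-in for a stack trace).
  void* Allocate(const std::string& site, std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p != nullptr) {
      live_[p] = Record{site, bytes};
      in_use_by_site_[site] += bytes;
    }
    return p;
  }

  // Frees `p` and removes its bytes from the site that allocated it.
  void Free(void* p) {
    auto it = live_.find(p);
    if (it != live_.end()) {
      in_use_by_site_[it->second.site] -= it->second.bytes;
      live_.erase(it);
    }
    std::free(p);
  }

  // Live (allocated and not yet freed) bytes charged to `site`.
  std::size_t InUse(const std::string& site) const {
    auto it = in_use_by_site_.find(site);
    return it == in_use_by_site_.end() ? 0 : it->second;
  }

 private:
  struct Record {
    std::string site;
    std::size_t bytes;
  };
  std::map<void*, Record> live_;
  std::map<std::string, std::size_t> in_use_by_site_;
};
```

A caller would route its allocations through `Allocate("some_site", n)` and later query `InUse("some_site")`. With tcmalloc none of this is written by hand, because the allocator itself carries the hooks.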

Practice

Dependencies

  • FlameGraph: https://github.com/brendangregg/FlameGraph.git
  • gperftools: https://github.com/gperftools/gperftools.git. In this article gperftools is installed as follows:
# gperftools install path
gperf_install_base_path="/var/.gperftools/release"

# Build and install
cd gperftools && ./autogen.sh && ./configure --prefix=${gperf_install_base_path}/ && make all -j2 && sudo make install

Test code

Save the following code as t_gperf_tools.cc

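 1 #include <malloc.h>  // malloc_stats()
 2 #include <stdint.h>  // uint64_t
 3 #include <stdio.h>
 4 #include <stdlib.h>
 5 #include <unistd.h>  // sleep(), usleep()
 6 #include <map>
 7 #include <memory>  // std::shared_ptr, std::make_shared
 8 #include <thread>
 9 #include <vector>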
10 #define GPERFTOOLS_EN (1)
11 #if GPERFTOOLS_EN == 1
12 #include "gperftools/heap-profiler.h"
13 #include "gperftools/malloc_extension.h"
14 #endif
15 
16 void MallocTestC() {
17   uint64_t *a;
18   while (1) {
19     a = (uint64_t *)calloc(1024 * 1024, sizeof(uint64_t));
20     sleep(1);
21     printf("Leak size %zu MB for %p\n", sizeof(uint64_t), a);
22   }
23 }
24 
25 void MallocTestCPP() {
26   static std::vector<std::map<uint64_t, std::shared_ptr<std::vector<uint64_t>>>>
27       map_vec;
28   while (1) {
29     if (map_vec.size() >= 40) {
30       malloc_stats();
31       map_vec.clear();
32       map_vec.shrink_to_fit();
33       // MallocExtension::instance()->ReleaseFreeMemory();
34       malloc_stats();
35     }
36 
37     std::map<uint64_t, std::shared_ptr<std::vector<uint64_t>>> map;
38     for (size_t i = 0U; i < 200; i++) {
39       std::shared_ptr<std::vector<uint64_t>> vec =
40           std::make_shared<std::vector<uint64_t>>();
41       //   vec->reserve(1024U);
42       vec->resize(1024U);
43       map.emplace(i, vec);
44     }
45     map_vec.emplace_back(map);
46     usleep(200000);
47   }
48 }
49 
50 int main(int argc, char **argv) {
51   (void)argc;
52   (void)argv;
53 
54 #if GPERFTOOLS_EN == 1
55   HeapProfilerStart("/tmp/t_gperf_tools_O0");
56 #endif
57 
58   std::thread thr{MallocTestCPP};
59   MallocTestC();
60   thr.join();
61 
62 #if GPERFTOOLS_EN == 1
63   HeapProfilerStop();
64 #endif
65   return 0;
66 }

 

Memory flame graph

To get the most complete call stacks, we compile with -O0

# Compile
g++ t_gperf_tools.cc -O0 -g -o t_gperf_tools -I/var/.gperftools/release/include -L/var/.gperftools/release/lib -ltcmalloc -lpthread

# Run, dumping a HeapProfiler file at 1-second intervals
LD_LIBRARY_PATH=/var/.gperftools/release/lib HEAP_PROFILE_TIME_INTERVAL="1" ./t_gperf_tools

Once it is running, the HeapProfiler files being written show up in the output

Starting tracking the heap
Dumping heap profile to /tmp/t_gperf_tools_O0.0001.heap (1667187833 sec since the last dump)
Dumping heap profile to /tmp/t_gperf_tools_O0.0002.heap (1 sec since the last dump)
Leak size 8 MB for 0x562deff26000
Dumping heap profile to /tmp/t_gperf_tools_O0.0003.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df0f1c000
Dumping heap profile to /tmp/t_gperf_tools_O0.0004.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df1f0a000
Dumping heap profile to /tmp/t_gperf_tools_O0.0005.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df2ef4000
Dumping heap profile to /tmp/t_gperf_tools_O0.0006.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df3d50000
Dumping heap profile to /tmp/t_gperf_tools_O0.0007.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df4d3c000
Dumping heap profile to /tmp/t_gperf_tools_O0.0008.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df5d26000
Dumping heap profile to /tmp/t_gperf_tools_O0.0009.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df6b80000

Generating the flame graph

# Convert a HeapProfiler file into collapsed stacks
/var/.gperftools/release/bin/pprof --collapsed ./t_gperf_tools /tmp/t_gperf_tools_O0.0009.heap > gperf.stacks

# Generate the flame graph
cat gperf.stacks | /home/user/disk/prjs/perf/mdc_perf/perf/FlameGraph/flamegraph.pl --color=mem --title="malloc() Flame Graph" --countname="calls" > gperf.svg

# Open the flame graph
google-chrome gperf.svg

Comparing the resulting flame graph against the code shows clearly where the memory is being consumed.
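The collapsed-stacks file that flamegraph.pl consumes is plain text: each line is a stack written as semicolon-separated frames, followed by a space and a numeric value. That makes it easy to check the flame graph numerically. The sketch below sums the values of every stack that passes through a given function. `SumStacksThrough` is a hypothetical helper, and the exact frame names pprof emits may differ from the bare function names assumed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Sums the value of every collapsed-stack line ("a;b;c <value>") that
// contains `frame` as one of its semicolon-separated frames.
// Lines without a trailing numeric value are skipped.
uint64_t SumStacksThrough(const std::string& folded, const std::string& frame) {
  uint64_t total = 0;
  std::istringstream lines(folded);
  std::string line;
  while (std::getline(lines, line)) {
    // The value follows the last space; frame names may contain spaces.
    const std::size_t space = line.rfind(' ');
    if (space == std::string::npos) continue;
    const std::string value = line.substr(space + 1);
    if (value.empty() ||
        value.find_first_not_of("0123456789") != std::string::npos) {
      continue;
    }
    // Walk the frames and look for an exact match.
    std::istringstream frames(line.substr(0, space));
    std::string name;
    bool hit = false;
    while (std::getline(frames, name, ';')) {
      if (name == frame) {
        hit = true;
        break;
      }
    }
    if (hit) total += std::stoull(value);
  }
  return total;
}
```

Reading gperf.stacks into a string and calling `SumStacksThrough(contents, "MallocTestC")` would give the total attributed to that function and everything it calls, which should match the width of its box in the flame graph.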

 

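Actual in-use memory of the process

malloc_stats

When an application requests memory, the request normally goes through glibc and then reaches the kernel via a syscall. To cut down on the system calls needed for allocating and freeing, glibc's memory manager keeps a cache of memory in user space.

Both glibc's default ptmalloc and Google's tcmalloc provide the "malloc_stats" API, which reports the total memory the process holds as well as the amount actually in use. The difference between the two is the allocator's cache for this process (plus the allocator's own metadata).

Taking tcmalloc as the example, the relevant calls are on lines 30 and 34 of the test code. Comparing the two outputs shows that, even after shrink_to_fit, the memory the application frees only goes back to tcmalloc's free lists (mostly the page heap freelist), not to the OS.

# Output of malloc_stats on line 30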
------------------------------------------------
MALLOC:      142024744 (  135.4 MiB) Bytes in use by application
MALLOC: +       442368 (    0.4 MiB) Bytes in page heap freelist
MALLOC: +       120760 (    0.1 MiB) Bytes in central cache freelist
MALLOC: +            0 (    0.0 MiB) Bytes in transfer cache freelist
MALLOC: +        18464 (    0.0 MiB) Bytes in thread cache freelists
MALLOC: +      2752512 (    2.6 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Actual memory used (physical + swap)
MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Virtual address space used
MALLOC:
MALLOC:           4139              Spans in use
MALLOC:              3              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

# Output of malloc_stats on line 34
------------------------------------------------
MALLOC:       75589672 (   72.1 MiB) Bytes in use by application
MALLOC: +     60833792 (   58.0 MiB) Bytes in page heap freelist
MALLOC: +      1231224 (    1.2 MiB) Bytes in central cache freelist
MALLOC: +      1277952 (    1.2 MiB) Bytes in transfer cache freelist
MALLOC: +      3673696 (    3.5 MiB) Bytes in thread cache freelists
MALLOC: +      2752512 (    2.6 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Actual memory used (physical + swap)
MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Virtual address space used
MALLOC:
MALLOC:            596              Spans in use
MALLOC:              3              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
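
The two dumps above can be cross-checked with a little arithmetic. The sketch below copies the byte counts from the dumps into a struct (the struct and function names are illustrative only) and verifies that the parts add up to the "Actual memory used" line. It also compares the drop in "Bytes in use by application" with the payload that map_vec.clear() releases: 40 maps × 200 vectors × 1024 uint64_t.

```cpp
#include <cstdint>

// One malloc_stats() snapshot. Field names are illustrative; values are bytes.
struct TcmallocSnapshot {
  uint64_t in_use_by_app;
  uint64_t page_heap_free;
  uint64_t central_cache_free;
  uint64_t transfer_cache_free;
  uint64_t thread_cache_free;
  uint64_t metadata;
};

// Memory sitting in tcmalloc's free lists: freed by the application but
// still held by the process.
constexpr uint64_t CachedBytes(const TcmallocSnapshot& s) {
  return s.page_heap_free + s.central_cache_free + s.transfer_cache_free +
         s.thread_cache_free;
}

// Corresponds to the "Actual memory used (physical + swap)" line.
constexpr uint64_t ActualBytes(const TcmallocSnapshot& s) {
  return s.in_use_by_app + CachedBytes(s) + s.metadata;
}

// Figures copied from the two dumps (line 30 and line 34 of the test code).
constexpr TcmallocSnapshot kLine30 = {142024744, 442368, 120760,
                                      0,         18464,  2752512};
constexpr TcmallocSnapshot kLine34 = {75589672, 60833792, 1231224,
                                      1277952,  3673696,  2752512};

// Vector payload dropped by map_vec.clear(): 40 x 200 x 1024 x 8 bytes.
constexpr uint64_t kVectorPayload = 40ULL * 200 * 1024 * sizeof(uint64_t);

static_assert(ActualBytes(kLine30) == 145358848, "parts must sum to the total");
static_assert(ActualBytes(kLine34) == 145358848, "parts must sum to the total");
```

With these figures, CachedBytes grows from 581,592 to 67,016,664 bytes while ActualBytes stays at 145,358,848. The in-use figure drops by 66,435,072 bytes, slightly more than the 65,536,000 bytes of vector payload, which is consistent with the map nodes and shared_ptr bookkeeping being freed as well.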

MallocExtension::instance()->ReleaseFreeMemory()

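When an application wants to hand the allocator's cached memory back to the kernel manually, tcmalloc provides the ReleaseFreeMemory API (the call that is commented out on line 33 of the test code).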
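
A minimal sketch of calling it and measuring the effect follows. It reuses the GPERFTOOLS_EN switch from the test code, but defaults it to 0 here so that the snippet also builds without gperftools, in which case both functions simply return 0. `UnmappedBytes` and `ReleaseFreeListMemory` are hypothetical wrapper names. The property "tcmalloc.pageheap_unmapped_bytes" is, to my understanding, the figure behind the "Bytes released to OS (aka unmapped)" line of malloc_stats.

```cpp
#include <cstddef>

#ifndef GPERFTOOLS_EN
// Assumption for this sketch: disabled unless built with
// -DGPERFTOOLS_EN=1 and linked against -ltcmalloc.
#define GPERFTOOLS_EN (0)
#endif

#if GPERFTOOLS_EN == 1
#include "gperftools/malloc_extension.h"
#endif

// Bytes tcmalloc reports as already returned to the OS.
// Returns 0 when built without gperftools or if the property is unavailable.
std::size_t UnmappedBytes() {
  std::size_t value = 0;
#if GPERFTOOLS_EN == 1
  MallocExtension::instance()->GetNumericProperty(
      "tcmalloc.pageheap_unmapped_bytes", &value);
#endif
  return value;
}

// Asks tcmalloc to give its free-list pages back to the kernel and returns
// how many additional bytes it then reports as unmapped.
std::size_t ReleaseFreeListMemory() {
  const std::size_t before = UnmappedBytes();
#if GPERFTOOLS_EN == 1
  MallocExtension::instance()->ReleaseFreeMemory();
#endif
  const std::size_t after = UnmappedBytes();
  return after >= before ? after - before : 0;
}
```

In the test program, calling `ReleaseFreeListMemory()` right after `map_vec.shrink_to_fit()` would be expected to move most of the page heap freelist into the "released to OS" line of the second malloc_stats dump. The exact amount depends on how tcmalloc has grouped the freed pages.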

CPU performance debugging

From: https://www.cnblogs.com/zengjianrong/p/16843885.html
