首页 > 数据库 >PostgreSQL 源码性能诊断(perf profiling)指南(含火焰图生成分析FlameGraph) - 珍藏级

PostgreSQL 源码性能诊断(perf profiling)指南(含火焰图生成分析FlameGraph) - 珍藏级

时间:2023-08-19 15:45:31浏览次数:63  
标签:profiling PostgreSQL postgres digoal 0.00% 源码 home block perf

PostgreSQL 源码性能诊断(perf profiling)指南(含火焰图生成分析FlameGraph) - 珍藏级

作者

digoal

日期

2016-11-28

标签

PostgreSQL , Linux , perf , 性能诊断 , stap , systemtap , strace , dtrace , dwarf , profiler , perf_events , probe , dynamic probe , tracepoint


背景

数据库的性能优化是一个非常经典的话题,数据库的优化手段以及优化的角度也各不相同。

例如,可以从OS内核、网络、块设备、编译器、文件系统、SQL、数据库参数、业务逻辑、源码等各个方面去进行优化。

但是如果在优化前了解瓶颈在什么地方,可以向打鼹鼠一样,先打大的,起到事半功倍的效果。

本文将针对Linux平台下的PostgreSQL服务,讲解如何诊断PostgreSQL 数据库服务的瓶颈,了解瓶颈在哪个方面,甚至可以了解到瓶颈在哪段代码?

有了比较详细的性能诊断报告,可以给出比较好的指导优化建议。

我们要用到的工具是 《Linux 性能诊断(profiling) perf使用指南》

使用perf跟踪PostgreSQL的前提

要输出完备的跟踪信息,如符号表,call stack traces, 汇编指令。必须在编译PostgreSQL时设置相应的编译开关,例如GCC需要开启如下

CFLAGS="-g -ggdb -fno-omit-frame-pointer"  
 

建议Linux内核编译时加上

CONFIG_KALLSYMS=y
CONFIG_FRAME_POINTER=y 
 

编译perf时需要支持libunwind, 并加上

gcc -g dwarf

man gcc
       -gdwarf-version
           Produce debugging information in DWARF format (if that is supported).  This is the format used by DBX on IRIX 6.  The value of version may be either 2 or 3; the default version is 3.

           Note that with DWARF version 2 some ports require, and will always use, some non-conflicting DWARF 3 extensions in the unwind tables.
 

如果是yum安装的软件,可以安装对应的debuginfo包。

例子

wget https://ftp.postgresql.org/pub/source/v9.6.1/postgresql-9.6.1.tar.bz2 
tar -jxvf postgresql-9.6.1.tar.bz2 

cd postgresql-9.6.1
LIBS=-lpthread CC="/home/digoal/gcc6.2.0/bin/gcc" CFLAGS="-O3 -flto -g -ggdb -fno-omit-frame-pointer" ./configure --prefix=/home/digoal/pgsql9.6
LIBS=-lpthread CC="/home/digoal/gcc6.2.0/bin/gcc" CFLAGS="-O3 -flto -g -ggdb -fno-omit-frame-pointer" make world -j 64
LIBS=-lpthread CC="/home/digoal/gcc6.2.0/bin/gcc" CFLAGS="-O3 -flto -g -ggdb -fno-omit-frame-pointer" make install-world
 

常用跟踪手段

1. benchmark

首先,需要定一个目标,比如insert only的场景,如何找到其瓶颈。

创建测试表

postgres=# create table perf_test_ins(id serial8 primary key, info text, crt_time timestamp);
alCREATE TABLE
postgres=# alter sequence perf_test_ins_id_seq cache 10000;
ALTER SEQUENCE
 

创建测试脚本,不停插入

vi test.sql

insert into perf_test_ins(info,crt_time) values ('test',now());
 

疲劳压测开始

pgbench -M prepared -n -r -P 1 -f ./test.sql -h $PGDATA -c 32 -j 32 -T 600
 

2. 实时跟踪

pgbench压测的同时,实时观测。

su - root

perf top -agv -F 1000


解释
perf top --help

NAME
       perf-top - System profiling tool.

SYNOPSIS
       perf top [-e <EVENT> | --event=EVENT] [<options>]

DESCRIPTION
       This command generates and displays a performance counter profile in real time.

       -F <freq>, --freq=<freq>
           Profile at this frequency.
       -a, --all-cpus
           System-wide collection. (default)
       -g, --group
           Put the counters into a counter group.
       -K, --hide_kernel_symbols
           Hide kernel symbols.
       -U, --hide_user_symbols
           Hide user symbols.
       -v, --verbose
           Be more verbose (show counter open errors, etc).
 

输出

   PerfTop:   23287 irqs/sec  kernel:31.2%  exact:  0.0% [1000Hz cycles],  (all, 24 CPUs)
--------------------------------------------------------------------------------------------------------------

samples  pcnt         RIP        function                                  DSO
_______ _____ ________________ _________________________________________ _____________________________________

4179.00  1.6% 0000000000524e10 AllocSetAlloc.lto_priv.1186               /home/digoal/pgsql9.6/bin/postgres 
3966.00  1.5% 0000000000043d10 __GI_vfprintf                             /lib64/libc-2.12.so                  
3822.00  1.5% ffffffff8150bb50 schedule                                  [kernel.kallsyms]                    
3659.00  1.4% 0000000000082e10 __memset_sse2                             /lib64/libc-2.12.so                  
3523.00  1.4% 00000000005782d0 LWLockAcquire.constprop.931               /home/digoal/pgsql9.6/bin/postgres 
3219.00  1.2% 00000000002c9390 LWLockRelease                             /home/digoal/pgsql9.6/bin/postgres 
3114.00  1.2% 0000000000332e10 PostgresMain                              /home/digoal/pgsql9.6/bin/postgres 
3042.00  1.2% ffffffff8150e8b0 _spin_lock                                [kernel.kallsyms]                    
2948.00  1.1% 0000000000573fe0 hash_search_with_hash_value.constprop.930 /home/digoal/pgsql9.6/bin/postgres 
2901.00  1.1% 00000000005746d0 hash_search_with_hash_value.constprop.928 /home/digoal/pgsql9.6/bin/postgres 
2848.00  1.1% 00000000002db5f0 GetSnapshotData                           /home/digoal/pgsql9.6/bin/postgres 
2471.00  1.0% 00000000005338b0 hash_seq_search                           /home/digoal/pgsql9.6/bin/postgres 
2275.00  0.9% 0000000000579ad0 LWLockAcquire.constprop.932               /home/digoal/pgsql9.6/bin/postgres 
2244.00  0.9% 0000000000078460 _int_malloc                               /lib64/libc-2.12.so                  
2215.00  0.9% 0000000000088aa0 memcpy                                    /lib64/libc-2.12.so                  
2163.00  0.8% 00000000002df050 LockAcquireExtended                       /home/digoal/pgsql9.6/bin/postgres 
2057.00  0.8% 0000000000131a70 __strlen_sse42                            /lib64/libc-2.12.so                  
2017.00  0.8% 00000000002ba2d0 _bt_compare                               /home/digoal/pgsql9.6/bin/postgres 
1977.00  0.8% 0000000000006e40 doCustom.lto_priv.13                      /home/digoal/pgsql9.6/bin/pgbench  
1951.00  0.8% ffffffff8100a630 __switch_to                               [kernel.kallsyms]                    
1855.00  0.7% 000000000053d5e0 hash_search_with_hash_value               /home/digoal/pgsql9.6/bin/postgres 
1767.00  0.7% 00000000001c6670 hash_any                                  /home/digoal/pgsql9.6/bin/postgres 
1726.00  0.7% 00000000002c8310 PinBuffer.isra.3.lto_priv.2340            /home/digoal/pgsql9.6/bin/postgres 
1668.00  0.6% 00000000000ee570 XLogInsertRecord                          /home/digoal/pgsql9.6/bin/postgres 
1604.00  0.6% 0000000000250500 pgstat_report_activity                    /home/digoal/pgsql9.6/bin/postgres 
1572.00  0.6% ffffffff81189c00 fget_light                                [kernel.kallsyms]                    
1565.00  0.6% 0000000000426970 SearchCatCache                            /home/digoal/pgsql9.6/bin/postgres 
1460.00  0.6% ffffffff810e03a0 __audit_syscall_exit                      [kernel.kallsyms]                    
1415.00  0.5% ffffffff81057e80 update_curr                               [kernel.kallsyms]                    
1409.00  0.5% 00000000000777f0 _int_free                                 /lib64/libc-2.12.so                  
1386.00  0.5% 00000000000fbc10 CommitTransaction.lto_priv.2946           /home/digoal/pgsql9.6/bin/postgres 
1296.00  0.5% ffffffff8119f6d0 do_select                                 [kernel.kallsyms]                    
1285.00  0.5% 00000000002c8690 LWLockWaitListLock.lto_priv.1265          /home/digoal/pgsql9.6/bin/postgres 
1268.00  0.5% ffffffff8150e790 _spin_lock_irqsave                        [kernel.kallsyms]                    
1247.00  0.5% 0000000000577650 hash_search_with_hash_value.constprop.929 /home/digoal/pgsql9.6/bin/postgres 
1220.00  0.5% 0000000000538a10 AllocSetFree.lto_priv.1187                /home/digoal/pgsql9.6/bin/postgres 
1201.00  0.5% 0000000000389660 ExecModifyTable                           /home/digoal/pgsql9.6/bin/postgres 
1185.00  0.5% 000000000009b6d0 gettimeofday                              /lib64/libc-2.12.so                  
1171.00  0.5% ffffffff81278a60 copy_user_generic_string                  [kernel.kallsyms]                    
1142.00  0.4% 000000000013cb10 __memcmp_sse4_1                           /lib64/libc-2.12.so                  
1124.00  0.4% 0000000000032620 __GI___sigsetjmp                          /lib64/libc-2.12.so                  
1123.00  0.4% 00000000000ef2b0 XLogInsert                                /home/digoal/pgsql9.6/bin/postgres 
1112.00  0.4% 00000000002d29e0 LockReleaseAll                            /home/digoal/pgsql9.6/bin/postgres 
1108.00  0.4% 00000000003c7620 ExecInitExpr                              /home/digoal/pgsql9.6/bin/postgres 
1056.00  0.4% 000000000000ac80 threadRun.lto_priv.20                     /home/digoal/pgsql9.6/bin/pgbench  
1052.00  0.4% ffffffff81062a40 select_task_rq_fair                       [kernel.kallsyms]                    
1033.00  0.4% 0000000000017710 pqParseInput3                             /home/digoal/pgsql9.6/lib/libpq.so.5.9
 

按Enter键弹出可以配置的选项以及当前状态。

Mapped keys:
        [d]     display refresh delay.                  (2)
        [e]     display entries (lines).                (47)
        [f]     profile display filter (count).         (5)
        [F]     annotate display filter (percent).      (5%)
        [s]     annotate symbol.                        (NULL)
        [S]     stop annotation.
        [K]     hide kernel_symbols symbols.            (no)   是否显示kernel symbols
        [U]     hide user symbols.                      (no)   是否显示user symbols
        [z]     toggle sample zeroing.                  (0)
        [qQ]    quit.
 

调整压测命令,连接回环地址,(前面连接的是unix socket)

pgbench -M prepared -n -r -P 1 -f ./test.sql -h 127.0.0.1 -c 32 -j 32 -T 600
 

再次观测perf top结果, 采样消耗排名第一的变成了这条,找到了一个瓶颈,关闭iptables,这条就消失了,TPS也提升了。

5871.00  1.7% 0000000000002e40 ipt_do_table                              /lib/modules/2.6.32-358.23.2.ali1195.el6.x86_64/kernel/net/ipv4/netfilter/ip_tables.ko
 

3. 跟踪事件计数(指定事件)

指定命令进行跟踪,或者指定PID进行跟踪。

NAME
       perf-stat - Run a command and gather performance counter statistics

SYNOPSIS
       perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
       perf stat [-e <EVENT> | --event=EVENT] [-a] — <command> [<options>]

DESCRIPTION
       This command runs a command and gathers performance counter statistics from it.
       -a, --all-cpus
           system-wide collection from all CPUs
       -c, --scale
           scale/normalize counter values
       -B, --big-num
           print large numbers with thousands′ separators according to locale
       -v, --verbose
           be more verbose (show counter open errors, etc)
       -p, --pid=<pid>
           stat events on existing process id
       -t, --tid=<tid>
           stat events on existing thread id
       -r, --repeat=<n>
           repeat command and print average + stddev (max: 100)
 

跟踪批量插入一批记录的事件调用统计

event来自perf list的输出

perf stat -e block:*,syscalls:*,xfs:* -a -B -v -c -- /home/digoal/pgsql9.6/bin/psql -h 127.0.0.1 -p 5288 -q -U postgres postgres -c "drop table if exists x; create table x as select a FROM generate_series(1,1000000) a;";
 

输出

                 0 block:block_rq_abort                                         [98.79%]
                 0 block:block_rq_requeue                                       [98.79%]
               995 block:block_rq_complete                                      [98.79%]
               993 block:block_rq_insert                                        [98.79%]
               997 block:block_rq_issue                                         [98.79%]
                 0 block:block_bio_bounce                                       [98.79%]
                 0 block:block_bio_complete                                     [98.79%]
                18 block:block_bio_backmerge                                    [98.79%]
                 1 block:block_bio_frontmerge                                    [98.79%]
             1,011 block:block_bio_queue                                        [98.79%]
               992 block:block_getrq                                            [98.79%]
                 2 block:block_sleeprq                                          [98.79%]
               250 block:block_plug                                             [98.79%]
                 2 block:block_unplug_timer                                     [98.79%]
               361 block:block_unplug_io                                        [98.79%]
                 0 block:block_split                                            [98.79%]
             1,011 block:block_remap                                            [98.79%]
                 0 block:block_rq_remap                                         [98.79%]
                 5 syscalls:sys_enter_socket                                    [98.79%]
                 5 syscalls:sys_exit_socket                                     [98.79%]
                 0 syscalls:sys_enter_socketpair                                    [98.79%]
                 0 syscalls:sys_exit_socketpair                                    [98.79%]
... 略 ...
 

计数例子

perf stat -a -B -v -c -- /home/digoal/pgsql9.6/bin/psql -h 127.0.0.1 -p 5288 -q -U postgres postgres -c "drop table if exists x; create table x as select a FROM generate_series(1,1000000) a;";

      22433.920027 task-clock                #   24.042 CPUs utilized           [100.00%]
             4,275 context-switches          #    0.000 M/sec                   [100.00%]
                98 CPU-migrations            #    0.000 M/sec                   [100.00%]
            56,484 page-faults               #    0.003 M/sec                  
     2,687,623,473 cycles                    #    0.120 GHz                     [80.73%]
     2,747,044,652 stalled-cycles-frontend   #  102.21% frontend cycles idle    [92.42%]
     2,358,755,489 stalled-cycles-backend    #   87.76% backend  cycles idle    [81.89%]
     3,373,693,967 instructions              #    1.26  insns per cycle         越高越好,说明CPU没闲着。至少要大于1,但是注意同步锁等待这个值也可能很高,CPU空转,这就不好了。  
                                             #    0.81  stalled cycles per insn [91.50%]  越低越好,说明CPU没有闲着。
       632,351,270 branches                  #   28.187 M/sec                   [91.82%]
         1,236,798 branch-misses             #    0.20% of all branches         [53.19%]

       0.933128299 seconds time elapsed
 

perf stat重点看的是instructions,判断CPU是否充分被利用,如果没有,说明系统有其他瓶颈,例如IO等待,或者其他等待。

4. 采样跟踪

这个是用得最多的,先采样,后分析报告。

NAME
       perf-record - Run a command and record its profile into perf.data

SYNOPSIS
       perf record [-e <EVENT> | --event=EVENT] [-l] [-a] <command>
       perf record [-e <EVENT> | --event=EVENT] [-l] [-a] — <command> [<options>]

DESCRIPTION
       This command runs a command and gathers a performance counter profile from it, into perf.data - without displaying anything.

       This file can then be inspected later on, using perf report.

       -A, --append
           Append to the output file to do incremental profiling.
       -o, --output=
           Output file name.
       -e, --event=          event来自perf list的输出
           Select the PMU event. 
	   Selection can be a symbolic event name (use perf list to list all events) or a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a hexadecimal event descriptor.
       -a, --all-cpus
           System-wide collection from all CPUs.
       -g, --call-graph
           Do call-graph (stack chain/backtrace) recording.
       -v, --verbose
           Be more verbose (show counter open errors, etc).
       -s, --stat   相对的是默认的per cpu mode
           Per thread counts.
       -G name,..., --cgroup name,...   如果你只想跟踪某一些进程的统计信息,可以将这些进程放到一个cgroup中,跟踪这个cgroup的。    
           monitor only in the container (cgroup) called "name". 
	   This option is available only in per-cpu mode. 
	   The cgroup filesystem must be mounted. 
	   All threads belonging to container "name" are monitored when they run on the monitored CPUs. 
	   Multiple cgroups can be provided. 
	   Each cgroup is applied to the corresponding event, i.e., first cgroup to first event, second cgroup to second event and so on. It is
           possible to provide an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. 
	   Cgroups must have corresponding events, i.e., they always refer to events defined earlier on the command line.
 

跑pgbench压测,和前面的pgbench一样,不断并发插入。

然后开启统计收集,收集10秒的统计信息,默认输出到perf.data文件。

perf record -avg -- sleep 10

[ perf record: Woken up 142 times to write data ]
[ perf record: Captured and wrote 37.111 MB perf.data (~1621415 samples) ]
 

5. 生成报告

前面使用perf record收集了统计信息到perf.data,接下来就对perf.data进行分析,输出报告。

NAME
       perf-report - Read perf.data (created by perf record) and display the profile

SYNOPSIS
       perf report [-i <file> | --input=file]

DESCRIPTION
       This command displays the performance counter profile information recorded via perf record.

       -i, --input=
           Input file name. (default: perf.data)
       -v, --verbose
           Be more verbose. (show symbol address, etc)
       -n, --show-nr-samples
           Show the number of samples for each symbol
       --showcpuutilization
           Show sample percentage for different cpu modes.
       -T, --threads
           Show per-thread event counters
       -U, --hide-unresolved
           Only display entries resolved to a symbol.
       -s, --sort=
           Sort by key(s): pid, comm, dso, symbol, parent.
       -g [type,min], --call-graph
           Display call chains using type and min percent threshold. type can be either:
           ·   flat: single column, linear exposure of call chains.
           ·   graph: use a graph tree, displaying absolute overhead rates.
           ·   fractal: like graph, but displays relative rates. Each branch of the tree is considered as a new profiled object.
               Default: fractal,0.5.
       --stdio
           Use the stdio interface.
       --tui
           Use the TUI interface, that is integrated with annotate and allows zooming into DSOs or threads, among other features. Use of --tui requires a tty, if one is not present, as when piping to other
           commands, the stdio interface is used.
 

--tui模式可以查看概览,支持交互式指令。

输入E全展开,输入C全收敛。  展开后可以使用键盘的上下键进行浏览。  
[.] 表示userspace
[k] 表示kernel

1.39%     0.00%     0.01%         15        postgres  /home/digoal/pgsql9.6/bin/postgres             0x524e3a         d [.] AllocSetAlloc.lto_priv.1186              
1.29%     0.00%     0.01%         22        postgres  /lib64/libc-2.12.so                              0x83127          B [.] __memset_sse2                            
1.25%     0.00%     0.00%          1        postgres  /home/digoal/pgsql9.6/bin/postgres             0x57831d         d [.] LWLockAcquire.constprop.931              
1.24%     0.00%     0.05%        107        postgres  /home/digoal/pgsql9.6/bin/postgres             0x574751         d [.] hash_search_with_hash_value.constprop.928
1.07%     0.00%     0.00%          2        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2c946f         d [.] LWLockRelease                            
1.01%     0.00%     0.00%          1        postgres  /home/digoal/pgsql9.6/bin/postgres             0x574035         d [.] hash_search_with_hash_value.constprop.930
0.98%     0.00%     0.03%         74        postgres  /home/digoal/pgsql9.6/bin/postgres             0x334832         d [.] PostgresMain                             
0.94%     0.00%     0.02%         49        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2db87b         d [.] GetSnapshotData                          
0.94%     0.00%     0.00%          1        postgres  [kernel.kallsyms]                                0xffffffff8150e8b1 k [k] _spin_lock                             
0.83%     0.00%     0.00%          1        postgres  [kernel.kallsyms]                                0xffffffff8150edb0 k [k] page_fault                             
0.81%     0.00%     0.00%          1        postgres  /lib64/libc-2.12.so                              0x88aa0          B [.] memcpy                                   
0.77%     0.00%     0.22%        465        postgres  /home/digoal/pgsql9.6/bin/postgres             0x533939         d [.] hash_seq_search                          
0.75%     0.00%     0.00%          1        postgres  /lib64/libc-2.12.so                              0x454ac          B [.] __GI_vfprintf                            
0.75%     0.00%     0.03%         59        postgres  /home/digoal/pgsql9.6/bin/postgres             0x579b1d         d [.] LWLockAcquire.constprop.932              
0.73%     0.00%     0.02%         42        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2df3ab         d [.] LockAcquireExtended                      
0.67%     0.00%     0.00%          2        postgres  /home/digoal/pgsql9.6/bin/postgres             0x1c66f9         d [.] hash_any                                 
0.65%     0.00%     0.02%         49        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2ba33a         d [.] _bt_compare                              
0.63%     0.00%     0.00%          3        postgres  /home/digoal/pgsql9.6/bin/postgres             0xee632          d [.] XLogInsertRecord                         
0.62%     0.00%     0.00%          1        postgres  /home/digoal/pgsql9.6/bin/postgres             0x53d5e1         d [.] hash_search_with_hash_value              
0.59%     0.00%     0.59%       1318        pgbench   /lib64/libc-2.12.so                              0x47956          B [.] __GI_vfprintf                  
0.57%     0.00%     0.00%          1        postgres  /lib64/libc-2.12.so                              0x131aa7         B [.] __strlen_sse42                           
0.55%     0.00%     0.55%       1241        pgbench   /home/digoal/pgsql9.6/bin/pgbench              0x85b3           d [.] doCustom.lto_priv.13           
0.55%     0.00%     0.03%         55        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2c8327         d [.] PinBuffer.isra.3.lto_priv.2340           
0.54%     0.00%     0.02%         35        postgres  /home/digoal/pgsql9.6/bin/postgres             0x25059f         d [.] pgstat_report_activity                   
0.53%     0.00%     0.02%         43        postgres  /lib64/libc-2.12.so                              0x78462          B [.] _int_malloc                              
0.52%     0.00%     0.01%         33        postgres  /home/digoal/pgsql9.6/bin/postgres             0x426a36         d [.] SearchCatCache                           
0.47%     0.00%     0.00%          3        postgres  [kernel.kallsyms]                                0xffffffff81272a1c k [k] radix_tree_lookup_slot                 
0.45%     0.00%     0.01%         27        postgres  /home/digoal/pgsql9.6/bin/postgres             0xfbe51          d [.] CommitTransaction.lto_priv.2946          
0.44%     0.00%     0.00%          1        postgres  /home/digoal/pgsql9.6/bin/postgres             0x5776f5         d [.] hash_search_with_hash_value.constprop.929
0.43%     0.00%     0.01%         30        postgres  /home/digoal/pgsql9.6/bin/postgres             0x38a044         d [.] ExecModifyTable                          
0.41%     0.00%     0.00%          1        postgres  [kernel.kallsyms]                                0xffffffff8150bef4 k [k] schedule                               
0.40%     0.00%     0.01%         22        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2c86c2         d [.] LWLockWaitListLock.lto_priv.1265         
0.40%     0.00%     0.01%         29        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2d2b2d         d [.] LockReleaseAll                           
0.40%     0.00%     0.01%         12        postgres  /home/digoal/pgsql9.6/bin/postgres             0x538a11         d [.] AllocSetFree.lto_priv.1187               
0.39%     0.00%     0.00%          1        postgres  [kernel.kallsyms]                                0xffffffff8111e567 k [k] find_get_page                          
0.39%     0.01%     0.00%         24        postgres  [kernel.kallsyms]                                0xffffffff8148d1f8 k [k] tcp_sendmsg                            
0.39%     0.00%     0.01%         34        postgres  /home/digoal/pgsql9.6/bin/postgres             0xef545          d [.] XLogInsert                               
0.39%     0.00%     0.01%         25        postgres  /lib64/libc-2.12.so                              0x32623          B [.] __GI___sigsetjmp                         
0.38%     0.00%     0.38%        860         pgbench  /home/digoal/pgsql9.6/bin/pgbench              0xb3f3           d [.] threadRun.lto_priv.20                    
0.38%     0.00%     0.01%         33        postgres  /lib64/libc-2.12.so                              0x77cf2          B [.] _int_free                                
0.38%     0.00%     0.00%          1        postgres  /lib64/libc-2.12.so                              0x13cb10         B [.] __memcmp_sse4_1                          
0.38%     0.00%     0.02%         36        postgres  /home/digoal/pgsql9.6/bin/postgres             0x3c762f         d [.] ExecInitExpr                             
0.36%     0.36%     0.00%        816         pgbench  [kernel.kallsyms]                                0xffffffff8119f950 k [k] do_select                              
0.36%     0.00%     0.00%          4        postgres  /home/digoal/pgsql9.6/bin/postgres             0x21e005         d [.] ReadBuffer_common.lto_priv.1215          
0.36%     0.00%     0.01%         28        postgres  /home/digoal/pgsql9.6/bin/postgres             0x1e5e50         d [.] _bt_doinsert                             
0.35%     0.00%     0.01%         16        postgres  [vdso]                                           0x7fffb61ff8b0   ! [.] 0x7fffb61ff8b0                           
0.35%     0.00%     0.00%          1        postgres  /lib64/libc-2.12.so                              0x9b6db          B [.] gettimeofday                             
0.35%     0.00%     0.35%        775         pgbench  /home/digoal/pgsql9.6/lib/libpq.so.5.9         0x18057          d [.] pqParseInput3                            
0.32%     0.00%     0.01%         18        postgres  /home/digoal/pgsql9.6/bin/postgres             0x2f9979         d [.] expression_tree_walker                   
0.31%     0.00%     0.01%         18        postgres  /home/digoal/pgsql9.6/bin/postgres             0x306a5c         d [.] socket_putmessage.lto_priv.1817          
0.31%     0.31%     0.00%        702         pgbench  [kernel.kallsyms]                                0xffffffff8148c9be k [k] tcp_sendmsg                            
0.31%     0.00%     0.01%         17        postgres  /home/digoal/pgsql9.6/bin/postgres             0x38891e         d [.] ExecInitModifyTable                      
0.31%     0.00%     0.00%          1        postgres  [kernel.kallsyms]                                0xffffffff81057ff7 k [k] update_curr                            
0.31%     0.01%     0.00%         27        postgres  [kernel.kallsyms]                                0xffffffff81494ea6 k [k] tcp_ack                                              
 

展开后的例子

-      1.39%     0.00%     0.01%         15        postgres  /home/digoal/pgsql9.6/bin/postgres      0x524e3a         d [.] AllocSetAlloc.lto_priv.1186
    - AllocSetAlloc.lto_priv.1186                                                                                                                      
      - 33.72% MemoryContextAlloc                                                                                                                      
         - 27.57% MemoryContextCreate                                                                                                                  
            - AllocSetContextCreate.constprop.955                                                                                                      
               - 43.77% StartTransaction.lto_priv.2177                                                                                                 
                    start_xact_command.part.3                                                                                                          
                    PostgresMain                                                                                                                       
                    ServerLoop                                                                                                                         
                    PostmasterMain                                                                                                                     
                    main                                                                                                                               
                    __libc_start_main                                                                                                                  
               - 16.08% CreatePortal.constprop.203                                                                                                     
                    PostgresMain                                                                                                                       
                    ServerLoop                                                                                                                         
                    PostmasterMain                                                                                                                     
                    main                                                                                                                               
                    __libc_start_main                                                                                                                  
               - 14.06% MakePerTupleExprContext                                                                                                        
                    ExecInsertIndexTuples.constprop.430                                                                                                
                    ExecModifyTable                                                                                                                    
                    ExecProcNode                                                                                                                       
                    standard_ExecutorRun                                                                                                               
                    pgss_ExecutorRun                                                                                                                   
                    ProcessQuery                                                                                                                       
                    PortalRunMulti                                                                                                                     
                    PortalRun                                                                                                                          
                    PostgresMain                                                                                                                       
                    ServerLoop                                                                                                                         
                    PostmasterMain                                                                                                                     
                    main                                                                                                                               
                    __libc_start_main                                                                                                                  
               - 13.35% CreateExecutorState                                                                                                            
                    standard_ExecutorStart                                                                                                             
                    pgss_ExecutorStart                                                                                                                 
                    ProcessQuery                                                                                                                       
                    PortalRunMulti                                                                                                                     
                    PortalRun                                                                                                                          
                    PostgresMain                                                                                                                       
                    ServerLoop                                                                                                                         
                    PostmasterMain                                                                                                                     
                    main                                                                                                                               
                    __libc_start_main                                                                                                                  
               - 12.74% ExecInitResult                                                                                                                 
                    ExecInitNode                                                                                                                       
                    ExecInitModifyTable                                                                                                                
                    ExecInitNode                                                                                                                       
                    standard_ExecutorStart                                                                                                             
                    pgss_ExecutorStart                                                                                                                 
                    ProcessQuery                                                                                                                       
                    PortalRunMulti                                                                                                                     
                    PortalRun                                                                                                                          
                    PostgresMain                                                                                                                       
                    ServerLoop 
		    ...................略................
 

如果要查看详情,建议使用--stdio模式,看得非常清晰

#perf report -v -n --showcpuutilization -g --stdio

# Events: 224K cycles
#
# Overhead  Samples       sys    us                 Command                                         Shared Object                                                                                            
(全局开销占比) (采样)   (系统开销占比, 系统开销耗时us)  (命令)                                       (共享对象)
# ........ ..........  ..............          ...............................................     ........................................................
#
     1.39%     0.00%     0.01%         15        postgres  /home/digoal/pgsql9.6/bin/postgres           0x524e3a         d [.] AllocSetAlloc.lto_priv.1186
                  |
                  --- AllocSetAlloc.lto_priv.1186
                     |          
                     |--33.72%-- MemoryContextAlloc
                     |          |          
                     |          |--27.57%-- MemoryContextCreate
                     |          |          AllocSetContextCreate.constprop.955
                     |          |          |          
                     |          |          |--43.77%-- StartTransaction.lto_priv.2177
                     |          |          |          start_xact_command.part.3
                     |          |          |          PostgresMain
                     |          |          |          ServerLoop
                     |          |          |          PostmasterMain
                     |          |          |          main
                     |          |          |          __libc_start_main
                     |          |          |          
                     |          |          |--16.08%-- CreatePortal.constprop.203
                     |          |          |          PostgresMain
                     |          |          |          ServerLoop
                     |          |          |          PostmasterMain
                     |          |          |          main
                     |          |          |          __libc_start_main
                     |          |          |          
                     |          |          |--14.06%-- MakePerTupleExprContext
                     |          |          |          ExecInsertIndexTuples.constprop.430
                     |          |          |          ExecModifyTable
                     |          |          |          ExecProcNode
                     |          |          |          standard_ExecutorRun
                     |          |          |          pgss_ExecutorRun
                     |          |          |          ProcessQuery
                     |          |          |          PortalRunMulti
                     |          |          |          PortalRun
                     |          |          |          PostgresMain
                     |          |          |          ServerLoop
                     |          |          |          PostmasterMain
                     |          |          |          main
                     |          |          |          __libc_start_main
                     |          |          |          
                     |          |          |--13.35%-- CreateExecutorState
                     |          |          |          standard_ExecutorStart
                     |          |          |          pgss_ExecutorStart
                     |          |          |          ProcessQuery
                     |          |          |          PortalRunMulti
                     |          |          |          PortalRun
                     |          |          |          PostgresMain
                     |          |          |          ServerLoop
                     |          |          |          PostmasterMain
                     |          |          |          main
                     |          |          |          __libc_start_main
...............略.............................
 

perf report输出为本文,可以较为详细的进行代码瓶颈的分析。

6. 生成火焰图

如果你觉得perf report的文本输出不够直观,可以生成火焰图,就更加直观了。

生成火焰图需要将perf record的原始数据弄出来,然后绘图。

perf.data的原始数据perf script即可输出。

生成火焰图

# git clone https://github.com/brendangregg/FlameGraph      # or download it from github
# mv perf.data FlameGraph/
# cd FlameGraph
# perf script | ./stackcollapse-perf.pl > out.perf-folded
# cat out.perf-folded | ./flamegraph.pl > perf-kernel.svg
 

perf-kernel.svg

svg可以使用浏览器打开,转换的图片如下

pic

指定内核内置事件跟踪

perf top|stat|record 都支持通过-e指定事件进行跟踪。

支持的事件如下

perf list

  cpu-cycles OR cycles                               [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
......................略........................
 

事件支持写通配符,多个事件使用逗号隔开。

perf top -ag -e xfs:xfs_attr_list_sf,skb:kfree_skb,block:*,irq:softirq*

No CONFIG_PERF_EVENTS=y kernel support configured? 编译内核是需要配置CONFIG_PERF_EVENTS=y
 
perf stat -e xfs:xfs_attr_list_sf,skb:kfree_skb,block:*,irq:softirq* -- sleep 10

 Performance counter stats for 'sleep 10':

                 0 xfs:xfs_attr_list_sf                                        
                 0 skb:kfree_skb                                               
                 0 block:block_rq_abort                                        
                 0 block:block_rq_requeue                                      
                 0 block:block_rq_complete                                     
                 0 block:block_rq_insert                                       
                 0 block:block_rq_issue                                        
                 0 block:block_bio_bounce                                      
                 0 block:block_bio_complete                                    
                 0 block:block_bio_backmerge                                   
                 0 block:block_bio_frontmerge                                   
                 0 block:block_bio_queue                                       
                 0 block:block_getrq                                           
                 0 block:block_sleeprq                                         
                 0 block:block_plug                                            
                 0 block:block_unplug_timer                                    
                 0 block:block_unplug_io                                       
                 0 block:block_split                                           
                 0 block:block_remap                                           
                 0 block:block_rq_remap                                        
                 0 irq:softirq_entry                                           
                 0 irq:softirq_exit                                            
                 0 irq:softirq_raise                                           

      10.001021697 seconds time elapsed
 
perf record -avg -e xfs:xfs_attr_list_sf,skb:kfree_skb,block:*,irq:softirq* -- sleep 10

perf report -vg --stdio

# Events: 17  skb:kfree_skb
#
# Overhead         Command                                Shared Object            Symbol
# ........  ..............  ...........................................  ................
#
    35.29%            init  [kernel.kallsyms]                            0xffffffff8103c94b k [k] native_safe_halt
                      |
                      --- native_safe_halt

    11.76%       syslog-ng  /lib64/libpthread-2.12.so                    0xea10           B [.] __recvmsg
                 |
                 --- __recvmsg
...........略.......................
 

kernel source 动态跟踪

如果perf list列出的事件不能满足跟踪需求,你可以自行添加任意代码进行跟踪。

这个功能需要perf和内核的支持,详见《Linux 性能诊断(profiling) perf使用指南》

1. Listing variables available for tcp_sendmsg():

# perf probe -V tcp_sendmsg
Available variables at tcp_sendmsg
        @<tcp_sendmsg+0>
                size_t  size
                struct kiocb*   iocb
                struct msghdr*  msg
                struct sock*    sk

2. Creating a probe for tcp_sendmsg() with the "size" variable:

# perf probe --add 'tcp_sendmsg size'
Added new event:
  probe:tcp_sendmsg    (on tcp_sendmsg with size)

3. You can now use it in all perf tools, such as: 

        perf record -e probe:tcp_sendmsg -aR sleep 1
 

PostgreSQL source 动态跟踪

这个功能需要perf和软件编译时的支持,详见《Linux 性能诊断(profiling) perf使用指南》

sudo perf record -g dwarf -e block:block_rq_*,syscalls:sys_enter_write,syscalls:sys_enter_fsync -p 4495
 
sudo perf probe  -x `which postgres` XLogFileInit
sudo perf probe  -x `which postgres` XLogFileOpen
sudo perf probe  -x `which postgres` XLogFlush
 
sudo perf probe -l
  probe_postgres:XLogFileInit (on 0x000000000009a360)
  probe_postgres:XLogFileOpen (on 0x000000000009a860)
  probe_postgres:XLogFlush (on 0x00000000000a0670)
 
sudo perf record -g dwarf -u postgres -e probe_postgres:XLogFileInit,probe_postgres:XLogFileOpen,probe_postgres:XLogFlush
 
sudo perf stat -e probe_postgres:XLogFileInit,probe_postgres:XLogFileOpen,probe_postgres:XLogFlush -a -- /usr/pgsql-9.2/bin/pgbench -U postgres craig -c 2 -t 10000
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 20000/20000
tps = 715.854663 (including connections establishing)
tps = 716.092133 (excluding connections establishing)

 Performance counter stats for '/usr/pgsql-9.2/bin/pgbench -U postgres craig -c 2 -t 10000':

      64 probe_postgres:XLogFileInit       [100.00%]
       0 probe_postgres:XLogFileOpen       [100.00%]
  55,440 probe_postgres:XLogFlush                                    

 27.987364469 seconds time elapsed
 

参考

1. https://wiki.postgresql.org/wiki/Profiling_with_perf

2. http://blog.2ndquadrant.com/tracing-postgresql-perf/

3. 《Linux 性能诊断(profiling) perf使用指南》

4. http://blog.2ndquadrant.com/testing-postgresql-vagrant-computer/

5. http://blog.2ndquadrant.com/devops-testing-postgresql-with-vagrant-on-your-computer-part-two/

6. https://www.quora.com/How-do-I-compile-a-Linux-perf-tool-with-all-features-For-Linux-4-0-on-Ubuntu

标签:profiling,PostgreSQL,postgres,digoal,0.00%,源码,home,block,perf
From: https://www.cnblogs.com/yaoyangding/p/17642538.html

相关文章

  • 基于Redis的Geo实现附近商铺搜索(含源码)
    微信公众号访问地址:基于Redis的Geo实现附近商铺搜索(含源码)一、GEO常用命令及使用示范1.1、GEO的数据结构GEO 就是 Geolocation 的简写形式,代表地理坐标。Redis 在 3.2 版本中加入了对 GEO 的支持,允许存储地理坐标信息,帮助我们根据经纬度来检索数据。常见的命令有:1、GEOAD......
  • 学浪、荔枝微课平台的平替,兔知云课堂知识付费源码
    经历了雪浪下架风波后,我们深刻认识到“鸡蛋不能放在一个篮子里”的道理。尽管现在已经恢复正常,但这段经历仍然促使我们在寻找其他课程上架平台的同时,审慎对待不同的推销信息。作为知识付费领域的一份子,我希望分享一款备受认可的私域场景下的知识付费解决方案——兔知云课堂。如果......
  • 基于JAVA+hadoop网络云盘上传下载系统-计算机毕业设计源码+LW文档
    摘 要随着信息技术的发展,管理系统越来越成熟,各种企事业单位使用各种类型的管理系统来提高工作效率,从而降低手工劳动的弊端。网络云盘能够为广大用户提供安全、免费、方便的存储空间,还能实现资源的共享,但是网络云盘还是存在不足,如何为用户提供更简单明了、便于操作的云盘空间就......
  • 高等学校国防教育教学素材管理系统-计算机毕业设计源码+LW文档
    摘要随着信息技术的发展,管理系统越来越成熟,各种企事业单位使用各种类型的管理系统来提高工作效率,从而降低手工劳动的弊端。我国政府一直以来都非常重视国防教育事业的发展,近几年来高校学生人数逐渐增加,学校对图书馆国防教育教学素材信息、借阅信息的管理也愈发的困难。因此,高校提......
  • 中学生考试管理系统的设计与实现-计算机毕业设计源码+LW文档
    摘要随着信息技术的发展,管理系统越来越成熟,各种企事业单位使用各种类型的管理系统来提高工作效率,从而降低手工操作的弊端。我国政府一直以来都非常重视中学阶段教育的发展,近几年来学生人数逐渐增加,对在线考试的需求越来越多。因此,通过开发中学生考试管理系统来提高学习效率,增强考......
  • 基于SSM技术的心理学网站系统设计与实现-计算机毕业设计源码+LW文档
    一、选题背景及意义随着信息化社会的形成和微电子技术日新月异的发展,为开发心理学网站系统提供了良好的环境。在当前的技术和社会需求环境下,心理学网站系统将具有长足的发展。特别是在互联网高度发展和应用的今天,随着心理关爱需求的越来越多,也正因此,心理学网站系统应运而生,可以为......
  • 宠物商城管理系统的设计与实现-计算机毕业设计源码+LW文档
    一、选题背景及意义随着时代的进步,年轻人伴随着压力在外奋斗,家里老人也就没有了儿女的陪伴。宠物扮演了一个重要的角色。当你奋斗一天下班回到家,推开门看到你的宠物蹲在门口只为第一时间迎接你回来带给自己的治愈,在小动物这,你可以百分百相信它不会背叛你,不会因为你做错什么事情而......
  • 【PACS源码】认识PACS的架构和工作流程
     (一)PACS系统的组成及架构   PACS系统的基本组成部分包括:数字影像采集、通讯和网络、医学影像存储、医学影像管理、各类工作站五个部分。   而目前PACS系统的软件架构选型上看,主要有C/S和B/S两种形式。   C/S架构,即Client/Server(客户机/服务器)架构,将运算......
  • Java智慧工地系统源码(微服务框架+Java+MySQL)
    智慧工地系统是依托物联网、互联网、AI、可视化建立的大数据管理平台,是一种全新的管理模式,能够实现劳务管理、安全施工、绿色施工的智能化和互联网化。围绕施工现场管理的人、机、料、法、环五大维度,以及施工过程管理的进度、质量、安全三大体系为基础应用,实现全面高效的工程管理需......
  • 基于javaweb江西红色旅游网站的设计与实现-计算机毕业设计源码+LW文档
    开发语言:Java框架:ssmJDK版本:JDK1.8服务器:tomcat7数据库:mysql5.7(一定要5.7版本)数据库工具:Navicat11开发软件:eclipse/myeclipse/idea[请下载Eclipse运行]Maven包:Maven3.3.9浏览器:谷歌浏览器数据库--TablestructureforchatDROPTABLEIFEXISTSchat;CREATETABLEchat......