首页 > 数据库 >lightdb/postgresql中的MemoryContext out of memory原因分析及解决思路

lightdb/postgresql中的MemoryContext out of memory原因分析及解决思路

时间:2022-12-30 14:12:55浏览次数:62  
标签:index MemoryContext postgresql lightdb free used chunks blocks total

内存上下文的设计思路可以参考src/backend/utils/mmgr/README。

https://www.pgcon.org/2019/schedule/attachments/514_introduction-memory-contexts.pdf

http://www.light-pg.com/docs/lightdb/13.3-22.2/lt_cheat_funcs.html

可以通过lt_cheat_funcs(该扩展对运行时无明显的侵入和干扰,可以在生产使用)扩展实时查询,如下:

zjh@postgres=# create extension lt_cheat_funcs ;
CREATE EXTENSION

zjh@postgres=# select * from pg_stat_get_memory_context();
           name           |       parent       | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes 
--------------------------+--------------------+-------+-------------+---------------+------------+-------------+------------
 TopMemoryContext         |                    |     0 |      510640 |            12 |      23720 |          19 |     486920
 dynahash                 | TopMemoryContext   |     1 |        8192 |             1 |       1448 |           0 |       6744

不仅可以查询当前实例的MemoryContext树,还可以在每个语句执行结束前打印内存上下文。

shared_preload_libraries='lt_stat_statements,lt_stat_activity,lt_cheat_funcs,lt_prewarm,lt_cron,ltaudit,lt_hint_plan,lt_show_plans,lt_standby_forward,lt_pathman'
lt_cheat_funcs.log_memory_context=on
2022-09-06 09:32:13.807665T ltsql zjh@postgres [local] client backend SELECT 00000[2022-09-06 09:28:47 CST] 0 [84905] STATEMENT:  select 1;
TopMemoryContext: 494256 total in 11 blocks; 9760 free (11 chunks); 484496 used
  dynahash: 8192 total in 1 blocks; 1448 free (0 chunks); 6744 used
  TopTransactionContext: 8192 total in 1 blocks; 7784 free (2 chunks); 408 used
  RowDescriptionContext: 8192 total in 1 blocks; 6888 free (0 chunks); 1304 used
  MessageContext: 32768 total in 3 blocks; 21504 free (6 chunks); 11264 used
  dynahash: 8192 total in 1 blocks; 552 free (0 chunks); 7640 used
  dynahash: 16384 total in 2 blocks; 4592 free (2 chunks); 11792 used
  TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used
  dynahash: 8192 total in 1 blocks; 552 free (0 chunks); 7640 used
  TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
    PortalContext: 1024 total in 1 blocks; 592 free (0 chunks); 432 used
  dynahash: 16384 total in 2 blocks; 3504 free (2 chunks); 12880 used
  CacheMemoryContext: 524288 total in 7 blocks; 104920 free (0 chunks); 419368 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 2048 total in 2 blocks; 496 free (1 chunks); 1552 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 2048 total in 2 blocks; 952 free (2 chunks); 1096 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1128 free (1 chunks); 1944 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1096 free (2 chunks); 1976 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 2048 total in 2 blocks; 952 free (2 chunks); 1096 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1096 free (2 chunks); 1976 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 3072 total in 2 blocks; 1096 free (2 chunks); 1976 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 656 free (2 chunks); 1392 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1128 free (1 chunks); 1944 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 952 free (2 chunks); 1096 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 2048 total in 2 blocks; 952 free (2 chunks); 1096 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 448 free (1 chunks); 1600 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 776 free (1 chunks); 2296 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 656 free (2 chunks); 1392 used
    index info: 2048 total in 2 blocks; 656 free (2 chunks); 1392 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 2048 total in 2 blocks; 656 free (2 chunks); 1392 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 1024 total in 1 blocks; 0 free (0 chunks); 1024 used
    index info: 3072 total in 2 blocks; 1160 free (2 chunks); 1912 used
    index info: 2048 total in 2 blocks; 952 free (2 chunks); 1096 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 2048 total in 2 blocks; 688 free (2 chunks); 1360 used
    index info: 2048 total in 2 blocks; 952 free (2 chunks); 1096 used
    index info: 2048 total in 2 blocks; 656 free (2 chunks); 1392 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
    index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used
  WAL record construction: 49768 total in 2 blocks; 6360 free (0 chunks); 43408 used
  dynahash: 8192 total in 1 blocks; 2616 free (0 chunks); 5576 used
  MdSmgr: 8192 total in 1 blocks; 7760 free (0 chunks); 432 used
  dynahash: 8192 total in 1 blocks; 552 free (0 chunks); 7640 used
  dynahash: 16384 total in 2 blocks; 6656 free (3 chunks); 9728 used
  dynahash: 8192 total in 1 blocks; 552 free (0 chunks); 7640 used
  dynahash: 104120 total in 2 blocks; 2616 free (0 chunks); 101504 used
  ErrorContext: 8192 total in 1 blocks; 7928 free (7 chunks); 264 used

  lt_cheat_funcs还有一个参数lt_cheat_funcs.exit_on_segv,控制segment fault的时候是否触发master进程宕机。因为这种概率发生极小,所以默认可以不用开启。一旦有异常某个参数值导致了越界,可以考虑临时补救。

  lt_cheat_funcs对性能的影响。基于ltbench测试可知,其对性能的影响几乎可以忽略不计。

https://jnidzwetzki.github.io/2022/05/28/postgres-memory-context.html

共享内存的使用姿势  https://pgsql-hackers.postgresql.narkive.com/6VuZjbMh/shared-memory-and-memory-context-question

主要在两个c文件中:src/backend/utils/mmgr/aset.c是实现,src/backend/utils/mmgr/mcxt.c对外接口。如果想要干预或监控memorycontext,可以通过注册回调hook进行。

要想掌握memorycontext,还得理解发生out of memory的情况以及如何解决,如下:

client backend BIND 53200[2022-10-12 11:09:13 CST] 14856195 [402043] ERROR:  out of memory
client backend BIND 53200[2022-10-12 11:09:13 CST] 14856195 [402043] DETAIL:  Failed on request of size 48 in memory context "MessageContext".
client backend BIND 53200[2022-10-12 11:09:13 CST] 14856195 [402043] STATEMENT:  insert into tbfundprftflow4 (ta_client,last_asset
client backend PARSE 53200[2022-10-12 11:09:55 CST] 0 [402230] ERROR:  out of memory
client backend PARSE 53200[2022-10-12 11:09:55 CST] 0 [402230] DETAIL:  Failed on request of size 40 in memory context "CachedPlanQuery".
client backend PARSE 53200[2022-10-12 11:09:55 CST] 0 [402230] STATEMENT:  insert into tbfundprftflow15 (ta_client,last_asset,curr_income_ratio
client backend PARSE 53200[2022-10-12 11:09:13 CST] 0 [402043] ERROR:  out of memory
client backend PARSE 53200[2022-10-12 11:09:13 CST] 0 [402043] DETAIL:  Failed on request of size 65536 in memory context "ErrorContext".

 从源码可以看出,如果allocset中没有空闲chunk或申请的内存特别大(超过AllocSetContext.allocChunkLimit)时,都会调用c malloc申请内存,如下:

    /*
     * Time to create a new regular (multi-chunk) block?
     */
    if (block == NULL)
    {
        Size        required_size;

        /*
         * The first such block has size initBlockSize, and we double the
         * space in each succeeding block, but not more than maxBlockSize.
         */
        blksize = set->nextBlockSize;
        set->nextBlockSize <<= 1;
        if (set->nextBlockSize > set->maxBlockSize)
            set->nextBlockSize = set->maxBlockSize;

        /*
         * If initBlockSize is less than ALLOC_CHUNK_LIMIT, we could need more
         * space... but try to keep it a power of 2.
         */
        required_size = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
        while (blksize < required_size)
            blksize <<= 1;

        /* Try to allocate it */
        block = (AllocBlock) malloc(blksize);

  至于错误日志中的“Failed on request of size 40 in memory context "CachedPlanQuery"”,是调用者打印的。在CachedPlanQuery内存上下文中找不到空闲可满足40字节的chunk,导致在allocsetalloc中新创建max(blksize = set->nextBlockSize,required_size)大的块(它是8KB,16KB,翻倍这么上去放在内存内存池,所以至少是申请8K失败,具体多少不确定,甚至有可能n GB,lightdb 22.4开始将会打印具体的值),此时通过malloc分配。如果malloc失败,就会直接返回NULL,没有打印实际申请多大的内存失败。从memorycontext本身申请是永远不会失败的。所以就要分析malloc失败的原因。

malloc()函数分配内存失败的常见原因:

1. 内存不足。

2. 在前面的程序中出现了内存的越界访问,导致malloc()分配函数所涉及的一些信息被破坏。下次再使用malloc()函数申请内存就会失败,返回空指针NULL(0)。

3. 没有连续内存能够满足申请。http://t.zoukankan.com/huty-p-8518846.html

4. 根据vm.overcommit_ratio和vm.overcommit_memory分析每个进程能够使用的内存大小。可以通过sar -r查看Committed_AS,通过/proc/meminfo查看Committed_AS、CommitLimit。如下:

 

 

https://www.postgresql.org/message-id/4057e37d0fad0814281017dc6c211c00.squirrel@sq.gransy.com

https://www.cnblogs.com/zhjh256/p/15424236.html

标签:index,MemoryContext,postgresql,lightdb,free,used,chunks,blocks,total
From: https://www.cnblogs.com/lightdb/p/15546560.html

相关文章

  • linux离线安装postgresql
    1.软件包说明软件包 说明postgresqllibrariesandclientbinariespostgresql-server coredatabaseserverpostgresql-contrib additionalsuppliedmodulespostgresql-......
  • oracle、mysql&mariadb、postgresql中表名、列名的默认大小写问题
    从https://zhuanlan.zhihu.com/p/378495347、https://blog.csdn.net/qq_43454016/article/details/123231325,可知,mysql表名的大小写敏感取决于lower_case_table_names......
  • lightdb获取建表ddl
    selectdbms_metadata.get_ddl('table','act_hi_identitylink','etfpub');CREATETABLEetfpub.act_hi_identitylink(id_charactervarying(64)NOTNULL,gro......
  • PostgreSQL数据库操作(2)
    一、数据库常规操作1.创建数据库//数据库登录psql-h服务器-U用户名-d数据库-p端口./psql-Upostgres-p5432-h127.0.0.1//创建数据库postgres=#crea......
  • PostgreSQL单机安装与介绍(1)
    一、PostgreSQL简介官网:https://www.postgresql.org/下载地址:https://www.postgresql.org/download/linux/redhat/源码下载:https://www.postgresql.org/ftp/source/v15.......
  • 通知:2023年PostgreSQL考试认证计划
    通知:2023年PostgreSQL考试认证计划PostgreSQL考试认证中心(简称:PGCCC)经中国PostgreSQL考试认证中心(简称:PGCCC)研究决定,现下发2023年度中国PostgreSQL考试认证全年计划,......
  • postgresql批量新增或更新的sql语句(Mapper实现)
    1@Insert({"<script>"+2"<foreachcollection=\"list\"item=\"item\"separator=\";\">"+3"INSERTINTOsc_water_month(\n......
  • 找出PostgreSQL schema变更差异
    管理PostgreSQL的例行操作之一是定期更新数据库系统的架构。PostgreSQL在更新schema、添加type、函数、触发器或修改表添加和删除列、更新列数据类型等方面提供了可靠的方......
  • 关系型数据库学习手记——初见倾心PostgreSQL、MySQL、SQLite、MongoDB
    一、关系型数据库系统理论知识1.1学习笔记​​数据库系统概念读书笔记-引言​​数据库系统概念读书笔记-关系数据库数据库系统概念读书笔记-数据库发展史(上)数据库系统概念......
  • 为什么“去O”唯有PostgreSQL?
     本文根据digoal(德哥)在〖2019DAMS中国数据智能管理峰会〗现场演讲内容整理而成。讲师介绍digoal(德哥),PostgreSQL中国社区发起人之一、常委、兼任社区大学校长。阿里云......