首页 > 数据库 >循序渐进丨MogDB 数据库查询重写规则lazyagg详解

循序渐进丨MogDB 数据库查询重写规则lazyagg详解

时间:2024-10-15 16:50:25浏览次数:3  
标签:rows .. object MogDB t1 cost lazyagg 重写 id

问题概述

在 MogDB 和 openGauss 中,参数rewrite_rule用于控制查询重写,本文介绍查询重写规则lazyagg。

在未设置rewrite_rule=lazyagg的情况下,子查询中有GROUP BY会先进行GROUP BY;

lazyagg表示延迟聚合运算,目的是消除子查询中的聚合运算,先关联再GROUP BY;

当子查询中有GROUP BY,子查询中的表很大,子查询与外面的表(比较小/过滤完之后数据量少)进行关联之后还有GROUP BY,这个时候就可以开启lazyagg特性,加快SQL性能。

参考示例

现在有如下例子:

orcl=> explain analyze select /*+ set(rewrite_rule none) */ t1.object_id, sum(total)
orcl->   from test02 t1,
orcl->        (select object_id, sum(data_object_id) as total from test01 group by object_id) t2
orcl->  where t1.object_id = t2.object_id and t1.owner='SCOTT'
orcl->  group by t1.object_id;
                                                                    QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1452817.48..1452817.80 rows=32 width=70) (actual time=19813.801..19813.804 rows=36 loops=1)
   Group By Key: t1.object_id
   ->  Hash Join  (cost=1450644.14..1452748.21 rows=13854 width=38) (actual time=19786.470..19813.740 rows=36 loops=1)
         Hash Cond: (test01.object_id = t1.object_id)
         ->  HashAggregate  (cost=1450525.16..1451391.03 rows=86587 width=44) (actual time=19785.539..19802.382 rows=86987 loops=1)
               Group By Key: test01.object_id
               ->  Seq Scan on test01  (cost=0.00..1227838.44 rows=44537344 width=12) (actual time=0.006..5913.694 rows=44537344 loops=1)
         ->  Hash  (cost=118.58..118.58 rows=32 width=6) (actual time=0.140..0.140 rows=36 loops=1)
                Buckets: 32768  Batches: 1  Memory Usage: 258kB
               ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=6) (actual time=0.036..0.128 rows=36 loops=1)
                     Index Cond: ((owner)::text = 'SCOTT'::text)
 Total runtime: 19814.139 ms
(12 rows)

/*+ set(rewrite_rule none) */表示禁止所有查询重写规则。从执行计划中看到,子查询先进行了GROUP BY,再与test02进行关联,整个SQL GROUP BY了2次。

现在设置rewrite_rule=lazyagg,我们再来看一下执行计划:

orcl=> explain analyze select /*+ set(rewrite_rule lazyagg) */ t1.object_id, sum(total)
orcl->   from test02 t1,
orcl->        (select object_id, sum(data_object_id) as total from test01 group by object_id) t2
orcl->  where t1.object_id = t2.object_id and t1.owner='SCOTT'
orcl->  group by t1.object_id;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=64868.16..64868.48 rows=32 width=44) (actual time=45.018..45.023 rows=36 loops=1)
   Group By Key: t1.object_id
   ->  Nested Loop  (cost=12.23..64785.86 rows=16460 width=12) (actual time=0.150..38.695 rows=18432 loops=1)
         ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=6) (actual time=0.017..0.060 rows=36 loops=1)
               Index Cond: ((owner)::text = 'SCOTT'::text)
         ->  Bitmap Heap Scan on test01  (cost=12.23..2015.71 rows=514 width=12) (actual time=3.912..36.585 rows=18432 loops=36)
               Recheck Cond: (object_id = t1.object_id)
               Heap Blocks: exact=18432
               ->  Bitmap Index Scan on idx_test01_objectid  (cost=0.00..12.11 rows=514 width=0) (actual time=2.304..2.304 rows=18432 loops=36)
                     Index Cond: (object_id = t1.object_id)
 Total runtime: 45.229 ms
(11 rows)

从执行计划中看到,子查询中的聚合运算被消除了,子查询中的表test01与test02先做了关联,关联之后再进行GROUP BY,整个SQL只做了1次GROUP BY。

注意事项

想要lazyagg查询改写规则生效,必须满足两个条件:

  1. 子查询中有GROUP BY

  2. 子查询与外面的表关联之后还有GROUP BY

如果子查询与外面的表关联之后没有GROUP BY,lazyagg查询改写规则不会生效,这个时候请使用谓词推入。

比如下面SQL lazyagg就不会生效,因为子查询与外面的表关联之后没有GROUP BY:

orcl=> explain analyze select /*+ set(rewrite_rule lazyagg) */ t1.object_id, total
  from test02 t1,
       (select object_id, sum(data_object_id) as total from test01 group by object_id) t2
 where t1.object_id = t2.object_id and t1.owner='SCOTT';
                                                                 QUERY PLAN                                                                 
--------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1450642.43..1452765.96 rows=13982 width=38) (actual time=19008.136..19038.606 rows=36 loops=1)
   Hash Cond: (test01.object_id = t1.object_id)
   ->  HashAggregate  (cost=1450523.45..1451397.33 rows=87388 width=44) (actual time=19007.086..19026.905 rows=86987 loops=1)
         Group By Key: test01.object_id
         ->  Seq Scan on test01  (cost=0.00..1227837.30 rows=44537230 width=12) (actual time=0.004..5699.204 rows=44537344 loops=1)
   ->  Hash  (cost=118.58..118.58 rows=32 width=6) (actual time=0.123..0.123 rows=36 loops=1)
          Buckets: 32768  Batches: 1  Memory Usage: 258kB
         ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=6) (actual time=0.048..0.110 rows=36 loops=1)
               Index Cond: ((owner)::text = 'SCOTT'::text)
 Total runtime: 19038.900 ms
(10 rows)

设置rewrite_rule=predpushnormal:​​​​​​​

orcl=> explain analyze select /*+ set(rewrite_rule predpushnormal) */ t1.object_id, total
  from test02 t1,
       (select object_id, sum(data_object_id) as total from test01 group by object_id) t2
 where t1.object_id = t2.object_id and t1.owner='SCOTT';
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=12.20..64826.62 rows=2 width=38) (actual time=1.542..36.819 rows=36 loops=1)
   ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=6) (actual time=0.028..0.094 rows=36 loops=1)
         Index Cond: ((owner)::text = 'SCOTT'::text)
   ->  GroupAggregate  (cost=12.20..2022.09 rows=2 width=44) (actual time=36.683..36.683 rows=36 loops=36)
         Group By Key: test01.object_id
         ->  Bitmap Heap Scan on test01  (cost=12.20..2019.52 rows=510 width=12) (actual time=4.016..31.464 rows=18432 loops=36)
               Recheck Cond: (t1.object_id = object_id)
               Heap Blocks: exact=18432
               ->  Bitmap Index Scan on idx_test01_objectid  (cost=0.00..12.07 rows=510 width=0) (actual time=2.369..2.369 rows=18432 loops=36)
                     Index Cond: (t1.object_id = object_id)
 Total runtime: 37.015 ms
(11 rows)

子查询中有union all可以生效:​​​​​​​

orcl=> explain analyze select /*+ set(rewrite_rule lazyagg) */ t1.owner, sum(total)
orcl->  from test02 t1,
orcl->       (select object_id, sum(data_object_id) as total from test01 group by object_id
orcl(>        union all
orcl(>        select object_id, sum(data_object_id) as total from test03 group by object_id
orcl(>       ) t2
orcl-> where t1.object_id = t2.object_id and t1.owner='SCOTT'
orcl-> group by t1.owner; 
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=12.20..100248.99 rows=1 width=43) (actual time=41.682..41.682 rows=1 loops=1)
   Group By Key: t1.owner
   ->  Nested Loop  (cost=12.20..64549.61 rows=7139875 width=11) (actual time=0.156..37.395 rows=18468 loops=1)
         ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=11) (actual time=0.019..0.062 rows=36 loops=1)
               Index Cond: ((owner)::text = 'SCOTT'::text)
         ->  Append  (cost=12.20..2008.36 rows=511 width=12) (actual time=4.092..35.783 rows=18468 loops=36)
               ->  Bitmap Heap Scan on test01  (cost=12.20..2000.09 rows=510 width=12) (actual time=4.022..34.234 rows=18432 loops=36)
                     Recheck Cond: (object_id = t1.object_id)
                     Heap Blocks: exact=18432
                     ->  Bitmap Index Scan on idx_test01_objectid  (cost=0.00..12.07 rows=510 width=0) (actual time=2.411..2.411 rows=18432 loops=36)
                           Index Cond: (object_id = t1.object_id)
               ->  Index Scan using idx_test03_objectid on test03  (cost=0.00..8.27 rows=1 width=12) (actual time=0.162..0.169 rows=36 loops=36)
                     Index Cond: (object_id = t1.object_id)
 Total runtime: 41.905 ms
(14 rows)

子查询中有union无法生效:​​​​​​​

orcl=> explain analyze select /*+ set(rewrite_rule lazyagg) */ t1.owner, sum(total)
orcl->  from test02 t1,
orcl->       (select object_id, sum(data_object_id) as total from test01 group by object_id
orcl(>        union 
orcl(>        select object_id, sum(data_object_id) as total from test03 group by object_id
orcl(>       ) t2
orcl-> where t1.object_id = t2.object_id and t1.owner='SCOTT'
orcl-> group by t1.owner; 
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1457835.61..1462212.44 rows=1 width=69) (actual time=19242.592..19242.592 rows=1 loops=1)
   Group By Key: t1.owner
   ->  Hash Join  (cost=1457835.61..1462072.93 rows=27900 width=37) (actual time=19219.921..19242.559 rows=67 loops=1)
         Hash Cond: (test01.object_id = t1.object_id)
         ->  HashAggregate  (cost=1457716.63..1459460.38 rows=174375 width=44) (actual time=19218.659..19231.297 rows=96155 loops=1)
               Group By Key: test01.object_id, (sum(test01.data_object_id))
               ->  Append  (cost=1450523.45..1456844.75 rows=174375 width=44) (actual time=19077.681..19160.701 rows=173974 loops=1)
                     ->  HashAggregate  (cost=1450523.45..1451397.33 rows=87388 width=44) (actual time=19077.681..19095.367 rows=86987 loops=1)
                           Group By Key: test01.object_id
                           ->  Seq Scan on test01  (cost=0.00..1227837.30 rows=44537230 width=12) (actual time=0.006..5869.211 rows=44537344 loops=1)
                     ->  HashAggregate  (cost=2833.81..3703.68 rows=86987 width=44) (actual time=35.819..52.104 rows=86987 loops=1)
                           Group By Key: test03.object_id
                           ->  Seq Scan on test03  (cost=0.00..2398.87 rows=86987 width=12) (actual time=0.011..11.127 rows=86987 loops=1)
         ->  Hash  (cost=118.58..118.58 rows=32 width=11) (actual time=0.075..0.075 rows=36 loops=1)
                Buckets: 32768  Batches: 1  Memory Usage: 258kB
               ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=11) (actual time=0.019..0.066 rows=36 loops=1)
                     Index Cond: ((owner)::text = 'SCOTT'::text)
 Total runtime: 19242.893 ms
(18 rows)

这个时候还是用谓词推入:​​​​​​​

orcl=> explain analyze select /*+ set(rewrite_rule predpushnormal) */ t1.owner, sum(total)
orcl->  from test02 t1,
orcl->       (select object_id, sum(data_object_id) as total from test01 group by object_id
orcl(>        union 
orcl(>        select object_id, sum(data_object_id) as total from test03 group by object_id
orcl(>       ) t2
orcl-> where t1.object_id = t2.object_id and t1.owner='SCOTT'
orcl-> group by t1.owner; 
                                                                            QUERY PLAN                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=2030.41..65094.73 rows=1 width=69) (actual time=38.381..38.381 rows=1 loops=1)
   Group By Key: t1.owner
   ->  Nested Loop  (cost=2030.41..65094.71 rows=3 width=37) (actual time=1.710..38.356 rows=67 loops=1)
         ->  Index Scan using idx_test02_owner on test02 t1  (cost=0.00..118.58 rows=32 width=11) (actual time=0.021..0.065 rows=36 loops=1)
               Index Cond: ((owner)::text = 'SCOTT'::text)
         ->  HashAggregate  (cost=2030.41..2030.44 rows=3 width=44) (actual time=38.200..38.205 rows=67 loops=36)
               Group By Key: test01.object_id, (sum(test01.data_object_id))
               ->  Append  (cost=12.20..2030.40 rows=3 width=44) (actual time=37.877..38.108 rows=72 loops=36)
                     ->  GroupAggregate  (cost=12.20..2022.09 rows=2 width=44) (actual time=37.856..37.857 rows=36 loops=36)
                           Group By Key: test01.object_id
                           ->  Bitmap Heap Scan on test01  (cost=12.20..2019.52 rows=510 width=12) (actual time=3.786..32.509 rows=18432 loops=36)
                                 Recheck Cond: (t1.object_id = object_id)
                                 Heap Blocks: exact=18432
                                 ->  Bitmap Index Scan on idx_test01_objectid  (cost=0.00..12.07 rows=510 width=0) (actual time=2.208..2.208 rows=18432 loops=36)
                                       Index Cond: (t1.object_id = object_id)
                     ->  GroupAggregate  (cost=0.00..8.28 rows=1 width=44) (actual time=0.203..0.203 rows=36 loops=36)
                           Group By Key: test03.object_id
                           ->  Index Scan using idx_test03_objectid on test03  (cost=0.00..8.27 rows=1 width=12) (actual time=0.159..0.166 rows=36 loops=36)
                                 Index Cond: (t1.object_id = object_id)
 Total runtime: 38.676 ms
(20 rows)

关于作者

罗炳森,云和恩墨数据库架构师,15年SQL优化&数据库性能优化经验,擅长超大型,超复杂SQL&存储过程优化,国内首位专注于SQL优化实战案例博客作者;5年+ 数据库/ETL/BI/大数据培训经验,累计为中国培养了近1000多名DBA,2000多名ETL/BI/大数据人才;2015年出版《Oracle查询优化改写技巧与案例》,累计销量达6万余册,2018年出版《SQL优化核心思想》,被翻译为繁体字在台湾出版。

标签:rows,..,object,MogDB,t1,cost,lazyagg,重写,id
From: https://blog.csdn.net/weixin_54551388/article/details/142958678

相关文章

  • C++ 面向对象、特征、重载和重写、构造和析构、向上和向下转型、深浅拷贝。
    什么是面向对象(Object-OrientedProgramming,OOP)1.面向对象是一种编程范式,它通过将软件系统的设计和开发分解为“对象”(Object)的方式来实现更好地组织代码。面向对象的核心思想是将程序的结构分为对象,这些对象包含数据和操作这些数据的函数(即方法)。每个对象是类的实例,而类定......
  • Java面向对象第四章方法重写与多态练习题
    练习1:使用重写优化电子宠物系统需求说明使用方法重写优化电子宠物系统,实现如下效果 packagecom.hz.ch02;/***猫类*@author26255**/publicclassCatextendsFather{ privateStringsex; publicCat(Stringname,inthealth,intlove,Stringse......
  • 重写之前的测试
    publicclassPlanInformation{privateintid;//日报流水号privateStringplanid;//产品生产批次号(例如:2312-110,有8位字符组成,前四位表示年月,后三位表示序号)privateStringplanname;//产品名称(例如:“表体”)privateStringprocess;//工序名称(例如:“10.00.射蜡”数字表示......
  • 存算分离+双集群容灾丨云和恩墨与华为共同发布 MogDB × OceanStor Dorado 联合解决方
    引言为期三天的第九届华为全联接大会(HUAWEICONNECT2024)于9月19日在上海世博中心&展览馆盛大召开。本次大会以“共赢行业智能化”为主题,邀请思想领袖、商业精英、技术专家、合作伙伴、开发者等业界同仁,从战略、产业、生态等方面探讨如何通过智能化、数字化技术,赋能千行万业,把握新......
  • 7-41 PAT排名汇总 (25 分)(详解 思路 重写sort函数)兄弟们冲呀呀呀呀呀呀呀
    二:思路===================================================================利用结构体存储学号成绩总排名考点号本考点的排名;然后在输入的时候要记得将j设置成一个变量,这样就能挨个把所有的值都输入到结构体数组当中。再然后就是,向结构体里的变量进行赋值。三:上码......
  • Day22笔记-多态&函数重写&运算符重载&对象的内置内容
    一、多态多态的前提:继承体现1:同一种事物的多种体现形式,如:动物有很多种体现2:在定义的过程无法确定变量的类型,只有当程序正常运行的时候才会确定该变量是什么类型,调用哪个函数#体现1:同一种事物的多种体现形式,如:动物有很多种classAnimal():  passclassCat(Animal):......
  • C#实现扫码枪串口通信(利用serialPort控件重写)
     1usingSystem;2usingSystem.Collections.Generic;3usingSystem.ComponentModel;4usingSystem.Data;5usingSystem.Drawing;6usingSystem.IO.Ports;7usingSystem.Linq;8usingSystem.Text;9usingSystem.Threading.Tasks;10usi......
  • YOLOv8改进 | 融合改进 | C2f融合重写星辰网络⭐以及CAA【二次融合 +​ CVPR2024】
      秋招面试专栏推荐 :深度学习算法工程师面试问题总结【百面算法工程师】——点击即可跳转......
  • 4.Java面向对象第四章方法重写与多态
    4.Java面向对象第四章方法重写与多态文章目录4.Java面向对象第四章方法重写与多态一、方法重写二、Object三、类型转换四、抽象一、方法重写1.在子类与父类之间2.方法名相同3.参数个数和类型相同5.访问权限不能严于父类二、Objectobject类是超类(基类)Java中......
  • web群集--nginx实现重定向与重写操作的详细配置过程详与案例展示
    文章目录前言什么是重定向?重定向能做什么?何时需要重定向功能?nginx通过什么来实现重定向和重写操作的?nginx的重定向和重写有什么区别?案例展示重定向1.将所有对将所有对http://test.com的访问重定向到http://www.test.com重写将所有对http://test.com/old-path的访问......