最近有个应用,前端调用后台的一个逻辑很慢,请开发提供了对应逻辑使用的SQL,进行脱敏,示例如下,
select t.AGENT as agent,
nvl(sum(case
when t.operation_type = 'A' then 1 else 0 end),0) as
DflCount,
nvl(sum(case
when t.operation_type = 'B' then 1 else 0 end),0) as
IfCount,
nvl(sum(case
when t.operation_type = 'C' then 1 else 0 end),0) as
AecCount,
nvl(sum(case
when t.operation_type = 'D' then 1 else 0 end),0) as
BsCount
from OP_LOG t
where
t.code = 'AA'
and t.ORI_CODE = 'ABC'
and t.T_DATE BETWEEN to_date('20201209','yyyymmdd') and to_date('20201209','yyyymmdd')
and IS_VALID = 1
and t.operation_type in ('A','B','C','D')
group by t.agent
order by agent;
生产环境这张表的统计信息如下,3000多万的数据,
NUM_ROWS BLOCKS AVG_ROW_LEN
36181236 866883 118
这张表存在一个主键唯一索引,两个非唯一索引,
SQL> create index idx_op_log_01 on op_log(t_date);
Index created.
SQL> create index idx_op_log_03 on op_log(t_no, ori_code, t_code, t_date);
Index created.
SQL> alter table op_log add constraint pk_op_log_id primary key (id);
Table altered.
这三个索引对应的统计信息,
BLEVEL LEAF_BLOCKS DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY CLUSTERING_FACTORY NUM_ROWS
3 277636 152 1793 16031 2436821 30765015
3 270751 1429772 1 23 33314457 34989358
3 177428 38178216 1 1 37102260 38178216
生产环境SQL的执行计划如下所示,可以看到,选择的是idx_op_log_03的索引跳跃扫描,成本值是2585,buffer是115,
------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2585 (100)| | 0 |00:00:00.01 | 115 |
| 1 | SORT GROUP BY | | 1 | 4 | 2585 (1)| 00:00:32 | 0 |00:00:00.01 | 115 |
|* 2 | TABLE ACCESS BY INDEX ROWID| OP_LOG | 1 | 4 | 2583 (0)| 00:00:32 | 0 |00:00:00.01 | 115 |
|* 3 | INDEX SKIP SCAN | IDX_OP_LOG_03 | 1 | 25 | 2560 (0)| 00:00:31 | 0 |00:00:00.01 | 115 |
------------------------------------------------------------------------------------------------------------------------------------------------
2 - filter((INTERNAL_FUNCTION("T"."OPERATION_TYPE") AND "IS_VALID"=1))
3 - access("T"."ORI_CODE"='ABC' AND "T"."T_CODE"='A' AND "T"."T_DATE"=TO_DATE(' 2020-12-09
00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
filter(("T"."T_DATE"=TO_DATE(' 2020-12-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "T"."ORI_CODE"='ABC' AND
"T"."T_CODE"='A'))
这张表是3000万的数据量,单表查询,一个主键索引,一个t_date单键值索引,一个复合索引(t_no,ori_code,t_code,t_date)。原始SQL的条件中包含了idx_op_log_03复合索引除前导列的另外三个字段,常规上可供Oracle优化器选择的执行计划,一个是全表扫描,一个就是idx_op_log_03的索引跳跃扫描,经过计算,Oracle认为索引跳跃扫描的成本值更低,这是他认为当前情况下最优的执行计划。
全表扫描,会多块读3000万数据所有大约86万个数据块,索引跳跃扫描,其实是构建遍历了所有前导列值的索引,伪代码如下,大约27万个数据块,而且还得单块读回表,实际读的数据块会更高,而且t_no的dinstinct,不是很少,因此这两种执行计划,算是半斤八两,
...
where
t_no='1' and ori_code='A' and t_code='ABC' and t_date ...
and
t_no='2' and ori_code='A' and t_code='ABC' and t_date ...
and
...
and
t_no='...' and ori_code='A' and t_code='ABC' and t_date ...
这个查询条件还是很简单的,如果根据当前的条件,创建新的索引,会提高效率么?
如果是DG,通常情况,我们可以开启备库的snapshot,模拟创建索引,测试他的效率,但是现在不让做,只可以在测试系统进行测试了。
测试系统这张表的数据量就几万,如果要贴近生产,增加数据量是一种形式,但是增加到几千万的量级,一个是需要消耗时间,另一个就是消耗空间。其实,Oracle是通过统计信息计算执行计划成本值的,因此我们可以借助改造统计信息,来达到让Oracle认为这是“生产”量级的计算,验证我们的调整。
首先,通过dbms_stats.set_table_stats设置表的统计信息值,
SQL> exec dbms_stats.set_table_stats('BISAL','OP_LOG',numrows=>36181263,numblks=>866883);
PL/SQL procedure successfully completed.
可以看到,当前表的统计信息和生产基本一致了,
SQL> select table_name,num_rows,blocks,avg_row_len from user_tables
2 where table_name='OP_LOG';
TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN
---------------- ---------- -------- -----------
OP_LOG 36181263 866883 129
接着,通过dbms_stats.set_index_stats设置idx_op_log_03的索引统计信息,
SQL> EXEC dbms_stats.set_index_stats('BISAL','IDX_OP_LOG_03',indlevel=>3,numlblks=>270751,numdist=>1429772,avglblk=>1,avgdblk=>23,clstfct=>33314457,numrows=>34989358);
PL/SQL procedure successfully completed.
执行SQL,在测试环境下,其执行计划和生产的相同,都是idx_op_log_03的索引跳跃扫描。根据SQL条件,我们会创建两个索引,索引01是调整了原始索引03的字段顺序,将t_no置在尾部,索引02是删除t_no,增加了agent,
1. t_code, ori_code, t_date, t_no
2. t_code, ori_code, t_date, agent
SQL> create index idx_op_log_01 on op_log(t_code, ori_code, t_date, t_no);
Index created.
SQL> create index idx_op_log_02 on op_log(t_code, ori_code, t_date, agent);
Index created.
执行SQL,他采用的是idx_op_log_02,cost是7,
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 7 (100)| | 0 |00:00:00.01 | 2 | 1 |
| 1 | SORT GROUP BY | | 1 | 111 | 7 (15)| 00:00:01 | 0 |00:00:00.01 | 2 | 1 |
|* 2 | TABLE ACCESS BY INDEX ROWID| OP_LOG | 1 | 1545 | 6 (0)| 00:00:01 | 0 |00:00:00.01 | 2 | 1 |
|* 3 | INDEX RANGE SCAN | IDX_OP_LOG_02 | 1 | 10817 | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 2 | 1 |
---------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter((INTERNAL_FUNCTION("T"."OPERATION_TYPE") AND "IS_VALID"=1))
3 - access("T"."T_CODE"='A' AND "T"."ORI_CODE"='ABC' AND "T"."T_DATE"=TO_DATE(' 2020-12-09 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
为了对比,通过HINT使用idx_op_log_01索引,cost是8,01和02索引相差很小,
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 8 (100)| | 0 |00:00:00.01 | 2 | 1 |
| 1 | SORT GROUP BY | | 1 | 111 | 8 (13)| 00:00:01 | 0 |00:00:00.01 | 2 | 1 |
|* 2 | TABLE ACCESS BY INDEX ROWID| OP_LOG | 1 | 1545 | 7 (0)| 00:00:01 | 0 |00:00:00.01 | 2 | 1 |
|* 3 | INDEX RANGE SCAN | IDX_OP_LOG_01 | 1 | 10817 | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 2 | 1 |
---------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter((INTERNAL_FUNCTION("T".OPERATION_TYPE") AND "IS_VALID"=1))
3 - access("T"."T_CODE"='A' AND "T"."ORI_CODE"='ABC' AND "T"."T_DATE"=TO_DATE(' 2020-12-09 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
因此,单就这个场景,最佳选择是idx_op_log_02,但是保险起见,还是得结合程序中对该表的使用形式,以及数据结构,做下决策,例如选择索引01,除了其成本值,和01相差很小外,还可以删除原始索引03,降低索引个数。还可以考虑是否能将operation_type加入到索引中?
从这个案例,能体会到提升SQL语句的性能,看着好像就是创建索引,但是往细了抠,还是有很多知识点能让我们挖掘的,考虑因素越多,虽然脑壳疼,但是很可能让我们的应用运行起来更顺畅,少些烦恼,这可能就是”磨刀不误砍柴工“,还得持续学习。