一、问题概述
某业务人员反馈帮领导导数据的SQL,执行了20分钟没出结果。同时他疑问为什么加了并行SQL仍然跑的慢?
1.1 SQL文本
SELECT *
FROM (SELECT XFSH, XFMC, GFSH, GHFDJXH, kpfs, kpje, ROWNUM AS XH
FROM (SELECT /*+ parallel(64)*/
KP.XFSH,
KP.XFMC,
KP.GFSH,
KP.GHFDJXH,
count(DISTINCT KP.SELF_FP_FPXX_ID) as kpfs,
sum(ABS(TO_NUMBER(HJJE)) + ABS(TO_NUMBER(HJSE))) as KPJE
FROM SSDSJ.SELF_FP_FPXX KP
WHERE KP.GLKPBZ = 'Y'
AND NVL(ZFBZ, 'N') = 'N'
AND FPKJZT NOT IN ('0', '2')
AND SFHZFP = '0'
AND KP.KPRQ >= DATE '2020-01-01'
AND KP.KPRQ < DATE '2022-11-01'
AND NOT EXISTS
(SELECT 1
FROM SSDSJ.SELF_FP_FPXX HZ
WHERE HZ.BLUE_FPDM = KP.FP_DM
AND HZ.BLUE_FPHM = KP.FP_HM
AND NVL(HZ.ZFBZ, 'N') = 'N'
AND HZ.FPKJZT NOT IN ('0', '2'))
group by KP.XFSH, KP.XFMC, KP.GFSH, KP.GHFDJXH
ORDER BY 4 DESC, 3 DESC) S) V
where V.XH <= 100
;
1.2 执行计划
二、问题原因
结合SQL文本及执行计划,业务人员为了提高SQL执行效率,加了64的并发,但仍然效率低下,原因是什么?我们知道并行只适用于多块读,而INDEX FULL SCAN和INDEX RANGE SCAN均为单块读,这就是并行未启作用的原因。
接下来观察执行计划,ID=7和9的过滤列,明显很差,但ROWS估算值却不大,如果ROWS估算正确,该SQL也不至于2小时出不了结果,所以这里不适合走索引+回表。
三、解决方案
经过以上分析,该SQL适合两个关联表走全表扫描+hash join,这样再加上适量的并行度,可以有效的提升SQL性能。
3.1 添加HINT
SQL> SELECT *
2 FROM (SELECT XFSH, XFMC, GFSH, GHFDJXH, kpfs, kpje, ROWNUM AS XH
3 FROM (SELECT /*+ parallel(8) full(KP)*/
4 KP.XFSH,
5 KP.XFMC,
6 KP.GFSH,
7 KP.GHFDJXH,
8 count(DISTINCT KP.SELF_FP_FPXX_ID) as kpfs,
9 sum(ABS(TO_NUMBER(HJJE)) + ABS(TO_NUMBER(HJSE))) as KPJE
10 FROM SSDSJ.SELF_FP_FPXX KP
11 WHERE KP.GLKPBZ = 'Y'
12 AND NVL(ZFBZ, 'N') = 'N'
13 AND FPKJZT NOT IN ('0', '2')
14 AND SFHZFP = '0'
15 AND KP.KPRQ >= DATE '2020-01-01'
16 AND KP.KPRQ < DATE '2022-11-01'
17 AND NOT EXISTS
18 (SELECT /*+ full(HZ) hj_aj*/ 1
19 FROM SSDSJ.SELF_FP_FPXX HZ
20 WHERE HZ.BLUE_FPDM = KP.FP_DM
21 AND HZ.BLUE_FPHM = KP.FP_HM
22 AND NVL(HZ.ZFBZ, 'N') = 'N'
23 AND HZ.FPKJZT NOT IN ('0', '2'))
24 group by KP.XFSH, KP.XFMC, KP.GFSH, KP.GHFDJXH
25 ORDER BY 4 DESC, 3 DESC) S) V
26 where V.XH <= 100;
3.2 优化后执行计划
四、性能对比
经过以上优化措施,大大提升了SQL效率,对比如下:
逻辑读 | 物理读 | 执行时间(秒) | |
优化前 | 15278511 | 12659,022 | 1200+ |
优化后 | 3946974 | 3879140 | 34 |