ITPub博客

首页 > 数据库 > Oracle > 都是标量子查询惹的祸

都是标量子查询惹的祸

Oracle 作者:Appleses 时间:2016-01-30 17:05:14 0 删除 编辑


都是标量子查询惹的祸


系统又报了一个跑的慢的sql语句,看图就知道这个很恐怖的,已经跑了1天了,还需要跑6个月的时间,

 

把sql语句拿出来瞅瞅:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未结清余额,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 余额,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 额度区间,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企业主'

else '非小企业主' end 小企业主标识

from riskpubstrategy.lwt_ambs_cc_201406 A

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企业主'

else '非小企业主' end;

 

内存中的执行计划:SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('a50xqp45uy256',0,'advanced'));

简单分析下,查询的只有一个表,这就很奇怪了,不管这个表多么大,也不可能跑这么久的吧,但是有一个标量子查询,根据经验,如果标量子查询的表数据量很多,或者关联的列没有索引的话,标量子查询的性能将会是非常差的,我们由执行计划也可以看出标量子查询的表RISKPUBSC.MICRO_BUSI_DATABASE的关联列没有索引,至此,我们猜测可能是由于这个标量子查询引起sql语句性能低下,猜测归猜测,那我们实验一下呢?

 

先去掉标量子查询,然后执行一下,如下:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未结清余额,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 余额,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 额度区间/*,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企业主'

else '非小企业主' end 小企业主标识*/

from riskpubstrategy.lwt_ambs_cc_201406 A

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end/*,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企业主'

else '非小企业主' end*/

 

执行一下: ,发现这个sql很快的,大约274秒,也就是5分钟左右,好吧,看来真是标量子查询惹的祸,,,,,,那我们首先建立索引看看性能如何呢?

create index ind_MICRO_BUSI_DATABASE_acct on RISKPUBSC.MICRO_BUSI_DATABASE(acct) NOLOGGING parallel 8;

alter index ind_MICRO_BUSI_DATABASE_acct NOPARALLEL;

 

然后再重新执行一下之前的语句:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未结清余额,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 余额,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 额度区间,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企业主'

else '非小企业主' end 小企业主标识

from riskpubstrategy.lwt_ambs_cc_201406 A

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企业主'

else '非小企业主' end;

加了索引后其执行计划:

 

发现性能比较好,,大约就8分钟搞定,,,,,,至此其实这个sql加个索引就算优化好了,但是根据经验,如果标量子查询的表的数据量超大(一般认为超过100W,不是绝对的)的时候,而且关联的列又不是主键的时候,标量子查询的性能其实不是怎么好,,,,,,好吧,那我们来看看他们的数据量如何呢?????

 

首先看看数据量,

SELECT v.TABLE_NAME,

v.BLOCKS,

v.table_size2,

v.NUM_ROWS,

v.LAST_ANALYZED

FROM vw_table_lhr v

WHERE v.TABLE_NAME IN ('LWT_AMBS_CC_201406',

'MICRO_BUSI_DATABASE');

发现标量子查询的表是有点大,300W的数据量,如果是标量子查询的话,大约要对标量子查询的表(MICRO_BUSI_DATABASE)扫描1988W次,相当于NL连接了,所以这个性能不怎么好,那么就修改为左外连接呗,修改后的sql:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未结清余额,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 余额,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 额度区间,

CASE WHEN b.ACCT is not null THEN '小企业主'

else '非小企业主' end 小企业主标识

from riskpubstrategy.lwt_ambs_cc_201406 A

LEFT outer join RISKPUBSC.MICRO_BUSI_DATABASE B

on (a.ACCT=b.ACCT)

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end,

CASE WHEN b.ACCT is not null THEN '小企业主'

else '非小企业主' end

 

执行计划:

 

 

时间呢?????尼玛,,,,,,,这么快,,,,,由6个月转换为44秒,,,,,,又是由自行车的速度到飞船的速度呀,,,,,,飞跃吧,,,,,,,,

这里细心的人会发现,我去掉了b.STATUS_CURR is not null 这个条件,因为我查询过这个列本身就没有为空的数据,,,,,这个又是个小小的优化哟,,,,,

 

 

最后简单总结下:

标量子查询的使用是有条件的:

①标量子查询的涉及的表的关联列必须要有索引或者关联列是主键列

②标量子查询的涉及的表的数据量不能太大,数据量太大即使关联列有索引依然很慢的

③对同一个表不能有多个标量子查询,这样性能也不会很高

 

第三个注意事项是神马意思???????简单举个例子:

Select (select a from taba t1 where t1.id=t.id) a,

(select b from taba t1 where t1.id=t.id) b,

(select c from taba t1 where t1.id=t.id) c

……….

From tabb ;

 

这种情况大家还是修改为左外连接比较快。。。。。。如果taba稍微大点的话就会导致到大表的多次扫描,这个严重影响性能。。。。。。。。。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/28628435/viewspace-1984512/,如需转载,请注明出处,否则将追究法律责任。

下一篇: 没有了~
请登录后发表评论 登录
全部评论

注册时间:2016-01-30

  • 博文量
    120
  • 访问量
    27250