ITPub博客

首页 > IT职业 > IT生活 > 使用exists、not eixsts不能用到semi-join、anti-join的一些情况讨论

使用exists、not eixsts不能用到semi-join、anti-join的一些情况讨论

原创 IT生活 作者:yaanzy 时间:2005-10-18 13:18:15 0 删除 编辑

在Oralce中我们经常会使用exists、not exists来进行一些表连接操作,在普通情况下Oracle的执行计划如果使用到semi-join、anti-join方式来进行表连接的话,速度一遍很快,否则很慢。通过使用提示hash_sj hash_aj也可使执行计划走semi-join、anti-join。

但是在有些情况下,即使增加了提示,或者对表进行了分析后,执行计划也不走semi-join、anti-join方式,下面就是对这些情况的讨论。

附注:本文中的例子均是在Oracle9.2.0.6 for linux中测试的。

[@more@]

1、在where条件中多个exists、not exists是or的关系

SQL> create table a1 as select * from dba_objects where rownum <= 10000;
SQL> create table a2 as
select * from (select t.*,rownum rd from dba_objects t) p where rd>5000 and rd<= 15000;
SQL> create table a3 as
select * from (select t.*,rownum rd from dba_objects t) p where rd>7500 and rd<= 17500;

SQL> analyze table a1 compute statistics;
SQL> analyze table a2 compute statistics;
SQL> analyze table a3 compute statistics;

SQL> set autotrace on exp
SQL> set timing on

----- exists or exists 情况

SQL> select a1.owner,count(*)
2 from a1
3 where exists
4 (select /*+ hash_sj(a1 a2) */* from a2
5 where a2.object_id=a1.object_id)
6 or
7 exists
8 (select /*+ hash_sj(a1 a2) */* from a3
9 where a3.object_id=a1.object_id)
10 group by a1.owner;

OWNER COUNT(*)
------------------------------ ----------
SYS 5000

已用时间: 00: 00: 26.66

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=18 Card=1 Bytes=7)
1 0 SORT (GROUP BY) (Cost=18 Card=1 Bytes=7)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=975 Bytes=6825)
4 2 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=1 Bytes=4)
5 2 TABLE ACCESS (FULL) OF 'A3' (Cost=17 Card=1 Bytes=4)


执行计划中显然没有用到semi-jobin,这时候只有改写SQL,通过union all的方式来达到使用到semi-join的目的。

SQL> select x.owner,count(*)
2 from (select * from a1
3 where exists
4 (select /*+ hash_sj(a1 a2)*/* from a2
5 where a2.object_id= a1.object_id )
6 union all
7 select * from a1
8 where not exists
9 (select /*+ hash_aj(a1 p)*/* from a2 p
10 where p.object_id=a1.object_id)
11 and exists
12 (select /*+ hash_sj(a1 a3)*/* from a3
13 where a3.object_id=a1.object_id)
14 ) x
15 group by x.owner;

OWNER COUNT(*)
------------------------------ ----------
SYS 5000

已用时间: 00: 00: 00.31

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=90 Card=10001 Bytes=
170017)

1 0 SORT (GROUP BY) (Cost=90 Card=10001 Bytes=170017)
2 1 VIEW (Cost=88 Card=10001 Bytes=170017)
3 2 UNION-ALL
4 3 HASH JOIN (SEMI) (Cost=35 Card=10000 Bytes=200000)
5 4 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=70000)

6 4 VIEW OF 'VW_SQ_2' (Cost=17 Card=10000 Bytes=130000)

7 6 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000Bytes=40000)

8 3 HASH JOIN (ANTI) (Cost=53 Card=1 Bytes=24)
9 8 HASH JOIN (SEMI) (Cost=35 Card=10000 Bytes=200000)
10 9 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=70000)

11 9 VIEW OF 'VW_SQ_1' (Cost=17 Card=10000 Bytes=130000)

12 11 TABLE ACCESS (FULL) OF 'A3' (Cost=17 Card=10000 Bytes=40000)

13 8 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)

----- not exists or not exists 情况

SQL> select a1.owner,count(*)
2 from a1
3 where not exists
4 (select /*+ hash_sj(a1 a2) */* from a2
5 where a2.object_id=a1.object_id)
6 or
7 not exists
8 (select /*+ hash_sj(a1 a2) */* from a3
9 where a3.object_id=a1.object_id)
10 group by a1.owner;

OWNER COUNT(*)
------------------------------ ----------
SYS 7500

已用时间: 00: 00: 20.72

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=18 Card=1 Bytes=7)
1 0 SORT (GROUP BY) (Cost=18 Card=1 Bytes=7)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=975 Bytes=6825)
4 2 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=1 Bytes=4)
5 2 TABLE ACCESS (FULL) OF 'A3' (Cost=17 Card=1 Bytes=4)

SQL> select x.owner,count(*)
2 from (select * from a1
3 where not exists
4 (select /*+ hash_sj(a1 a2)*/* from a2
5 where a2.object_id= a1.object_id )
6 union all
7 select * from a1
8 where exists
9 (select /*+ hash_aj(a1 p)*/* from a2 p
10 where p.object_id=a1.object_id)
11 and not exists
12 (select /*+ hash_sj(a1 a3)*/* from a3
13 where a3.object_id=a1.object_id)
14 ) x
15 group by x.owner;

OWNER COUNT(*)
------------------------------ ----------
SYS 7500

已用时间: 00: 00: 00.31

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=89 Card=2 Bytes=34)
1 0 SORT (GROUP BY) (Cost=89 Card=2 Bytes=34)
2 1 VIEW (Cost=88 Card=2 Bytes=34)
3 2 UNION-ALL
4 3 HASH JOIN (ANTI) (Cost=35 Card=1 Bytes=11)
5 4 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=70000)

6 4 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)

7 3 HASH JOIN (SEMI) (Cost=53 Card=1 Bytes=24)
8 7 HASH JOIN (ANTI) (Cost=35 Card=1 Bytes=11)
9 8 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=70000)

10 8 TABLE ACCESS (FULL) OF 'A3' (Cost=17 Card=10000 Bytes=40000)

11 7 VIEW OF 'VW_SQ_1' (Cost=17 Card=10000 Bytes=130000)

12 11 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)


----- not exists or exists 情况

SQL> select a1.owner,count(*)
2 from a1
3 where not exists
4 (select /*+ hash_sj(a1 a2) */* from a2
5 where a2.object_id=a1.object_id)
6 or exists
7 (select /*+ hash_sj(a1 a3) */* from a3
8 where a3.object_id=a1.object_id)
9 group by a1.owner;

OWNER COUNT(*)
------------------------------ ----------
SYS 7500

已用时间: 00: 00: 20.57

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=18 Card=1 Bytes=7)
1 0 SORT (GROUP BY) (Cost=18 Card=1 Bytes=7)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=975 Bytes=6825)
4 2 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=1 Bytes=4)
5 2 TABLE ACCESS (FULL) OF 'A3' (Cost=17 Card=1 Bytes=4)

SQL> select x.owner,count(*)
2 from (select * from a1
3 where not exists
4 (select /*+ hash_sj(a1 a2)*/* from a2
5 where a2.object_id= a1.object_id )
6 union all
7 select * from a1
8 where exists
9 (select /*+ hash_aj(a1 p)*/* from a2 p
10 where p.object_id=a1.object_id)
11 and exists
12 (select /*+ hash_sj(a1 a3)*/* from a3
13 where a3.object_id=a1.object_id)
14 ) x
15 group by x.owner;

OWNER COUNT(*)
------------------------------ ----------
SYS 7500

已用时间: 00: 00: 00.32

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=90 Card=10001 Bytes=
170017)

1 0 SORT (GROUP BY) (Cost=90 Card=10001 Bytes=170017)
2 1 VIEW (Cost=88 Card=10001 Bytes=170017)
3 2 UNION-ALL
4 3 HASH JOIN (ANTI) (Cost=35 Card=1 Bytes=11)
5 4 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=70000)

6 4 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)

7 3 HASH JOIN (SEMI) (Cost=53 Card=10000 Bytes=330000)
8 7 HASH JOIN (SEMI) (Cost=35 Card=10000 Bytes=200000)
9 8 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=70000)

10 8 VIEW OF 'VW_SQ_2' (Cost=17 Card=10000 Bytes=130000)

11 10 TABLE ACCESS (FULL) OF 'A3' (Cost=17 Card=10000 Bytes=40000)

12 7 VIEW OF 'VW_SQ_1' (Cost=17 Card=10000 Bytes=130000)

13 12 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)

从上面的例子可以看出速度的提高是非常明显的,当然对于or的情况越多,修改后的SQL也就越复杂,不过相对于效率
的提高来说还是很值得的。

2、如果子句中含有group by这种排序操作,使用in来代替exists操作会大大提高效率,示例如下:

修改一下示例表:

SQL> insert into a2 select * from a2;
SQL> commit;

<1> 使用exists操作,无论添加什么提示都不能使其走semi-join,耗费大量时间

SQL> select count(*) from a1 a
2 where exists
3 ( select /*+ hash_sj(a b) */*
4 from ( select object_id from a2
5 group by object_id having count(*) > 1
6 ) b
7 where a.object_id=b.object_id
8 );

COUNT(*)
----------
5000

已用时间: 00: 00: 46.22

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=500 Bytes=2000)
4 2 FILTER
5 4 SORT (GROUP BY NOSORT) (Cost=17 Card=1 Bytes=4)
6 5 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=1 Bytes=4)

<2> 将exists改为in句型,就可以用到semi-join,效率大大提高

SQL> select count(*) from a1 a where object_id in
2 ( select object_id from a2 b
3 group by object_id having count(*) > 1
4 );

COUNT(*)
----------
5000

已用时间: 00: 00: 00.90

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=37 Card=1 Bytes=17)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (SEMI) (Cost=37 Card=500 Bytes=8500)
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=40000)
4 2 VIEW OF 'VW_NSO_1' (Cost=19 Card=500 Bytes=6500)
5 4 FILTER
6 5 SORT (GROUP BY) (Cost=19 Card=500 Bytes=2000)
7 6 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)

对于not exists情况也是同样的,需要将语句改写为not in,但是要加上提示以及关联的列要有非空的限制才能使其走anti-join的执行计划。

SQL> select count(*) from a1 a
2 where not exists
3 ( select /*+ hash_aj(a b)*/* from
4 ( select object_id from a2
5 group by object_id having count(*) > 1
6 ) b
7 where a.object_id=b.object_id
8 );

COUNT(*)
----------
5000

已用时间: 00: 00: 46.47

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=500 Bytes=2000)

4 2 FILTER
5 4 SORT (GROUP BY NOSORT) (Cost=17 Card=1 Bytes=4)
6 5 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=1 Bytes=4)

SQL> select count(*) from a1 a
2 where object_id is not null
3 and object_id not in
4 ( select object_id from a2 b
5 where object_id is not null
6 group by object_id having count(*) > 1
7 );

COUNT(*)
----------
5000

已用时间: 00: 00: 00.75

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=37 Card=1 Bytes=17)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (ANTI) (Cost=37 Card=9500 Bytes=161500)
3 2 TABLE ACCESS (FULL) OF 'A1' (Cost=17 Card=10000 Bytes=40000)
4 2 VIEW OF 'VW_NSO_1' (Cost=19 Card=500 Bytes=6500)
5 4 FILTER
6 5 SORT (GROUP BY) (Cost=19 Card=500 Bytes=2000)
7 6 TABLE ACCESS (FULL) OF 'A2' (Cost=17 Card=10000 Bytes=40000)

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/3898/viewspace-808146/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
  • 博文量
    108
  • 访问量
    766811