首页 > Linux操作系统 > Linux操作系统 > 关联查询时使用树状查询要小心
很多人为了方便经常将几个表的关联查询和树状查询合在一起,这样做很可能产生一些并不需要的记录。
如果在查询的时候两张或两张以上的表进行关联查询,同时需要对其中一张执行CONNECT BY操作,那么建议要不然将CONNECT BY放到查询最里层,要不然将CONNECT BY放到查询最外层,而不要将CONNECT BY与多表连接放在一起。
通过一个例子来说明这个问题吧:
SQL> CREATE TABLE T (ID NUMBER, FID NUMBER, NAME VARCHAR2(30));
表已创建。
SQL> ALTER TABLE T ADD CONSTRAINT PK_T PRIMARY KEY (ID);
表已更改。
SQL> INSERT INTO T VALUES (1, 0, 'TABLE');
已创建 1 行。
SQL> INSERT INTO T VALUES (2, 0, 'INDEX');
已创建 1 行。
SQL> INSERT INTO T SELECT 10000 + ROWNUM, 1, TABLE_NAME FROM DBA_TABLES;
已创建1624行。
SQL> INSERT INTO T SELECT 20000 + ROWNUM, 2, INDEX_NAME FROM DBA_INDEXES;
已创建2401行。
SQL> CREATE INDEX IND_T_FID ON T(FID);
索引已创建。
SQL> CREATE TABLE T1 (ID NUMBER, USERNAME VARCHAR2(30), TID NUMBER);
表已创建。
SQL> ALTER TABLE T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (ID);
表已更改。
SQL> INSERT INTO T1 VALUES (1, 'YANGTK', 1);
已创建 1 行。
SQL> INSERT INTO T1 VALUES (2, 'YANGTK', 10005);
已创建 1 行。
SQL> INSERT INTO T1 VALUES (3, 'YANGTK', 10006);
已创建 1 行。
SQL> INSERT INTO T1 VALUES (4, 'TEST', 1);
已创建 1 行。
SQL> CREATE INDEX IND_T1_NAME ON T1(USERNAME);
索引已创建。
构造一个很简单的树型结构的表,并构造一张关联表,下面执行一个简单的连接查询:
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK';
ID FID NAME USERNAME
---------- ---------- ------------------------------ -------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
如果在这个查询的基础上加上一个树型查询,START WITH FID = 0,看看执行的结果:
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK'
5 START WITH FID = 0
6 CONNECT BY PRIOR T.ID = FID;
ID FID NAME USERNAME
---------- ---------- ------------------------------ -----------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
现在结果变成了5条,为什么不加树型查询得到3条记录,而添加了树型查询结果变成了5条呢。
观察一下执行计划:
SQL> SET AUTOT ON EXP
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK'
5 START WITH FID = 0
6 CONNECT BY PRIOR T.ID = FID;
ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
执行计划
----------------------------------------------------------
Plan hash value: 1198160839
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 292 | 3 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | FILTER | | | | | |
| 4 | COUNT | | | | | |
| 5 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 16 | COUNT | | | | | |
| 17 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 18 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T1"."USERNAME"='YANGTK')
2 - filter("FID"=0)
3 - filter("FID"=0)
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
15 - access("T"."ID"="T1"."TID")
20 - access("T"."ID"="T1"."TID")
Note
-----
- dynamic sampling used for this statement
Oracle先进行的连接,然后是CONNECT BY,最后使用filter("T1"."USERNAME"='YANGTK')来进行过滤。
这就是造成数据重复的原因。
实际上这个SQL相当于先执行了一个:
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 START WITH FID = 0
5 CONNECT BY PRIOR T.ID = FID;
ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
1 0 TABLE TEST
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
已选择6行。
执行计划
----------------------------------------------------------
Plan hash value: 2039799266
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 292 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
|* 2 | FILTER | | | | | |
| 3 | COUNT | | | | | |
| 4 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 8 | HASH JOIN | | | | | |
| 9 | CONNECT BY PUMP | | | | | |
| 10 | COUNT | | | | | |
| 11 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 14 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 15 | COUNT | | | | | |
| 16 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 17 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 18 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 19 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FID"=0)
2 - filter("FID"=0)
7 - access("T"."ID"="T1"."TID")
8 - access("FID"=NULL)
14 - access("T"."ID"="T1"."TID")
19 - access("T"."ID"="T1"."TID")
Note
-----
- dynamic sampling used for this statement
然后再次基础上执行了T1.USERNAME = 'YANGTK'。
直接的关联加树型查询等价于下面的SQL:
SQL> SELECT * FROM
2 (
3 SELECT T.ID, FID, NAME, USERNAME
4 FROM T, T1
5 WHERE T.ID = T1.TID
6 START WITH FID = 0
7 CONNECT BY PRIOR T.ID = FID
8 )
9 WHERE USERNAME = 'YANGTK';
ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
执行计划
----------------------------------------------------------
Plan hash value: 778818883
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 240 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 4 | 240 | 3 (0)| 00:00:01 |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | FILTER | | | | | |
| 4 | COUNT | | | | | |
| 5 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 16 | COUNT | | | | | |
| 17 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 18 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("USERNAME"='YANGTK')
2 - filter("FID"=0)
3 - filter("FID"=0)
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
15 - access("T"."ID"="T1"."TID")
20 - access("T"."ID"="T1"."TID")
Note
-----
- dynamic sampling used for this statement
而这个结果显然不是希望得到的。
对于这种即包含关联又包含树型查询的SQL,最好的方法是先做一个,再做另一个,这样可以确保得到预期的结果:
SQL> SELECT * FROM
2 (
3 SELECT T.ID, FID, NAME, USERNAME
4 FROM T, T1
5 WHERE T.ID = T1.TID
6 AND T1.USERNAME = 'YANGTK'
7 )
8 START WITH FID = 0
9 CONNECT BY PRIOR ID = FID;
ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
执行计划
----------------------------------------------------------
Plan hash value: 3886537187
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 219 | 2 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
|* 2 | FILTER | | | | | |
| 3 | COUNT | | | | | |
| 4 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID| T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 15 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 16 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 17 | COUNT | | | | | |
| 18 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 20 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 21 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 22 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FID"=0)
2 - filter("FID"=0)
6 - access("T1"."USERNAME"='YANGTK')
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
14 - access("T1"."USERNAME"='YANGTK')
16 - access("T"."ID"="T1"."TID")
20 - access("T1"."USERNAME"='YANGTK')
22 - access("T"."ID"="T1"."TID")
Note
-----
- dynamic sampling used for this statement
刚才是先做关联,然后做树型查询,也可以反过来,先做树型查询:
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM
3 (
4 SELECT ID, FID, NAME
5 FROM T
6 START WITH FID = 0
7 CONNECT BY PRIOR ID = FID
8 ) T, T1
9 WHERE T.ID = T1.TID
10 AND T1.USERNAME = 'YANGTK';
ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
执行计划
----------------------------------------------------------
Plan hash value: 973607771
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40 | 2920 | 3 (34)| 00:00:01 |
|* 1 | HASH JOIN | | 40 | 2920 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 4 | VIEW | | 40 | 1720 | 1 (0)| 00:00:01 |
|* 5 | CONNECT BY WITH FILTERING | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID | T | | | | |
|* 7 | INDEX RANGE SCAN | IND_T_FID | 2 | 26 | 1 (0)| 00:00:01 |
| 8 | NESTED LOOPS | | | | | |
| 9 | BUFFER SORT | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | TABLE ACCESS BY INDEX ROWID| T | 40 | 1720 | 1 (0)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | IND_T_FID | 16 | | 1 (0)| 00:00:01 |
|* 13 | TABLE ACCESS FULL | T | 40 | 1720 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T"."ID"="T1"."TID")
3 - access("T1"."USERNAME"='YANGTK')
5 - filter("FID"=0)
7 - access("FID"=0)
12 - access("FID"=NULL)
13 - access("FID"=NULL)
Note
-----
- dynamic sampling used for this statement
两种方法选择那种,就需要根据具体情况进行分析了。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-69444/,如需转载,请注明出处,否则将追究法律责任。