ITPub博客

首页 > 数据库 > Oracle > SQL优化:NOT IN 与 NOT EXSIST的"隐性转换"带来的问题

SQL优化:NOT IN 与 NOT EXSIST的"隐性转换"带来的问题

原创 Oracle 作者:Opendba 时间:2015-10-09 15:45:32 0 删除 编辑
据同事反应,他们有一条sql跑2个小时都跑不出来,特别的慢。其中有一部分数据是通过分布式查询获取的,怀疑是不是DBLINK传输有问题。结果却是not in 隐性转换成了not exsist导致没有走索引,改变了最优的执行计划.

1.找到同事拿到了SQL

SQL> select c.g3e_fid, c.cd_ssdw, c.sbmc, c.cd_smzq, c.g3e_fno
  2    from GZDLGIS.B$GG_PD_DL_N  a,
  3         GZDLGIS.B$GG_PD_DL_LN b,
  4         GZDLGIS.B$COMMON_N    c
  5   where a.g3e_fid = b.g3e_fid
  6     and a.g3e_fid = c.g3e_fid
  7     and a.ltt_id = 0
  8     and b.ltt_id = 0
  9     and c.ltt_id = 0
10     and a.g3e_fid not in
11         (select e.g3e_fid
12            from jacal.em_e_common@pwgis e, JACAL.EM_E_DLDL@pwgis
13           where e.ref_oid = 0
14             and e.isdeleted = 0
15             and e.type_code = 2279
16             and e.oid = JACAL.EM_E_DLDL.oid)
17  ;
Execution Plan
----------------------------------------------------------
Plan hash value: 1294747072

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                         |   145K|  9822K|       |  6457M  (1)|999:59:59 |        |      |<<这根本跑不出来
|*  1 |  FILTER                   |                         |       |       |       |            |          |        |      |
|*  2 |   HASH JOIN               |                         |   145K|  9822K|  4464K| 47188   (1)| 00:11:01 |        |      |
|*  3 |    HASH JOIN              |                         |   142K|  2786K|  2960K|  1408   (2)| 00:00:20 |        |      |
|*  4 |     VIEW                  | index$_join$_002        |   137K|  1338K|       |   624   (2)| 00:00:09 |        |      |
|*  5 |      HASH JOIN            |                         |       |       |       |            |          |        |      |
|*  6 |       INDEX RANGE SCAN    | IDX_B$GG_PD_DL_LN_LTTID |   137K|  1338K|       |   128   (3)| 00:00:02 |        |      |
|   7 |       INDEX FAST FULL SCAN| IDX_GG_PD_DL_LN_FID     |   137K|  1338K|       |   223   (1)| 00:00:04 |        |      |
|*  8 |     TABLE ACCESS FULL     | B$GG_PD_DL_N            |   137K|  1339K|       |   534   (1)| 00:00:08 |        |      |
|*  9 |    TABLE ACCESS FULL      | B$COMMON_N              |  6460K|   301M|       | 29472   (2)| 00:06:53 |        |      |<<这里全表扫,所以COST高
|  10 |   REMOTE                  |                         |     1 |    69 |       | 46934   (1)| 00:10:58 |  PWGIS | R->S |<<这里是分布式了,还需要看另一边的执行计划
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM  "A2", "A1" WHERE "E"."REF_OID"=0 AND "E"."ISDELETED"=0 AND 
              TO_NUMBER("E"."TYPE_CODE")=2279 AND LNNVL("E"."G3E_FID"<>:B1) AND "E"."OID"="EM_E_DLDL"."OID"))<<not in 已经被隐性转换成了not exsist。

   2 - access("A"."G3E_FID"="C"."G3E_FID")
   3 - access("A"."G3E_FID"="B"."G3E_FID")
   4 - filter("B"."LTT_ID"=0)
   5 - access(ROWID=ROWID)
   6 - access("B"."LTT_ID"=0)
   8 - filter("A"."LTT_ID"=0)
   9 - filter("C"."LTT_ID"=0 AND "C"."G3E_FID"<>0)

Remote SQL Information (identified by operation id):
----------------------------------------------------

  10 - SELECT "A1"."OID","A1"."TYPE_CODE","A1"."REF_OID","A1"."G3E_FID","A1"."ISDELETED","A2"."OID" FROM
        "JACAL"."EM_E_COMMON" "A1","JACAL"."EM_E_DLDL" "A2" WHERE "A1"."OID"="A2"."OID" AND "A1"."REF_OID"=0 AND
        "A1"."ISDELETED"=0 AND TO_NUMBER("A1"."TYPE_CODE")=2279 AND LNNVL("A1"."G3E_FID"<>:1) (accessing 'PWGIS' )<<从REMOTE取得数据


2.根据上面执行计划,REMOTE只返回了一行记录,却浪费这么多的COST,这里看不到REMOTE端的执行计划,把SQL放到REMOTE pwgis 上看看计划

SQL> SELECT 0 FROM  jacal.em_e_common E, JACAL.EM_E_DLDL WHERE "E"."REF_OID"=0 AND "E"."ISDELETED"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND LNNVL("E"."G3E_FID"<>&B1) AND "E"."OID"="EM_E_DLDL"."OID";

Execution Plan
----------------------------------------------------------
Plan hash value: 3510700788

------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |               |     1 |    32 | 47094   (2)| 00:09:26 |
|   1 |  NESTED LOOPS      |               |     1 |    32 | 47094   (2)| 00:09:26 |
|*  2 |   TABLE ACCESS FULL| EM_E_COMMON   |     1 |    26 | 47094   (2)| 00:09:26 |<<全表扫EM_E_COMMON 
|*  3 |   INDEX UNIQUE SCAN| EM_E_DLDL_OID |     1 |     6 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("E"."REF_OID"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND
              "E"."ISDELETED"=0 AND LNNVL("E"."G3E_FID"<>10000))
   3 - access("E"."OID"="EM_E_DLDL"."OID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     212336  consistent gets<<超高的一致读
          0  physical reads
          0  redo size
        507  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

3.REMOTE端这边jacal.em_e_common走了全表扫导致大量COST。现在是需要看看该的索引情况。

SQL> select INDEX_OWNER,INDEX_NAME,COLUMN_NAME,COLUMN_POSITION from dba_ind_columns where TABLE_NAME = 'EM_E_COMMON' and INDEX_OWNER='JACAL';
INDEX_OWNER                    INDEX_NAME                     COLUMN_NAME                  
------------------------------ ------------------------------ ------------------------------
JACAL                          EM_E_COMMON_EQMOID             EQM_OID                      
JACAL                          EM_E_COMMON_KXG3EFID           KXG3E_FID                    
JACAL                          EM_E_COMMON_TYPECODE           TYPE_CODE                    
JACAL                          EM_E_COMMON_G3E_FID            G3E_FID     <<<访问的谓词存大这两个索引                  
JACAL                          EM_E_COMMON_SSDW               SSDW                         
JACAL                          EM_E_COMMON_BDZ                BDZG3E_FID                   
JACAL                          EM_E_COMMON_PATTERN_CHOOSE     PATTERN_CHOOSE               
JACAL                          EM_E_COMMON_SMZQ               SMZQ                         
JACAL                          PK_EM_E_COMMON                 OID                          


4.上面有索引,没有走,看看该表的直方图。

SQL> select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,LAST_ANALYZED from dba_tab_columns where owner='&owner' and table_name='&tablename' order by 2 desc;
Enter value for owner: JACAL
Enter value for tablename: EM_E_COMMON
old   1: select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,LAST_ANALYZED from dba_tab_columns where owner='&owner' and table_name='&tablename' order by 2 desc
new   1: select COLUMN_NAME,NUM_DISTINCT,DENSITY,NUM_NULLS,LAST_ANALYZED from dba_tab_columns where owner='JACAL' and table_name='EM_E_COMMON' order by 2 desc

COLUMN_NAME                    NUM_DISTINCT    DENSITY  NUM_NULLS LAST_ANALYZE
------------------------------ ------------ ---------- ---------- ------------
OID                                 7256202 1.3781E-07          0 01-SEP-15
G3E_FID                             4481513 2.2314E-07      30363 01-SEP-15<<选择性好,有NULL值
REF_OID                              473212 1.5439E-06          0 01-SEP-15
NAME                                 468838 .000329272       3521 01-SEP-15
EDIT_TIME                            209190 4.7803E-06          0 01-SEP-15
AZDD                                  76939 .000012997    5284887 01-SEP-15
CCBH                                  51462 .000019432    6208283 01-SEP-15
GDZCBH                                15726 .000063589    7341041 01-SEP-15
JGDH                                   8056 .000378788      29470 01-SEP-15
SBBSM                                  7242 .000138083    7086898 01-SEP-15
SCCJ                                   5599 .001644737    5573886 01-SEP-15
KXNAME                                 3219 .000591716     595952 01-SEP-15
KXG3E_FID                              3212 .000311333     543561 01-SEP-15
BZ                                     2765 .000361664    7103932 01-SEP-15
TYRQ                                   1739 .000913242    1746924 01-SEP-15
CCNY                                   1513  .00122399    2914269 01-SEP-15
PATTERN_CHOOSE                         1462 .001003009    2030165 01-SEP-15
SCCJ_NAME                              1038 .001605136    2598284 01-SEP-15
SBGYS                                   777 .001287001    7242714 01-SEP-15
SGDW                                    469 .002132196      32745 01-SEP-15
CREATE_TIME                             460 .002173913    7357254 01-SEP-15
BDZNAME                                 277 .003690037     592677 01-SEP-15
BDZG3E_FID                              277 .003636364     591040 01-SEP-15
SBZR                                    275 .003164557    7319499 01-SEP-15
WHBS                                    168 8.8366E-08    1597936 01-SEP-15
WHBS_ID                                 115 .008695652    7199309 01-SEP-15
SBZR_ID                                  69 .014492754    7383353 01-SEP-15
TYPE_CODE                                40 6.8907E-08          0 01-SEP-15<<不适合索引
SYSM                                     16      .0625    7379177 01-SEP-15
SSDW                                     12 6.9125E-08      22922 01-SEP-15
ISDELETED                                 5 6.8907E-08          0 01-SEP-15
DYDJ                                      5 8.3647E-08    1278677 01-SEP-15
SMZQ                                      4 8.2405E-08    1188629 01-SEP-15
CQSS                                      3 6.9203E-08      31108 01-SEP-15
SFXZSB                                    2 6.9266E-08      37657 01-SEP-15
EQM_ID_CODE                               0          0    7384023 01-SEP-15
SBZT                                      0          0    7384023 01-SEP-15
EQM_OID                                   0          0    7384023 01-SEP-15
ISLABEL                                   0          0    7384023 01-SEP-15
DELETE_DATE                               0          0    7384023 01-SEP-15
CARD_CODE                                 0          0    7384023 01-SEP-15


5.有索引却没有走,需要选择性也好,我们再来分析下这个SQL
SELECT "A1"."OID",
       "A1"."TYPE_CODE",
       "A1"."REF_OID",
       "A1"."G3E_FID",
       "A1"."ISDELETED",
       "A2"."OID"
FROM "JACAL"."EM_E_COMMON" "A1", "JACAL"."EM_E_DLDL" "A2"
WHERE "A1"."OID" = "A2"."OID"
   AND "A1"."REF_OID" = 0
   AND "A1"."ISDELETED" = 0
   AND TO_NUMBER("A1"."TYPE_CODE") = 2279
   AND LNNVL("A1"."G3E_FID" <> :1);

发现G3E_FID因为被转化成了<>,也就是不等于,外面是LNNVL该函数是探测NULL值用的,如果是是UNKOWN状态就返回FALSE。这里应该把not转换成了<>来执行。

6.如果把谓词推入到REMOTE会不会就走索引了呢,那么把not in改成not exsist,就不会转换<>.

SQL> select c.g3e_fid,c.cd_ssdw,c.sbmc,c.cd_smzq,c.g3e_fno
  2    from GZDLGIS.B$GG_PD_DL_N a,GZDLGIS.B$GG_PD_DL_LN b,GZDLGIS.B$COMMON_N c
  3   where a.g3e_fid = b.g3e_fid
  4     and a.g3e_fid = c.g3e_fid
  5     and a.ltt_id = 0
  6     and b.ltt_id = 0
  7     and c.ltt_id = 0
  8     and not exists (select e.g3e_fid
  9    from jacal.em_e_common@pwgis e, JACAL.EM_E_DLDL@pwgis
10   where e.ref_oid = 0
11     and e.isdeleted = 0
12     and e.type_code = 2279
13     and e.oid = JACAL.EM_E_DLDL.oid and e.g3e_fid=a.g3e_fid );

Execution Plan
----------------------------------------------------------
Plan hash value: 1294747072

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                         |   145K|  9822K|       |   597K  (1)| 02:19:30 |        |      |
|*  1 |  FILTER                   |                         |       |       |       |            |          |        |      |
|*  2 |   HASH JOIN               |                         |   145K|  9822K|  4464K| 47188   (1)| 00:11:01 |        |      |
|*  3 |    HASH JOIN              |                         |   142K|  2786K|  2960K|  1408   (2)| 00:00:20 |        |      |
|*  4 |     VIEW                  | index$_join$_002        |   137K|  1338K|       |   624   (2)| 00:00:09 |        |      |
|*  5 |      HASH JOIN            |                         |       |       |       |            |          |        |      |
|*  6 |       INDEX RANGE SCAN    | IDX_B$GG_PD_DL_LN_LTTID |   137K|  1338K|       |   128   (3)| 00:00:02 |        |      |
|   7 |       INDEX FAST FULL SCAN| IDX_GG_PD_DL_LN_FID     |   137K|  1338K|       |   223   (1)| 00:00:04 |        |      |
|*  8 |     TABLE ACCESS FULL     | B$GG_PD_DL_N            |   137K|  1339K|       |   534   (1)| 00:00:08 |        |      |
|*  9 |    TABLE ACCESS FULL      | B$COMMON_N              |  6460K|   301M|       | 29472   (2)| 00:06:53 |        |      |
|  10 |   REMOTE                  |                         |     1 |    69 |       |     4   (0)| 00:00:01 |  PWGIS | R->S |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM  "A2", "A1" WHERE "E"."G3E_FID"=:B1 AND "E"."REF_OID"=0 AND
              "E"."ISDELETED"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND "E"."OID"="EM_E_DLDL"."OID"))
   2 - access("A"."G3E_FID"="C"."G3E_FID")
   3 - access("A"."G3E_FID"="B"."G3E_FID")
   4 - filter("B"."LTT_ID"=0)
   5 - access(ROWID=ROWID)
   6 - access("B"."LTT_ID"=0)
   8 - filter("A"."LTT_ID"=0)
   9 - filter("C"."LTT_ID"=0 AND "C"."G3E_FID"<>0)

Remote SQL Information (identified by operation id):
----------------------------------------------------

  10 - SELECT "A1"."OID","A1"."TYPE_CODE","A1"."REF_OID","A1"."G3E_FID","A1"."ISDELETED","A2"."OID" FROM
        "JACAL"."EM_E_COMMON" "A1","JACAL"."EM_E_DLDL" "A2" WHERE "A1"."OID"="A2"."OID" AND "A1"."G3E_FID"=:1 AND
        "A1"."REF_OID"=0 AND "A1"."ISDELETED"=0 AND TO_NUMBER("A1"."TYPE_CODE")=2279 (accessing 'PWGIS' )

REMOTE的的执行时间已经从 00:10:58 ---->00:00:01,谓词G3E_FID也没有隐性转换.



7.再看REMOTE的执行计划,已经走了索引.
SQL> SELECT 0 FROM  jacal.em_e_common E, JACAL.EM_E_DLDL WHERE "E"."G3E_FID"=&B1 AND "E"."REF_OID"=0 AND "E"."ISDELETED"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND "E"."OID"="EM_E_DLDL"."OID";
Enter value for b1: 10000
old   1: SELECT 0 FROM  jacal.em_e_common E, JACAL.EM_E_DLDL WHERE "E"."G3E_FID"=&B1 AND "E"."REF_OID"=0 AND "E"."ISDELETED"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND "E"."OID"="EM_E_DLDL"."OID"
new   1: SELECT 0 FROM  jacal.em_e_common E, JACAL.EM_E_DLDL WHERE "E"."G3E_FID"=10000 AND "E"."REF_OID"=0 AND "E"."ISDELETED"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND "E"."OID"="EM_E_DLDL"."OID"

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 2368747865

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     1 |    32 |     5   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                     |     1 |    32 |     5   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EM_E_COMMON         |     1 |    26 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | EM_E_COMMON_G3E_FID |     2 |       |     3   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN          | EM_E_DLDL_OID       |     1 |     6 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("E"."REF_OID"=0 AND TO_NUMBER("E"."TYPE_CODE")=2279 AND "E"."ISDELETED"=0)
   3 - access("E"."G3E_FID"=10000)
   4 - access("E"."OID"="EM_E_DLDL"."OID")



8.总结
1.DBLINK的优化时,需要看DBLINK端的执行计划.分布式的优化通常比较复杂,优化的主应该在网络传输尽可能少的数据.端REMOTE的数据比源端 的数据比较多,可用driving_site 推送到REMOTE端执行
2.涉及not in ,not exsist的优化,网上众多优化方案,有的说exsist比in好,有的说in比exsist好,这必没有绝对的答案.在谓词列没有NULL的情况下NOT IN 与NOT EXSIST等价.
3.not in 与not exsist的等价时,会进行隐性转换,这时候的SQL就有可能不是之前的执行计划.

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/30264304/viewspace-1813605/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2015-05-28

  • 博文量
    7
  • 访问量
    24222