ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 组合索引怎么应该怎么选取引导列?(转载落落)

组合索引怎么应该怎么选取引导列?(转载落落)

原创 Linux操作系统 作者:Aminiy 时间:2013-09-13 11:52:41 0 删除 编辑
有这样一个SQL 
select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT'; 
id列选择性很高,owner选择性很低 
要优化它很简单,只需要在t1表上建立一个组合索引(owner,id),在t2表上建立一个索引(id) 
现在要讨论的是我们应该怎么建立组合索引,哪一列(owner,id)应该放在最前面? 
 
现在来做个实验 
 
SQL> desc t1 
Name        Type          Nullable Default Comments  
----------- ------------- -------- ------- --------  
ID          NUMBER        Y                          
OBJECT_NAME VARCHAR2(128) Y                          
OWNER       VARCHAR2(30)  Y                          
 
SQL> desc t2 
Name      Type        Nullable Default Comments  
--------- ----------- -------- ------- --------  
ID        NUMBER      Y                          
STATUS    VARCHAR2(7) Y                          
TEMPORARY VARCHAR2(1) Y     
 
 
SQL> create index inx_id on t2(id); 
 
Index created. 
 
SQL> create index inx_id_owner on t1(id,owner); 
 
Index created. 
 
SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT'; 
 
Elapsed: 00:00:00.02 
 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 2432674005 
 
--------------------------------------------------------------------------------------- 
| Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT       |              |     1 |    16 |    88   (2)| 00:00:02 | 
|   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          | 
|*  2 |   HASH JOIN            |              |  2416 | 38656 |    88   (2)| 00:00:02 | 
|*  3 |    INDEX FAST FULL SCAN| INX_ID_OWNER |  2416 | 26576 |    50   (0)| 00:00:01 | 
|   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 | 
--------------------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 
   2 - access("T1"."ID"="T2"."ID") 
   3 - filter("T1"."OWNER"='SCOTT') 
 
 
Statistics 
---------------------------------------------------------- 
          0  recursive calls 
          0  db block gets 
        392  consistent gets 
          0  physical reads 
          0  redo size 
        422  bytes sent via SQL*Net to client 
        420  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 
                       
SQL> create index inx_owner_id on t1(owner,id); 
 
Index created. 
 
SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT'; 
 
Elapsed: 00:00:00.03 
 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 277464349 
 
--------------------------------------------------------------------------------------- 
| Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT       |              |     1 |    16 |    47   (3)| 00:00:01 | 
|   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          | 
|*  2 |   HASH JOIN            |              |  2416 | 38656 |    47   (3)| 00:00:01 | 
|*  3 |    INDEX RANGE SCAN    | INX_OWNER_ID |  2416 | 26576 |     9   (0)| 00:00:01 | 
|   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 | 
--------------------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 
   2 - access("T1"."ID"="T2"."ID") 
   3 - access("T1"."OWNER"='SCOTT') 
 
 
Statistics 
---------------------------------------------------------- 
          0  recursive calls 
          0  db block gets 
        169  consistent gets 
          0  physical reads 
          0  redo size 
        422  bytes sent via SQL*Net to client 
        420  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 
 
如果OWNER作为引导列,逻辑读由以前的392变成现在的169,并且由以前的 index fast full scan 变成index range scan 
 
如果强制指定走索引 inx_id_owner  
           
SQL>  select /*+ index(t1 inx_id_owner) */ count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT'; 
 
Elapsed: 00:00:00.03 
 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 3161475902 
 
--------------------------------------------------------------------------------------- 
| Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT       |              |     1 |    16 |   259   (1)| 00:00:04 | 
|   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          | 
|*  2 |   HASH JOIN            |              |  2416 | 38656 |   259   (1)| 00:00:04 | 
|*  3 |    INDEX FULL SCAN     | INX_ID_OWNER |  2416 | 26576 |   221   (1)| 00:00:03 | 
|   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 | 
--------------------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 
   2 - access("T1"."ID"="T2"."ID") 
   3 - access("T1"."OWNER"='SCOTT') 
       filter("T1"."OWNER"='SCOTT') 
 
 
Statistics 
---------------------------------------------------------- 
          0  recursive calls 
          0  db block gets 
        387  consistent gets 
          0  physical reads 
          0  redo size 
        422  bytes sent via SQL*Net to client 
        420  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 
 
依然要387个逻辑读。 
 
为什么要以owner为引导列?因为ID是join列,并且ID列上面没过滤条件,如果以ID列作为引导列,由于没过滤条件 
那么CBO只能走 index full scan,或者index fast full scan,因为引导列没过滤条件,走不了index range scan, 
最多走index skip scan,不过index skip scan代价过高,因为index skip scan要求 引导列选择性很低,但是ID这里选择性很高 
 
SQL> select /*+ index_ss(t1 inx_id_owner) */ count(*) from t1,t2 where t1.id=t2.id and t1.owner='SYS'; 
 
Elapsed: 00:00:00.10 
 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 3493079762 
 
--------------------------------------------------------------------------------------- 
| Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT       |              |     1 |    16 | 72529   (1)| 00:14:31 | 
|   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          | 
|*  2 |   HASH JOIN            |              |  2416 | 38656 | 72529   (1)| 00:14:31 | 
|*  3 |    INDEX SKIP SCAN     | INX_ID_OWNER |  2416 | 26576 | 72491   (1)| 00:14:30 | 
|   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 | 
--------------------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 
   2 - access("T1"."ID"="T2"."ID") 
   3 - access("T1"."OWNER"='SYS') 
       filter("T1"."OWNER"='SYS') 
 
 
Statistics 
---------------------------------------------------------- 
          1  recursive calls 
          0  db block gets 
        387  consistent gets 
          0  physical reads 
          0  redo size 
        424  bytes sent via SQL*Net to client 
        420  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 
 
如果owner列作为引导列,那么优化器就可以选择index range scan,这样相比index full scan, index fast full scan 
肯定要少扫描很多leaf block,逻辑读就会相对较少。 
 
其实到这里,是否可以总结一下建立组合索引的原则呢? 
引导列要选择过滤条件的列作为引导列,比如 where a.xxx='xxx' 或者 a.xxx> 或者 a.xxx< 
引导列的选择性越高越好,因为选择性越高,扫描的leaf block就越少,效率就越高 
尽量把join列放到组合索引最后面 
 
这里选择以owner列作为引导列,由于owner选择性很低,所以测试索引压缩对于性能的提升 
 
SQL> analyze  index inx_owner_id validate structure; 
 
Index analyzed. 
 
SQL> select height, 
  2         blocks, 
  3         lf_blks, 
  4         br_blks, 
  5         OPT_CMPR_COUNT, 
  6         OPT_CMPR_PCTSAVE 
  7    from index_stats 
  8   where name = 'INX_OWNER_ID'; 
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS OPT_CMPR_COUNT OPT_CMPR_PCTSAVE 
---------- ---------- ---------- ---------- -------------- ---------------- 
         2        256        219          1              1               26 
  
SQL> drop index inx_owner_id; 
 
Index dropped 
 
SQL> create index inx_owner_id on t1(owner,id) compress 1; 
 
Index created 
 
SQL> analyze  index inx_owner_id validate structure; 
 
Index analyzed 
 
SQL>  
SQL> select height, 
  2         blocks, 
  3         lf_blks, 
  4         br_blks, 
  5         OPT_CMPR_COUNT, 
  6         OPT_CMPR_PCTSAVE 
  7    from index_stats 
  8   where name = 'INX_OWNER_ID'; 
 
    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS OPT_CMPR_COUNT OPT_CMPR_PCTSAVE 
---------- ---------- ---------- ---------- -------------- ---------------- 
         2        256        161          1              1                0 
          
索引压缩之后,Leaf block 由原来的219降低到161个,节约了58个block 现在再来看一看执行计划+统计信息 
 
SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT'; 
 
Elapsed: 00:00:00.03 
 
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 277464349 
 
--------------------------------------------------------------------------------------- 
| Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT       |              |     1 |    16 |    45   (3)| 00:00:01 | 
|   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          | 
|*  2 |   HASH JOIN            |              |  2416 | 38656 |    45   (3)| 00:00:01 | 
|*  3 |    INDEX RANGE SCAN    | INX_OWNER_ID |  2416 | 26576 |     7   (0)| 00:00:01 | 
|   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 | 
--------------------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
 
   2 - access("T1"."ID"="T2"."ID") 
   3 - access("T1"."OWNER"='SCOTT') 
 
 
Statistics 
---------------------------------------------------------- 
          1  recursive calls 
          0  db block gets 
        169  consistent gets 
          0  physical reads 
          0  redo size 
        422  bytes sent via SQL*Net to client 
        420  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 
 
由此可见,索引压缩之后,逻辑读并没有下降,还是169,但是索引的leaf blcok显著减少了,这样减少了存储空间,能降低物理IO 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/24492954/viewspace-772670/,如需转载,请注明出处,否则将追究法律责任。

上一篇: with as
下一篇: 何去何从
请登录后发表评论 登录
全部评论

注册时间:2011-04-09

  • 博文量
    41
  • 访问量
    239384