ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 索引有时也低效

索引有时也低效

原创 Linux操作系统 作者:jcq0 时间:2011-02-21 12:16:21 0 删除 编辑

一般的情况下,合适的索引会大大提高数据访问的效率。今天针对昨天优化的程序检查时发现,效率并未有明显改善。
取到问题时间段的报告,如下:
Event                          Waits  Time(s)  Avg Wait(ms) % Total Call Time  Wait Class
db file sequential read    7,162,136    5,288             1             75.6     User I/O
CPU time                                2,344                           33.5  
db file parallel write       676,617    1,024             2             14.7     System I/O
db file scattered read       190,924      577             3              8.3      User I/O
log file sync                602,109      564             1              8.1      Commit
可以看到占用db time最多的是db file sequential read,占用总时间的75%
在一个正常运行的系统中db file sequential read排在top events的前5位也属正常现象。
但是为什么这个事件占用了如此多的db time呢?

经过与应用沟通,了解到在数据加载之前需要通过时间戳字段先对数据进行delete,而该操作应该走到了时间戳字段对应的索引。
现在使用select加时间戳字段取数据看走索引的情况
11:04:22 JCQ0> select count(*) from xxxq17.xxx_I_BS_CHANNEL where time_flag='20110217';

  COUNT(*)
----------
    553167

Elapsed: 00:00:19.08

Execution Plan
----------------------------------------------------------
Plan hash value: 4049405464

--------------------------------------------------------------------------------
-------------

| Id  | Operation             | Name                | Rows  | Bytes | Cost (%CPU
)| Time     |

--------------------------------------------------------------------------------
-------------

|   0 | SELECT STATEMENT      |                     |     1 |     9 |   106K  (2
)| 00:21:22 |

|   1 |  SORT AGGREGATE       |                     |     1 |     9 |
 |          |

|*  2 |   INDEX FAST FULL SCAN| PK_xxx_I_BS_CHANNEL |   127K|  1119K|   106K  (2
)| 00:21:22 |

--------------------------------------------------------------------------------
-------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("TIME_FLAG"='20110217')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     534189  consistent gets
        488  physical reads
       5388  redo size
        517  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

一致性读竟有53w之多,随着进一步确认该索引的相关信息
xxx_I_BS_CHANNEL PK_xxx_I_BS_CHANNEL CHANNEL_ID 1
xxx_I_BS_CHANNEL PK_xxx_I_BS_CHANNEL TIME_FLAG   2

11:08:25 JCQ0> select count(distinct(CHANNEL_ID)) from xxxq17.xxx_I_BS_CHANNEL;

COUNT(DISTINCT(CHANNEL_ID))
---------------------------
                     553167

Elapsed: 00:01:58.55
11:10:46 JCQ0> select count(*) from xxxq17.xxx_I_BS_CHANNEL;

  COUNT(*)
----------
 129930003

Elapsed: 00:00:10.24
11:13:52 JCQ0> select count(distinct(TIME_FLAG)) from xxxq17.xxx_I_BS_CHANNEL;

COUNT(DISTINCT(TIME_FLAG))
--------------------------
                       248

Elapsed: 00:01:16.19
发现具有低选择性的时间戳字段time_flag竟然放在了组合索引的第二列,而选择性高的channel_id放在了第一列
在上述的执行计划中发现对索引走了INDEX FAST FULL SCAN,而如果单独对time_flag建索引或者将time_flag放在组合索引的第一列,
则可以通过index range scan 获取数据,效率也必定大大提高。
11:19:37 JCQ0> create index xxx_ibschannel_timeflag on xxxq17.xxx_I_BS_CHANNEL(time_flag) tablespace TS_xxx_DATA_4M_05 parallel 16
11:19:48   2  nologging;

Index created.

Elapsed: 00:00:38.75
11:20:29 JCQ0> conn / as sysdba
Connected.
11:20:36 JCQ0> set autot on
11:20:42 JCQ0> select count(*) from xxxq17.xxx_I_BS_CHANNEL where time_flag='20110217';

  COUNT(*)
----------
    553167

Elapsed: 00:00:00.50

Execution Plan
----------------------------------------------------------
Plan hash value: 521375365

--------------------------------------------------------------------------------
-------------

| Id  | Operation         | Name                    | Rows  | Bytes | Cost (%CPU
)| Time     |

--------------------------------------------------------------------------------
-------------

|   0 | SELECT STATEMENT  |                         |     1 |     9 |   362   (1
)| 00:00:05 |

|   1 |  SORT AGGREGATE   |                         |     1 |     9 |
 |          |

|*  2 |   INDEX RANGE SCAN| xxx_IBSCHANNEL_TIMEFLAG |   127K|  1119K|   362   (1
)| 00:00:05 |

--------------------------------------------------------------------------------
-------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TIME_FLAG"='20110217')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1545  consistent gets
       1543  physical reads
          0  redo size
        517  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
此时的逻辑读和物理读总共3k左右,远远小于53w。
而语句的执行时间也由20sec下降至不到1sec。
除了类似上述的不合适的组合索引,在索引选择率较低,一次读取数据过多,或者聚簇因子太大等情况下,使用索引也将使效率降低。

(需要引用, 请注明出处:痴情甲骨文http://space.itpub.net/14130873)


 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/14130873/viewspace-687606/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2008-11-25

  • 博文量
    93
  • 访问量
    347735