ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 关于索引聚簇因子(CLUSTERING_FACTOR)引起的sql优化过程

关于索引聚簇因子(CLUSTERING_FACTOR)引起的sql优化过程

原创 Linux操作系统 作者:YallonKing 时间:2012-02-07 16:36:28 0 删除 编辑

以前的一个关于索引聚簇因子引起的sql优化过程

 
索引聚簇因子/聚集因子(CLUSTERING_FACTOR):按照索引列值进行了排序的索引行序和对应表中的数据行序的相似程度。
主要影响对象是堆表。
 
问题sql
SQL> select * from stock.stockinfo  where  tradingdate >=to_date('20110801','yyyymmdd');
开发反映此语句执行异常缓慢(和之前类似的语句相差时间过长)

其执行计划如下:
SQL> select * from stock.stockinfo  where  tradingdate >=to_date('20110801','yyy
ymmdd');
41443 rows selected.
Elapsed: 00:01:13.69
Execution Plan
----------------------------------------------------------
Plan hash value: 4179235280
--------------------------------------------------------------------------------
-------------------------
| Id  | Operation                   | Name                      | Rows  | Bytes
| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
-------------------------
|   0 | SELECT STATEMENT            |                           | 23474 |  5020K
| 24879   (1)| 00:04:59 |
|   1 |  TABLE ACCESS BY INDEX ROWID| STOCKINFO                 | 23474 |  5020K
| 24879   (1)| 00:04:59 |
|*  2 |   INDEX RANGE SCAN          | STOCKSTOCKINFOTRADINGDATE | 24982 |
|    69   (0)| 00:00:01 |
--------------------------------------------------------------------------------
-------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd hh24:m
i:ss'))
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      43563  consistent gets
      25012  physical reads
          0  redo size
    9774561  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed

可见其走了B树索引
查看表分析时间
SQL> select to_char(LAST_ANALYZED,'yyyy/mm/dd hh24:mi:ss') from user_tables wher
e table_name='STOCKINFO';
TO_CHAR(LAST_ANALYZ
-------------------
2011/12/06 22:03:29

表结构如下:
SQL> desc stock.stockinfo;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 TRADINGDATE                                        DATE
 DAYID                                              NUMBER
 WEEKID1                                            NUMBER
 WEEKDAYID1                                         NUMBER
 WEEKID2                                            NUMBER
 WEEKDAYID2                                         NUMBER
 WEEKID3                                            NUMBER
 WEEKDAYID3                                         NUMBER
 WEEKID4                                            NUMBER
 WEEKDAYID4                                         NUMBER
 WEEKID5                                            NUMBER
 WEEKDAYID5                                         NUMBER
 INNERCODE                                          NUMBER(10)
 SECUCODE                                           VARCHAR2(10)
 OPENPRICE                                          NUMBER(10,4)
 HIGHPRICE                                          NUMBER(10,4)
 LOWPRICE                                           NUMBER(10,4)
 CLOSEPRICE                                         NUMBER(10,4)
 CPRICE_A_DAY                                       NUMBER(10,4)
 OPRICE_A_DAY                                       NUMBER
 HPRICE_A_DAY                                       NUMBER
 LPRICE_A_DAY                                       NUMBER
 TURNOVERVOLUME                                     NUMBER(20)
 TURNOVERVALUE                                      NUMBER(19,4)
 NONRESTRICTEDSHARES                                NUMBER(16)
 AFLOATS                                            NUMBER
 ASHARES                                            NUMBER
 N_A_F_VALUE                                        NUMBER
 A_F_VALUE                                          NUMBER
 A_T_VALUE                                          NUMBER
 DAILYRATIO                                         NUMBER(20,8)
 SW3                                                VARCHAR2(10)
 NORMAL                                             NUMBER(20)
 ST                                                 NUMBER(20)
 STST                                               NUMBER(20)
 PT                                                 NUMBER(20)
 DELISTING                                          NUMBER(20)
 NTRADINGDAY                                        NUMBER(20)
 IPO                                                NUMBER(20)
 LOCALUPTIME                                        DATE

表相关数据量如下:
SQL> select count(*) from stock.stockinfo;
  COUNT(*)
----------
   4070420
SQL> select count(*) from stock.stockinfo  where  tradingdate >=to_date('2011080
1','yyyymmdd');
  COUNT(*)
----------
     41443
 
尝试表全表扫描:
SQL> select /*+ no_index(stockinfo stockstockinfotradingdate) */ * from stock.st
ockinfo  where  tradingdate >=to_date('20110801','yyyymmdd');
41443 rows selected.
Elapsed: 00:00:11.94
Execution Plan
----------------------------------------------------------
Plan hash value: 580256601
-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 23474 |  5020K| 34800   (1)| 00:06:58 |
|*  1 |  TABLE ACCESS FULL| STOCKINFO | 23474 |  5020K| 34800   (1)| 00:06:58 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     130248  consistent gets
     127560  physical reads
          0  redo size
    8721445  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed
 
结论:全表扫描效率高于B树索引
问题:根据相关数据量来说,走索引应该比全表扫描效率会高些
 
设置10053事件抓取详细trace
部分内容如下:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: STOCKINFO  Alias: STOCKINFO
    #Rows: 4070420  #Blks:  128167  AvgRowLen:  219.00
Index Stats::
  Index: STOCKSTOCKINFOTRADINGDATE  Col#: 1
    LVLS: 2  #LB: 11460  #DK: 2815  LB/K: 4.00  DB/K: 1527.00  CLUF: 4300220.00
Access path analysis for STOCKINFO
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for STOCKINFO[STOCKINFO]
  Table: STOCKINFO  Alias: STOCKINFO
    Card: Original: 4070420.000000  Rounded: 23474  Computed: 23474.13  Non Adjusted: 23474.13
  Access Path: TableScan
    Cost:  34800.38  Resp: 34800.38  Degree: 0
      Cost_io: 34714.00  Cost_cpu: 1745127320
      Resp_io: 34714.00  Resp_cpu: 1745127320
  Access Path: index (RangeScan)
    Index: STOCKSTOCKINFOTRADINGDATE
    resc_io: 24869.00  resc_cpu: 205332751
    ix_sel: 0.005767  ix_sel_with_filters: 0.005767
    Cost: 24879.16  Resp: 24879.16  Degree: 1
  Best:: AccessPath: IndexRange
  Index: STOCKSTOCKINFOTRADINGDATE
         Cost: 24879.16  Degree: 1  Resp: 24879.16  Card: 23474.13  Bytes: 0
***************************************
可见  #Rows: 4070420 CLUF: 4300220.00  其中索引聚合因子和记录行数基本一样
具体如下
SQL> select table_name,index_name,num_rows,leaf_blocks,clustering_factor from us
er_indexes where index_name ='STOCKSTOCKINFOTRADINGDATE';
TABLE_NAME INDEX_NAME                       NUM_ROWS LEAF_BLOCKS
---------- ------------------------------ ---------- -----------
CLUSTERING_FACTOR
-----------------
STOCKINFO  STOCKSTOCKINFOTRADINGDATE         4331799       11460
          4300220
 
最好情况应该是索引块数和索引聚簇因子基本相同。

尝试按索引列重建该表及相关索引,即对比表进行重构。
SQL> select * from stock.stockinfo_bak  where  tradingdate >=to_date('20110801',
'yyyymmdd');
41443 rows selected.
Elapsed: 00:00:15.10
Execution Plan
----------------------------------------------------------
Plan hash value: 2415850697
--------------------------------------------------------------------------------
---
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time
  |
--------------------------------------------------------------------------------
---
|   0 | SELECT STATEMENT  |               |  1401 |   684K| 34588   (1)| 00:06:5
6 |
|*  1 |  TABLE ACCESS FULL| STOCKINFO_BAK |  1401 |   684K| 34588   (1)| 00:06:5
6 |
--------------------------------------------------------------------------------
---

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     256930  consistent gets
     127173  physical reads
          0  redo size
    7165405  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed
在tradingdate列创建位图索引
SQL> select * from stock.stockinfo_bak  where  tradingdate >=to_date('20110801',
'yyyymmdd');
41443 rows selected.
Elapsed: 00:00:04.15
Execution Plan
----------------------------------------------------------
Plan hash value: 3810964835
--------------------------------------------------------------------------------
------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost
(%CPU)| Time     |
--------------------------------------------------------------------------------
------------------
|   0 | SELECT STATEMENT             |                   |  2500 |  1220K| 29007
   (1)| 00:05:49 |
|   1 |  TABLE ACCESS BY INDEX ROWID | STOCKINFO_BAK     |  2500 |  1220K| 29007
   (1)| 00:05:49 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                   |       |       |
      |          |
|*  3 |    BITMAP INDEX RANGE SCAN   | IDX_BIT_STOCKINFO |       |       |
      |          |
--------------------------------------------------------------------------------
------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd hh24:m
i:ss'))
       filter("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd hh24:m
i:ss'))

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5340  consistent gets
         77  physical reads
      94840  redo size
    9774561  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed
 
再次查看索引聚集因子值
SQL> select table_name,index_name,num_rows,leaf_blocks,clustering_factor from us
er_indexes where table_name ='STOCKINFO_BAK';
TABLE_NAME INDEX_NAME                       NUM_ROWS LEAF_BLOCKS
---------- ------------------------------ ---------- -----------
CLUSTERING_FACTOR
-----------------
STOCKINFO_ IDX_BIT_STOCKINFO                    2815         133
BAK
             2815
sql得以优化,问题得以解决!
心得:10g以后的cbo即使按正确的路径进行,也不一定最优,索引聚簇因子对sql的执行效率影响比较大。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/26143577/viewspace-715765/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2011-08-07

  • 博文量
    72
  • 访问量
    247197