ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 索引扫描(一)-聚簇因子

索引扫描(一)-聚簇因子

原创 Linux操作系统 作者:lsq_008 时间:2009-01-08 16:42:41 0 删除 编辑

假设一个表中有9行记录,分布在3个block上,按照其中列col1的分布情况,分为以下两种情况:

Block 1       Block 2        Block 3
 -------       -------        --------
A  A  A       B  B  B        C  C  C


Block 1       Block 2        Block 3
 -------       -------        --------
A  B  C       A  B  C        A  B  C

第一种情况,相同的值分布在同一个block里,第二种情况,相同的三个值分布在3个block里,在列col1上创建一个索引。
显然,当进行如下查询时:

select * from t where col1='A';

oracle首先进行索引范围扫描,获取相应的列col1='A’的rowid,之后根据rowid从表中获取数据。对于第一种情况,只需要扫描block 1,即可获得想要的结果,而对于第二种情况,需要扫描block 1,block 2,block 3共3个block。显然,第二种情况下的需要进行的IO次数大于第一种情况。

以下例子,首先根据第一种情况构造一个测试表test1:

SQL> create table test1 (a char(2000),b char(100));
 
Table created.
 
SQL> insert into test1 values('A','X');
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> COMMIT;
 
Commit complete.
 
SQL> select dbms_rowid.rowid_block_number(rowid) from test1;
 
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                3442
                                3442
                                3442

可见,三行均位于同一个block里。
 
SQL> insert into test1 values('B','X');
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select dbms_rowid.rowid_block_number(rowid) from test1;
 
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                3442
                                3442
                                3442
                                3443
                                3443
                                3443
 
6 rows selected.

SQL> insert into test1 values('C','X');
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select dbms_rowid.rowid_block_number(rowid) from test1;
 
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                3442
                                3442
                                3442
                                3443
                                3443
                                3443
                                3444
                                3444
                                3444
 
9 rows selected.

以上构造出第一种情况,即值为'A','B','C'的列分别各自位于同一个block里。再构造第二种情况:

SQL> create table test2(a char(2000),b char(100));
 
Table created.
 
SQL> insert into test2 values('A','X');
 
1 row created.

SQL> insert into test2 values('B','X');
 
1 row created.
 
SQL>  insert into test2 values('C','X');
 
1 row created.
 
SQL> commit;
 
Commit complete.
SQL> select dbms_rowid.rowid_block_number(rowid) from test2;
 
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                3450
                                3450
                                3450
这样,'A','B','C'这三个不同值的列均位于同一个block里了。然后再构造2个类似的block:

SQL> insert into test2 select * from test2 where rownum<=3;
 
3 rows created.
 
SQL> /
 
3 rows created.
 
SQL>  select dbms_rowid.rowid_block_number(rowid) from test2;
 
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                3450
                                3450
                                3450
                                3451
                                3451
                                3451
                                3452
                                3452
                                3452

接下来执行多次的insert into test1 select * from test1和 insert into test2 select * from test2,对这两个表各自构造出大量数据。

SQL> select count(*) from test1;
 
  COUNT(*)
----------
      4608
 
SQL> select count(*) from test2;
 
  COUNT(*)
----------
      4608

分别在test1和test2上创建索引:

SQL> create index idx_test1 on test1(a);
 
Index created.
 
SQL> create index idx_test2 on test2(a);
 
Index created
.

执行索引范围扫描:

SQL> set autot traceonly statistics
SQL> select /*+ index(test1,idx_test1) */ * from test1 where a like 'A%';
 
1536 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1237  consistent gets
          0  physical reads
          0  redo size
      23223  bytes sent via SQL*Net to client
       1517  bytes received via SQL*Net from client
        104  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1536  rows processed
 
SQL> select /*+ index(test2,idx_test2) */ * from test2 where a like 'A%';
 
1536 rows selected.
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2158  consistent gets
          0  physical reads
          0  redo size
      23223  bytes sent via SQL*Net to client
       1517  bytes received via SQL*Net from client
        104  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1536  rows processed

显然,对test1的逻辑读要小于对test2的逻辑读,再看两个索引的聚簇因子以及两张表占用的block数量:

SQL> select index_name,CLUSTERING_FACTOR from user_Indexes where index_name in('IDX_TEST1','IDX_TEST2');
 
INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
IDX_TEST1                                   1536
IDX_TEST2                                   4608

SQL> select count(distinct(dbms_rowid.rowid_block_number(rowid))) from test1;
 
COUNT(DISTINCT(DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)))
-----------------------------------------------------
                                                 1536
 
SQL> select count(distinct(dbms_rowid.rowid_block_number(rowid))) from test2;
 
COUNT(DISTINCT(DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)))
-----------------------------------------------------
                                                 1536
 
SQL> select count(*) from test1;
 
  COUNT(*)
----------
      4608
 
SQL> select count(*) from test2;
 
  COUNT(*)
----------
      4608

可以看出,上述测试用例模拟了聚簇因子的两种极端情况,IDX_TEST1的聚簇因子与表的blcok数相等,而IDX_TEST2的聚簇因子与表的记录数相等。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/10972173/viewspace-531911/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
十余年大型金融及电信系统数据库管理经验,曾服务于中国建设银行、中国移动。对oracle,mysql数据库有深入了解。 擅长python开发,独立开发了开源数据库自动化监控运维平台Power Monitor。

注册时间:2008-02-29

  • 博文量
    325
  • 访问量
    1242180