ITPub博客

首页 > Linux操作系统 > Linux操作系统 > oracle实验记录 (oracle单表选择率与基数计算(1))

oracle实验记录 (oracle单表选择率与基数计算(1))

原创 Linux操作系统 作者:fufuh2o 时间:2009-09-09 23:13:46 0 删除 编辑

 

单表选择率与 基数(cardinality)有关系,也就是执行计划中的CARD 信息
CARD表示oracle 认为此操作将返回多少rows

基本的CARD 计算 就等于 num_rows*选择率
CARD 对于otimizer 选初始的连接顺序,选INDEX 都有很大影响,所以能够看明白计算过程方法,对优化是很有帮助的


实验 单表中选择率 (本次实验学习参考cost-based oracle)
SQL> create table t1(a int);

表已创建。


SQL> ed
已写入 file afiedt.buf

  1  declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t1 values(i);
  5  end loop;
  6  commit;
  7* end;
SQL> /

PL/SQL 过程已成功完成。

SQL> execute dbms_stats.gather_table_stats('SYS','T1');

PL/SQL 过程已成功完成。

SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T1';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T1                                  .0001          0        10000


SQL> set autotrace traceonly explain
SQL> select count(*) from t1;

执行计划
----------------------------------------------------------
Plan hash value: 3724264953

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |     6   (0)| 00:00:01 |
-------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

~没有谓词 没有过滤 扫描所有行 所以card=10000


SQL> select count(*) from t1 where a=2;

执行计划
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |     3 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=2)

card=num_row*1/num_distinct=1
SQL> select 10000/10000 from dual;

10000/10000
-----------
          1
如果要是这个表存在HISTOGRAM 那么,OPTIMIZER将使用DENSITY 列 ,CARD=NUM_rows*density

以上的如果 谓词 就是COLUMN=字面值  那么 card=num_row/num_distinct (=num_row*1/num_distinct)如果有histogram card=NUM_rows*density

~~~~~~~~~~~~~~~~~~~~~~~~~~没有统计信息,使用动态采集
SQL> create table t2 (a int);

表已创建。

SQL> declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t2 values(i);
  5  end loop;
  6  commmit;
  7  end;
  8  /
commmit;

SQL> ed
已写入 file afiedt.buf

  1  declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t2 values(i);
  5  end loop;
  6  commit;
  7* end;
  8  /

PL/SQL 过程已成功完成。

SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T2';~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~没有统计信息

未选定行

SQL> show parameter dyna

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     2~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~oracle将使用动态采集

SQL> select count(*) from t2;

执行计划
----------------------------------------------------------
Plan hash value: 3321871023

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T2   | 10000 |     6   (0)| 00:00:01 |
-------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

SQL> select count(*) from t2 where a=2;

执行计划
----------------------------------------------------------
Plan hash value: 3321871023

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    13 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |
|*  2 |   TABLE ACCESS FULL| T2   |     1 |    13 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=2)~~~~~~~~

Note
-----
   - dynamic sampling used for this statement~~~~~~~~~~~~~~~~~~~~~可以看到 ORACLE 利用动态采集的信息 也是很准确的

Dynamic Sampling Levels(以下是动态采集 各个LEVEL 说明)
The sampling levels are as follows if the dynamic sampling level used is from a cursor hint or from the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter:

Level 0: Do not use dynamic sampling.

Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this

unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed

table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of

dynamic sampling blocks (32).

Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is two times the default number of dynamic sampling blocks.

Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for some

predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed

tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns.

The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default

number of dynamic sampling blocks.

Levels 5, 6, 7, 8, and 9: Apply dynamic sampling to all tables that meet the previous level criteria using 2, 4, 8, 32, or 128 times the default number of

dynamic sampling blocks respectively.

Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.

The sampling levels are as follows if the dynamic sampling level for a table is set using the DYNAMIC_SAMPLING optimizer hint:

Level 0: Do not use dynamic sampling.

Level 1: The number of blocks sampled is the default number of dynamic sampling blocks (32).

Levels 2, 3, 4, 5, 6, 7, 8, and 9: The number of blocks sampled is 2, 4, 8, 16, 32, 64, 128, or 256 times the default number of dynamic sampling blocks

respectively.

Level 10: Read all blocks in the table.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SQL> alter system set optimizer_dynamic_sampling=1;
系统已更改。
SQL> alter system set optimizer_dynamic_sampling=1; ~~~~关闭动态采集

系统已更改。
SSQL> set autotrace traceonly explain
SQL>  alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t2;

执行计划
----------------------------------------------------------
Plan hash value: 3321871023

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T2   |    82 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------

SQL> select count(*) from t2 where a=2;

执行计划
----------------------------------------------------------
Plan hash value: 3321871023

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |
|*  2 |   TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=2)

~~从trace中可以看到 信息十分不准确
Table Stats::
  Table: T2  Alias: T2  (NOT ANALYZED)
    #Rows: 82  #Blks:  1  AvgRowLen:  100.00
***************************************
SINGLE TABLE ACCESS PATH
  Table: T2  Alias: T2    
    Card: Original: 82  Rounded: 82  Computed: 82.00  Non Adjusted: 82.00
  Access Path: TableScan
    Cost:  2.00  Resp: 2.00  Degree: 0
      Cost_io: 2.00  Cost_cpu: 19421
      Resp_io: 2.00  Resp_cpu: 19421
  Best:: AccessPath: TableScan
         Cost: 2.00  Degree: 1  Resp: 2.00  Card: 82.00  Bytes: 0

 

 


Table Stats::
  Table: T2  Alias: T2  (NOT ANALYZED) ~~~~~~~~~~~没统计信息(按公式算的话oracle 自己决定num_distinct 为100)
    #Rows: 82  #Blks:  1  AvgRowLen:  100.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)  NO STATISTICS (using defaults)
    AvgLen: 22.00 NDV: 3 Nulls: 0 Density: 0.39024
  Table: T2  Alias: T2    
    Card: Original: 82  Rounded: 1  Computed: 0.82  Non Adjusted: 0.82
  Access Path: TableScan
    Cost:  2.00  Resp: 2.00  Degree: 0
      Cost_io: 2.00  Cost_cpu: 23521
      Resp_io: 2.00  Resp_cpu: 23521
  Best:: AccessPath: TableScan
         Cost: 2.00  Degree: 1  Resp: 2.00  Card: 0.82  Bytes: 0
````````````````````````````````````````````````````````````````````
一个问题
关于 列中 值 严重不均匀


create table t3(a int)
SQL> ed
已写入 file afiedt.buf

  1  declare
  2  begin
  3  for i in 1..1000 loop
  4  insert into t3 values(1);
  5  end loop;
  6  for i in 2..9001 loop
  7  insert into t3 values(i);
  8  end loop;
  9  commit;
 10* end;
 11  /

PL/SQL 过程已成功完成。

SQL> execute dbms_stats.gather_table_stats('SYS','T3');

PL/SQL 过程已成功完成。

SQL> select count(*) from t3;

  COUNT(*)
----------
     10000

SQL> select num_rows from user_tables where table_name='T3';

  NUM_ROWS
----------
     10000

SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T3';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T3                             .000111099          0         9001

SQL> select count(*) from t3 where a=1;

  COUNT(*)
----------
      1000

从上面信息可以看到 ~~列A 有10000ROWS  其中A=1的 1000 ROWS, A=2 到9001 的有9000 ROW
这样NUM_DISTINCT=9001


SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。
SQL> select count(*) from t3 where a=1;


执行计划
----------------------------------------------------------
Plan hash value: 463314188

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T3   |     1 |     3 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1)

可以看到CARD 非常不准确 还是按公式 num_rows*1/num_distinct算的
SQL> select 10000*1/9001 from dual;

10000*1/9001
------------
  1.11098767

***********************
Table Stats::
  Table: T3  Alias: T3
    #Rows: 10000  #Blks:  23  AvgRowLen:  3.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 9001 Nulls: 0 Density: 1.1110e-004 Min: 1 Max: 9001
  Table: T3  Alias: T3    
    Card: Original: 10000  Rounded: 1  Computed: 1.11  Non Adjusted: 1.11
  Access Path: TableScan
    Cost:  7.14  Resp: 7.14  Degree: 0
      Cost_io: 7.00  Cost_cpu: 2163793
      Resp_io: 7.00  Resp_cpu: 2163793
  Best:: AccessPath: TableScan
         Cost: 7.14  Degree: 1  Resp: 7.14  Card: 1.11  Bytes: 0~~~~~~~~~~~~~~~~~~看到trace 中card =1.11 ,实际应该为1000
我们可以用HISTOGREAM来解决

  1  begin
  2     dbms_stats.gather_table_stats(
  3             'SYS',
  4             't3',
  5             cascade => true,
  6             estimate_percent => null,
  7             method_opt => 'for all columns size 120'
  8     );
  9* end;
SQL> /

PL/SQL 过程已成功完成。

 

使用HISTOGRAM收集统计信息后 ORACLE 将使用DENSITY 来计算


SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T3';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T3                                  .0001          0         9001


SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t3 where a=1;

执行计划
----------------------------------------------------------
Plan hash value: 463314188

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T3   |   917 |  2751 |     7   (0)| 00:00:01 |~~~~~~~~~~可以看到CARD接近了=917
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1)

SQL> select count(*) from t3 where a=2;

执行计划
----------------------------------------------------------
Plan hash value: 463314188

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T3   |     1 |     3 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=2)

SQL>
上面可以看出 A=1时候CARD=917 ORACLE 并没有使用density来计算, A=2时候用的DENSITY计算的 card=10000*0.0001=1
针对a=1 oracle 使用的 histogram bucket来计算的

SQL> select endpoint_number,endpoint_value from dba_histograms where table_name=
'T3' and rownum<5;

ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
             11              1
             12              9
             13             93
             14            177

可以看出histogram是一个 高度均衡的(HISTOGRAM分2种 第一中 频率histogram 每个bucket存一个值 当列的histogram bucket >=列distinct时候为频率 ,另一种高度均衡

HISTOGRAM
当列DISTINCT比 histogram bucket多 时为高度均衡 每个bucket存相同数量的值)
此例为高度均衡 可以看出来 1-11 BUCKET 存 值1,其实第12个BUCKET中也存值1, 因为12-13中可以看出每个BUCKET 存 84个值(93-9)(此例中除值1外 其他都唯一,所以很好看

出来),而 第12个 BUCKET 最高存的值为9  这样表示 第12个BUCKET还存了  值1(应该是76 =84-8 ,8是2到9的个数) ,oracle 发现 值1 是一个高频率出现的值(跨越的多个

桶),oracle将采用BUCKET计算11/120(跨越bucket数/总bucket数)=.091666667 从这里看到了虽然BUCKET12中也有值1 但ORACLE没算进来(因为该bucket中还有其它值2-8) 用

11/120 这也就造成了CARD只是接近而不是准确的1000
关于histogram 后面将会有与histogram详细的实验~~
card=10000*.091666667= 916.66667
看下trace


SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 9001 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 9001
    Histogram: HtBal  #Bkts: 120  UncompBkts: 120  EndPtVals: 110
  Table: T3  Alias: T3    
    Card: Original: 10000  Rounded: 917  Computed: 916.67  Non Adjusted: 916.67
  Access Path: TableScan
    Cost:  7.14  Resp: 7.14  Degree: 0
      Cost_io: 7.00  Cost_cpu: 2163793
      Resp_io: 7.00  Resp_cpu: 2163793
  Best:: AccessPath: TableScan
         Cost: 7.14  Degree: 1  Resp: 7.14  Card: 916.67  Bytes: 0 ************ Card: 916.67
***************************************

 


~列中有NULL的情况

SQL> create table t6 (a int);

表已创建。


  1  declare
  2    begin
  3     for i in 1..1000 loop
  4      insert into t6 values(1);
  5    end loop;
  6     for i in 2..9001 loop
  7     insert into t6 values(i);
  8     end loop;
  9     commit;
 10*  end;
 11  /

PL/SQL 过程已成功完成。

 

SQL> update t6 set a=null where a>8001;

已更新1000行。

SQL> commit;

提交完成。

SQL> execute dbms_stats.gather_table_stats('SYS','T6');

PL/SQL 过程已成功完成。

 

SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T6';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T6                             .000124984       1000         8001

SQL> select num_rows from user_tables where table_name='T6';

  NUM_ROWS
----------
     10000
可以看出工10000ROWS 空值为1000


SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t6 where a=2;

执行计划
----------------------------------------------------------
Plan hash value: 4096694858

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T6   |     1 |     3 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=2)

 

包含NULL的 选择率按以下公式算
Adjusted selectivity = Base selectivity * (num_rows - num_nulls) / num_rows

=1/8001*((10000-1000)/10000)

card = NUM_ROWS*Adjusted selectivity=10000*(/8001*((10000-1000)/10000))

SQL> select 1/8001*((10000-1000)/10000) from dual;

1/8001*((10000-1000)/10000)
---------------------------
                 .000112486

SQL> select 10000*0.000112486 from dual;

10000*0.000112486
-----------------
          1.12486


SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 8001 Nulls: 1000 Density: 1.2498e-004 Min: 1 Max: 8001
  Table: T6  Alias: T6    
    Card: Original: 10000  Rounded: 1  Computed: 1.12  Non Adjusted: 1.12
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142429
      Resp_io: 6.00  Resp_cpu: 2142429
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 1.12  Bytes: 0====================card 1.12

 

 


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


关于使用BIND
SQL> set autotrace traceonly explain
SQL> set autotrace off
SQL> variable a number
SQL> set autotrace traceonly explain
SQL> select count(*) from t6 where  a=:a;*********************

执行计划
----------------------------------------------------------
Plan hash value: 4096694858

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T6   |     1 |     3 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=TO_NUMBER(:A))


对于BIND  所使用的 选择率=  COLUMN=字面值时候的选择率(字面值情况也要分是否列中有NULL来计算,此例中 列中有NULL)
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T6';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T6                             .000124984       1000         8001~~~~~~~~~~~~~~~~~包含NULL

SQL> select num_rows from user_tables where table_name='T6';

  NUM_ROWS
----------
     10000


Adjusted selectivity = Base selectivity * (num_rows - num_nulls) / num_rows~~~~~~~~~~~~~~~调整选择率

=1/8001*((10000-1000)/10000)
BIN选择率=Adjusted selectivity(是字面值时候 情况的选择率)
card = NUM_ROWS*Adjusted selectivity=10000*(/8001*((10000-1000)/10000))

SQL> select 1/8001*((10000-1000)/10000) from dual;

1/8001*((10000-1000)/10000)
---------------------------
                 .000112486

SQL> select 10000*0.000112486 from dual;

10000*0.000112486
-----------------
          1.12486
trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 8001 Nulls: 1000 Density: 1.2498e-004 Min: 1 Max: 8001
  Table: T6  Alias: T6    
    Card: Original: 10000  Rounded: 1  Computed: 1.12  Non Adjusted: 1.12
  Access Path: TableScan
    Cost:  6.21  Resp: 6.21  Degree: 0
      Cost_io: 6.00  Cost_cpu: 3142429
      Resp_io: 6.00  Resp_cpu: 3142429
  Best:: AccessPath: TableScan
         Cost: 6.21  Degree: 1  Resp: 6.21  Card: 1.12  Bytes: 0
***************************************

 

 

 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
使用INLIST
SQL> create table t4 (a int);

表已创建。

SQL> declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t4 values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL 过程已成功完成。

SQL> execute dbms_stats.gather_table_stats('SYS','T4');

PL/SQL 过程已成功完成。

SQL> select num_rows from  user_tables where table_name='T4';

  NUM_ROWS
----------
     10000

SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T4';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T4                                  .0001          0        10000
SQL> col table_name format a10
SQL> select table_name,density,num_nulls,num_distinct,HIGH_VALUE from user_tab_c
ol_statistics where table_name='T4';

TABLE_NAME    DENSITY  NUM_NULLS NUM_DISTINCT HIGH_VALUE
---------- ---------- ---------- ------------ --------------------
T4              .0001          0        10000 C302


SQL> variable a number
SQL> variable b number
SQL> set autotrace traceonly explain
SQL>  alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t4 where a in (1,2);~~~~~~~~~2个

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     2 |     6 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1 OR "A"=2)

SQL> select count(*) from t4 where a in (1,2,2);~~~~~~~~~~~~含重复值得

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     2 |     6 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1 OR "A"=2)

SQL> select count(*) from t4 where a in (1,2,3);~~~~~~~~~~3个

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     3 |     9 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1 OR "A"=2 OR "A"=3)

SQL> select count(*) from t4 where a in (1,2,2,null);~~~~~~~包含空值

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     3 |     9 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1 OR "A"=2 OR "A"=TO_NUMBER(NULL))

SQL> select count(*) from t4 where a in (1,100052);~~~~~~~~~超过 该列的最大值(HIGHT_VALUE)

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     2 |     6 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1 OR "A"=100052)

SQL> select count(*) from t4 where a in (:a,:b);~~~~~~~~~~使用变量

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     2 |     6 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=TO_NUMBER(:A) OR "A"=TO_NUMBER(:B))


Moreover, when you check the values for 9i and 10g, you see
that the cardinalities always come out as N * number of entries in the list until the number of
entries in the list exceeds the number of distinct values
结论:关于inlist card= n*(num_rows*1/num_distinct)  (N 为中  inlist (数目 ))
oracle可以判断出重复值,~~检查不出是否超出最大值,是否含NULL

 

 


另外INLIST (N,N)中N的数目超过列的Num_distinct数据那么 card=num_distinct*(num_rows*1/num_distinct)=num_rows
早期8I的 IN LIST 计算 是按OR算的
in (1,2,3)= 1 or 2 or 3    8I 是这么算的 选择率
sel(A or B or C) =
sel(A) + sel(B) + sel(C) – Sel(A)sel(B) – Sel(B)sel(C) – sel(C)sel(A) + Sel(A)Sel(B)Sel(C)

8I INLIST CARD=NUM_ROWS*1/(sel(A) + sel(B) + sel(C) – Sel(A)sel(B) – Sel(B)sel(C) – sel(C)sel(A) + Sel(A)Sel(B)Sel(C))****

~~~~~~~~~~~~~~~~~~~~~~~NOT IN
SQL> select count(*) from t4 where a in (1,2,2);用HINTS USE_CONCAT那么可以将 OR操作转换为 UNION ALL 操作(3个SELECT  FROM X WHERE =1 union all=2 union all=3)

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |     2 |     6 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"=1 OR "A"=2)

SQL> select count(*) from t4 where a not in (1,2);

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |  9998 | 29994 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<>1 AND "A"<>2)

SQL> select count(*) from t4 where a not in (1,2,2);

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |  9998 | 29994 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<>1 AND "A"<>2)

SQL> select count(*) from t4 where a not in (1,2,3);

执行计划
----------------------------------------------------------
Plan hash value: 405148644

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T4   |  9997 | 29991 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<>1 AND "A"<>2 AND "A"<>3)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~可以看出是自相容的 card=NUM_ROWS-n*(num_rows*1/num_distinct)


关于like*************

SQL> set autotrace traceonly explain
SQL> select count(*) from t6 where  a like '%1';

执行计划
----------------------------------------------------------
Plan hash value: 4096694858

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T6   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("A") LIKE '%1')

SQL> select count(*) from t6 where  a like '1%';

执行计划
----------------------------------------------------------
Plan hash value: 4096694858

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T6   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("A") LIKE '1%')

SQL> select count(*) from t6 where  a like '1%';

执行计划
----------------------------------------------------------
Plan hash value: 4096694858

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------ss----------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T6   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("A") LIKE '1%')

SQL> set autotrace off
SQL> select count(*) from t6 where  a like '1%';

  COUNT(*)
----------
      2110

SQL>

选择率 应该是5%   CARD=NUM_ROW*5%=500

另外 LIKE %a 是不走INDEX 的,而 like a% 是走index的   针对 LIKE%A 不 走 , 可以建一个reverse(X) 的函数index 查询时候 使用reverse(XX) like '%a' 可以走这个函数index

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12020513/viewspace-614301/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2009-06-26

  • 博文量
    182
  • 访问量
    426902