ITPub博客

首页 > Linux操作系统 > Linux操作系统 > oracle实验记录 (oracle单表选择率与基数计算(2))

oracle实验记录 (oracle单表选择率与基数计算(2))

原创 Linux操作系统 作者:fufuh2o 时间:2009-09-12 21:23:32 0 删除 编辑

 

在查询中使用区间谓词 选择率和基数card计算

先开始<,>,>=,<= 这几种无边界情况(所谓无边界 就是一侧没有边界 )
SQL> create table t5 (a int);

表已创建。

SQL> declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t5 values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL 过程已成功完成。

SQL> execute dbms_stats.gather_table_stats('SYS','T5');

PL/SQL 过程已成功完成。
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T5';

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T5                                  .0001          0        10000
SQL> select num_rows from dba_tables where table_name='T5';

  NUM_ROWS
----------
     10000

SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t5 where a>2;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |  9999 | 29997 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">2)

SQL> select count(*) from t5 where a>=2;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   | 10000 | 30000 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">=2)

SQL> select count(*) from t5 where a<2;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     1 |     3 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<2)

SQL> select count(*) from t5 where a<=2;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     2 |     6 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<=2)

>=,<=是闭区间, ><是 开区间

分析a>2 是一个无边 开区间(所谓无边界 就是一侧没有边界 ,a>10 and a<100就是一个有边界的 两边开区间)
a>2选择率=(high_value – limit) / (high_value – low_value)


SQL> select (10000-2)/(10000-1) from dual;

(10000-2)/(10000-1)
-------------------
          .99989999
card=num_rows*选择率
SQL> select  10000*.99989999 from dual;

10000*.99989999
---------------
      9998.9999~~~~~~~~~~~~~~~可以看到10gr2是向上舍入(ceil)

trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 9999  Computed: 9999.00  Non Adjusted: 9999.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142429
      Resp_io: 6.00  Resp_cpu: 2142429
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 9999.00  Bytes: 0

 

分析a>= 是一个无边界,闭区间 ,闭区间的特点就是必须包含闭合值的行(2)
选择率=(high_value – limit) / (high_value – low_value) + 1/num_distinct
SQL> select (10000-2)/(10000-1)+1/10000 from dual;

(10000-2)/(10000-1)+1/10000
---------------------------
                  .99999999

card=10000*.99999999=10000

SQL> select ceil(10000*.99999999) from dual;

CEIL(10000*.99999999)
---------------------
                10000
trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 10000  Computed: 10000.00  Non Adjusted: 10000.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142429
      Resp_io: 6.00  Resp_cpu: 2142429
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 10000.00  Bytes: 0


 分析a<2无边界 开区间
选择率=(limit – low_value) / (high_value – low_value)**********
card=num_rows*选择率
SQL> select (2-1)/(10000-1) from dual
  2  ;

(2-1)/(10000-1)
---------------
      .00010001

SQL> select 10000*.00010001 from dual;

10000*.00010001
---------------
         1.0001

trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142429
      Resp_io: 6.00  Resp_cpu: 2142429
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 1.00  Bytes: 0


分析a<=2 无边界,闭区间 需要将闭合值的行加进来
选择率=(limit – low_value) / (high_value – low_value)+1/num_distinct

card=num_rows*选择率
SQL> select (2-1)/(10000-1)+1/10000 from dual
  2  ;

(2-1)/(10000-1)+1/10000
-----------------------
              .00020001

SQL> select 10000*.00020001 from dual
  2  ;

10000*.00020001
---------------
         2.0001

trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 2  Computed: 2.00  Non Adjusted: 2.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142429
      Resp_io: 6.00  Resp_cpu: 2142429
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 2.00  Bytes: 0

 

*******************************
接下来看between and, a>x and a=x and a=x and a<=y ,a>x and a<=y 这几种类型 有边界情况


SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t5 where a between 2 and 10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |    10 |    30 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<=10 AND "A">=2)

SQL> select count(*) from t5 where a>2 and a<10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     8 |    24 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<10 AND "A">2)

 

SQL> select count(*) from t5 where a>=2 and a<10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     9 |    27 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<10 AND "A">=2)

SQL> select count(*) from t5 where a>2 and a<=10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     9 |    27 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<=10 AND "A">2)

SQL> select count(*) from t5 where a>=2 and a<=10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |    10 |    30 |     6   (0)| 00:00:01 |~~~~~~~~~看到与 between and card一样
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<=10 AND "A">=2)


分析 a between 2 and 10   就等于>= and <=  
选择率=(hight_limit-low_limit)/(hight_value-low_value) +1/num_distinct+1/num_distinct(因为是2个闭区间 所以2个闭合值行都要加进来)
card=num_rows*选择率
SQL> select (10-2)/(10000-1)+1/10000+1/10000 from dual;

(10-2)/(10000-1)+1/10000+1/10000
--------------------------------
                       .00100008

SQL> select 10000*.00100008 from dual;

10000*.00100008
---------------
        10.0008

between and 等同于 〉=and<= 所以  a>=2 and a<=10  card也等于10 计算方法也是一样的
 
trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 10  Computed: 10.00  Non Adjusted: 10.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142929
      Resp_io: 6.00  Resp_cpu: 2142929
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 10.00  Bytes: 0

 

分析a>2 and a<10  有边界,两边都开区间
选择率=(hight_limit-low_limit)/(hight_value-low_value)
card=num_rows*选择率
SQL> select (10-2)/(10000-1) from dual;

(10-2)/(10000-1)
----------------
       .00080008

SQL> select 10000*((10-2)/(10000-1)) from dual;

10000*((10-2)/(10000-1))
------------------------
              8.00080008
trace
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 8  Computed: 8.00  Non Adjusted: 8.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142879
      Resp_io: 6.00  Resp_cpu: 2142879
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 8.00  Bytes: 0


分析 a>=2 and a<10 ,a>2 and a<=10 这两个都是 有边界,半开区间(有一闭合的,所以要加上闭合值的行(闭合值分别是 2和10))
所以他们公式相同~ card相同

选择率=(hight_limit-low_limit)/(hight_value-low_value)+1/num_distinct
card=num_rows*选择率

SQL> select (10-2)/(10000-1)+1/10000 from dual;

(10-2)/(10000-1)+1/10000
------------------------
               .00090008

SQL> select 10000*((10-2)/(10000-1)+1/10000) from dual;

10000*((10-2)/(10000-1)+1/10000)
--------------------------------
                      9.00080008
SINGLE TABLE ACCESS PATH
  Column (#1): A(NUMBER)
    AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
  Table: T5  Alias: T5    
    Card: Original: 10000  Rounded: 9  Computed: 9.00  Non Adjusted: 9.00
  Access Path: TableScan
    Cost:  6.14  Resp: 6.14  Degree: 0
      Cost_io: 6.00  Cost_cpu: 2142929
      Resp_io: 6.00  Resp_cpu: 2142929
  Best:: AccessPath: TableScan
         Cost: 6.14  Degree: 1  Resp: 6.14  Card: 9.00  Bytes: 0

 


针对bind 变量情况

SQL> variable a number
SQL> variable b number
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t5 where a>:a;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">TO_NUMBER(:A))

SQL> select count(*) from t5 where a>=:a;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">=TO_NUMBER(:A))

SQL> select count(*) from t5 where a<:a;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"

SQL> select count(*) from t5 where a<=:a;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |   500 |  1500 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<=TO_NUMBER(:A))

SQL> select count(*) from t5 where a between :a and :b;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |    25 |    75 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)<=TO_NUMBER(:B))
   3 - filter("A">=TO_NUMBER(:A) AND "A"<=TO_NUMBER(:B))

SQL> select count(*) from t5 where a>:a and a<:b;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |    25 |    75 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)   3 - filter("A">TO_NUMBER(:A) AND "A"

SQL> select count(*) from t5 where a>=:a and a<=:b;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |    25 |    75 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)<=TO_NUMBER(:B))
   3 - filter("A">=TO_NUMBER(:A) AND "A"<=TO_NUMBER(:B))

SQL> select count(*) from t5 where a>:a and a<=:b;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |    25 |    75 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)   3 - filter("A">TO_NUMBER(:A) AND "A"<=TO_NUMBER(:B))

 


SQL> select count(*) from t5 where a>=:a and a<:b;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |    25 |    75 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)   3 - filter("A">=TO_NUMBER(:A) AND "A"


分析以上可以看出 规则很简单

只是按 是否具有边界计算的
无边界(<,>,>=,<=) 选择率为5%
card=num_rows*选择率
SQL> select 10000*.05 from dual
  2  ;

 10000*.05
----------
       500
有边界 (between and,> and <,>= and<= ,> and<= ,>= and<)
选择率=5%*5%
card=num_rows*选择率
SQL> select 10000*.0025 from dual
  2  ;

10000*.0025
-----------
         25

 

其它情况

首先是超过边界情况
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';

会话已更改。

SQL> select count(*) from t5 where a>10001;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     1 |     3 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">10001)


分析超过边界的时候oracle 按边 字面值来计算 
选择率=1/num_distinct
card=10000*1/10000=1


SQL> select count(*) from t5 where a>2 and a<=10002;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |  9999 | 29997 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">2 AND "A"<=10002)

这个oracle检测出 半部分超出边界
选择率=(hight_limit-low_limit)/(hight_value-low_value)+1/num_distinct  oracle并没有用这个公式
SQL> select (10002-2)/(10000-1)+1/10000 from dual;

(10002-2)/(10000-1)+1/10000
---------------------------
                 1.00020001~~~~~~~~~~~~~~~~~~~~~~没这样计算 返回的card当然也不对(10000*1.00020001)

而是只用了左边 a>2的来计算
(high_value – limit) / (high_value – low_value)

SQL> select 10000*((10000-2)/(10000-1)) from dual;

10000*((10000-2)/(10000-1))
---------------------------
                  9998.9999           CARD*选择率

 

SQL> select count(*) from t5 where a>=2 and a<=10002;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   | 10000 | 30000 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">=2 AND "A"<=10002)

 

这个一样 只用了左边 右边超过边界没算
a>=2
(high_value – limit) / (high_value – low_value)+1/num_distinct
SQL> select 10000*((10000-2)/(10000-1)+1/10000) from dual;

10000*((10000-2)/(10000-1)+1/10000)
-----------------------------------
                          9999.9999~~~~~~~~~~~~~~~~~card


SQL> select count(*) from t5 where a>=-3 and a<=10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |    10 |    30 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<=10 AND "A">=(-3))

oracle 发现左边 边界太小
只计算了 a<=10

选择率=(limit – low_value) / (high_value – low_value)+1/num_distinct

SQL> select 10000*((10-1)/(10000-1)+1/10000) from dual;

10000*((10-1)/(10000-1)+1/10000)
--------------------------------
                      10.0009001

 


SQL> select count(*) from t5 where a>=-3 and a<10;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     9 |    27 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"<10 AND "A">=(-3))

 

同上边一样 发现边界不对,只用了
a<10
选择率=(limit – low_value) / (high_value – low_value)
SQL> select 10000*(10-1)/(10000-1) from dual;

10000*(10-1)/(10000-1)
----------------------
            9.00090009

 

 

 

SQL> select count(*) from t5 where a>10001 and a<=10002;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |     1 |     3 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">10001 AND "A"<=10002)

当超出了 边界 2边 那么 card=num_rows*1/num_distinct=1

 


另一种情况 or

SQL> select count(*) from t5 where a>3 or a<10 ;

执行计划
----------------------------------------------------------
Plan hash value: 1231860717

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T5   |  9998 | 29994 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 

or的 选择率公式

a>3+a<10-(a>3*a<10)
a>3选择率=(high_value – limit) / (high_value – low_value)=.99979998
a<10选择率=(limit – low_value) / (high_value – low_value)=.00020002
a>3*a<10=.00019998

SQL> select (10000-3)/(10000-1) from dual;

(10000-3)/(10000-1)
-------------------
          .99979998

SQL> select (3-1)/(10000-1) from dual;

(3-1)/(10000-1)
---------------
      .00020002

SQL> select .99979998*.00020002 from dual;

.99979998*.00020002
-------------------
          .00019998

 

SQL> select .99979998+.00020002-.00019998 from dual;

.99979998+.00020002-.00019998
-----------------------------
                    .99980002

SQL> select 10000*.99980002 from dual;

10000*.99980002
---------------
      9998.0002~~~~~~~~~~~~~~~~~~~card

3个or 是这样的
sel(A) + sel(B) + sel(C) – Sel(A)sel(B) – Sel(B)sel(C) – sel(C)sel(A) + Sel(A)Sel(B)Sel(C)

如果要都是 bind
5% (column < :b1) +5% (column > :b2) -0.25% (column < :b1 and column > :b2) =9.75%

 

混合类~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~简单看下下面类型
SQL> variable a number
SQL> set autotrace traceonly explain
SQL> select count(*) from t5 where a>2 and a<:a;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |   500 |  1500 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(2   3 - filter("A"2)

SQL> select count(*) from t5 where a>100 and a<:a;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |   495 |  1485 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(100   3 - filter("A"100)

SQL> select count(*) from t5 where a>:a and a<100;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |     5 |    15 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)<100)
   3 - filter("A"<100 AND "A">TO_NUMBER(:A))

SQL> select count(*) from t5 where a>:a and a<=100;

执行计划
----------------------------------------------------------
Plan hash value: 4287340780

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     3 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| T5   |     5 |    15 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:A)<100)
   3 - filter("A"<=100 AND "A">TO_NUMBER(:A))

上面的 计算采用的是  双谓词计算
选择率=谓词1 选择率+谓词2选择率

a>2 and a<:a;
A>2无边界开区间 选择率 =(high_value – limit) / (high_value – low_value)=.99989999
SQL> select (10000-2)/(10000-1) from dual;

(10000-2)/(10000-1)
-------------------
          .99989999
a<:a=5%
所以a>2 and a<:a =.049995
SQL> select 0.05*.99989999 from dual;

0.05*.99989999
--------------
       .049995

CARD=NUM_ROWS*选择率
SQL> select 10000*.049995 from dual;

10000*.049995
-------------
       499.95


a>:a and a<=100
A<=100是一个无边界 闭区间 选择率=(limit-low_value) / (high_value – low_value)+1/num_distinct=.01000099

SQL> select (100-1)/(10000-1)+1/10000 from dual;

(100-1)/(10000-1)+1/10000
-------------------------
                .01000099

a>:a选择率为5%

a>:a and a<=100 选择率=.00050005
SQL> select 0.05*((100-1)/(10000-1)+1/10000) from dual;

0.05*((100-1)/(10000-1)+1/10000)
--------------------------------
                       .00050005
card=下面值

SQL> select 10000*(0.05*((100-1)/(10000-1)+1/10000)) from dual;

10000*(0.05*((100-1)/(10000-1)+1/10000))
----------------------------------------
                              5.00049505

SQL>

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12020513/viewspace-614518/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2009-06-26

  • 博文量
    182
  • 访问量
    427019