ITPub博客

首页 > Linux操作系统 > Linux操作系统 > nested loop join探讨

nested loop join探讨

原创 Linux操作系统 作者:lsq_008 时间:2009-01-05 20:44:54 0 删除 编辑

概述:
本测试创建两个测试表,t1中有记录10万条,t2中记录10条。分别创造不同的条件,进行nested loop join,并比较不同条件下产生的逻辑读数量。
在测试中均加了相应的hints,以避开优化器自动产生的执行计划。

1. 创建测试表t1,t2

SQL> create table t1(a varchar2(10),b number(10));
 
Table created.
 
SQL> create table t2(c number(10),d varchar2(10));
 
Table created.

2. 插入记录,大表t1,记录数100000条,小表t2,记录数10条

SQL> begin
  2  for i in 1 .. 100000 loop
  3  insert into t1 values(to_char(i),i);
  4  end loop;
  5  commit;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
SQL> begin
  2  for i in 1 .. 10 loop
  3  insert into t2 values(i,to_char(i));
  4  end loop;
  5  commit;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.

3.以t1表上的b字段和t2表上的c字段做关联,测试不同条件下的nest loop join对资源的消耗。
(1)b和c字段上均无索引,且以t1为驱动表。

SQL> select /*+ use_nl(t1,t2) ordered */ * from t1,t2 where t1.b=t2.c;
 
 。。。。
10 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1967407726
 
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    10 |   400 | 28598   (1)| 00:05:44 |
|   1 |  NESTED LOOPS      |      |    10 |   400 | 28598   (1)| 00:05:44 |
|   2 |   TABLE ACCESS FULL| T1   |   105K|  2052K|    63   (2)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |    20 |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("T1"."B"="T2"."C")
 
Note
-----
   - dynamic sampling used for this statement
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     300228  consistent gets
          0  physical reads
          0  redo size
        732  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

(2)b和c字段上均无索引,且以t2为驱动表。

SQL> select /*+ use_nl(t1,t2) ordered */ * from  t2,t1 where t1.b=t2.c;
 
 。。。。。
10 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 4016936828
 
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    10 |   400 |   619   (3)| 00:00:08 |
|   1 |  NESTED LOOPS      |      |    10 |   400 |   619   (3)| 00:00:08 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   200 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T1   |     1 |    20 |    62   (4)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("T1"."B"="T2"."C")
 
Note
-----
   - dynamic sampling used for this statement
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2265  consistent gets
          0  physical reads
          0  redo size
        732  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed


从以上两种情况可以看出,虽然两个表上均没有索引,都是采用全表扫描的方式进行连接,当选用不同的驱动表时,逻辑读数量是相差很大的,显然,选择小表作为驱动表,产生的io次数较少。

(3)在表t1的b字段上创建索引,且以t2表为驱动表,再次执行查询:
SQL>  select /*+ use_nl(t1,t2) ordered */ * from  t2,t1 where t1.b=t2.c;

 。。。。
10 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3405354859
 
-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |    10 |   400 |    12   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |          |    10 |   400 |    12   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL          | T2       |    10 |   200 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1       |     1 |    20 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | IDX_B_T1 |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("T1"."B"="T2"."C")
 
Note
-----
   - dynamic sampling used for this statement
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         26  consistent gets
          0  physical reads
          0  redo size
        732  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed
显然,这是一种有效的查询方式,以小表为驱动表,走全表扫描,大表上有索引,这将产生最小的逻辑读数量。

(4)删除步骤3中创建的索引,在表t2的c列上创建索引,且以t1表为驱动表,执行查询:

SQL> select /*+ use_nl(t1,t2) ordered */ * from  t1,t2 where t1.b=t2.c;
 
。。。。
10 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2868788451
 
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |    10 |   400 |    81  (12)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2       |     1 |    20 |     1   (0)| 00:00:01 |
|   2 |   NESTED LOOPS              |          |    10 |   400 |    81  (12)| 00:00:01 |
|   3 |    TABLE ACCESS FULL        | T1       |   105K|  2052K|    63   (2)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN         | IDX_C_T2 |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("T1"."B"="T2"."C")
 
Note
-----
   - dynamic sampling used for this statement
 
 
Statistics
----------------------------------------------------------
          7  recursive calls
          0  db block gets
        309  consistent gets
          0  physical reads
          0  redo size
        732  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         10  rows processed

通过上述4次测试,可以验证,两个表进行连接时,当满足如下条件时,适合采用nested loop join的方式。
(1)两个表中有个一个大表,一个小表
(2)在大表的连接字段的列上建有索引。

最后,再看一下优化器是产生什么样的执行计划:

SQL>  select * from t1,t2 where t1.b=t2.c;
 
。。。。
10 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2577780516
 
------------------------------------------------
| Id  | Operation                   | Name     |
------------------------------------------------
|   0 | SELECT STATEMENT            |          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1       |
|   2 |   NESTED LOOPS              |          |
|   3 |    TABLE ACCESS FULL        | T2       |
|*  4 |    INDEX RANGE SCAN         | IDX_B_T1 |
------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("T1"."B"="T2"."C")
 
Note
-----
   - rule based optimizer used (consider using cbo)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         19  consistent gets
          0  physical reads
          0  redo size
        732  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 惊奇的发现,优化器产生的确实是最优化的执行计划,逻辑读只有19,比上述4种测试方式中最小逻辑读26还要少。

 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/10972173/viewspace-528394/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
十余年大型金融及电信系统数据库管理经验,曾服务于中国建设银行、中国移动。对oracle,mysql数据库有深入了解。 擅长python开发,独立开发了开源数据库自动化监控运维平台Power Monitor。

注册时间:2008-02-29

  • 博文量
    325
  • 访问量
    1241708