ITPub博客

首页 > 数据库 > Oracle > [20180928]避免表达式在sql语句中.txt

[20180928]避免表达式在sql语句中.txt

原创 Oracle 作者:lfree 时间:2018-09-28 20:20:27 0 删除 编辑

[20180928]避免表达式在sql语句中.txt

--//在sql语句中避免表达式是很困难的,但是链接https://blog.jooq.org/2016/11/01/why-you-should-avoid-expressions-in-sql-predicates/
--//提到这么大的差异,我个人还是非常怀疑的,还是以自己的测试为准:

1.环境与建立测试:
SCOTT@test> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- ----------------------------------------------------------------
x86_64/Linux 2.4.xx            10.2.0.4.0     Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

CREATE TABLE payment (
  id            NOT NULL PRIMARY KEY,
  payment_date  NOT NULL,
  text
) AS
SELECT
  level,
  SYSDATE - dbms_random.value(1, 500),
  LPAD ('a', 500, 'a')
--    dbms_random.string('a', 500)
FROM dual
CONNECT BY level <= 50000
ORDER BY dbms_random.value;

--//执行有点慢,可以修改dbms_random.string('a', 500)=>lpad('a',500,'a'),这样可以快一些,不影响测试结果.
 
--//CREATE INDEX i_payment_date ON payment(payment_date);
--//先不建立索引,实际上索引没用,查询范围很大.因为作者使用ORDER BY dbms_random.value,不按日期排序.
--//走全表扫描更能看出问题在那里,因为这样每行都要比较.
 
EXEC dbms_stats.gather_table_stats('SCOTT', 'PAYMENT');

2.建立测试脚本:
$ cat expr.txt
SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 500;
  v_range CONSTANT NUMBER := 470;
 -- v_date CONSTANT DATE := SYSDATE - v_range;
  v_date  DATE ;
BEGIN
  v_ts := SYSTIMESTAMP;

  -- Original query with inline expression
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT *
      FROM payment
      WHERE payment_date < SYSDATE - v_range
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Statement 1 : ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;

  -- Pre-calculated PL/SQL local variable
  FOR i IN 1..v_repeat LOOP
    v_date  := SYSDATE - v_range;
    FOR rec IN (
      SELECT *
      FROM payment
      WHERE payment_date < v_date
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Statement 2 : ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;

  -- Magical 11g scalar subquery caching
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT *
      FROM payment
      WHERE payment_date < (SELECT SYSDATE - v_range FROM dual)
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Statement 3 : ' || (SYSTIMESTAMP - v_ts));
END;
/
--//测试前说明一下,v_repeat原来定义100,我修改500.这样测试差距明显一点点.
--//注:我修改计算v_date应该重复500次.这样测试公平一些.

3.测试:
--//第1次执行使用表达式:
SELECT * FROM payment WHERE payment_date < SYSDATE - v_range
--//第2次执行使用变量.感觉这里有问题,v_date仅仅计算1次.我个人认为作为比较,应该每次要重新计算v_date.
SELECT * FROM payment WHERE payment_date < v_date
--//第3次执行使用标量子查询.
SELECT * FROM payment WHERE payment_date < (SELECT SYSDATE - v_range FROM dual)

--//执行计划应该都是一样的(我没有建立索引,仅仅标量子查询有一点点不同).测试结果:
SCOTT@test> @expr.txt
Statement 1 : +000000000 00:00:16.639160000
Statement 2 : +000000000 00:00:07.997651000
Statement 3 : +000000000 00:00:09.716406000
PL/SQL procedure successfully completed.

SCOTT@test> @expr.txt
Statement 1 : +000000000 00:00:17.143361000
Statement 2 : +000000000 00:00:08.381461000
Statement 3 : +000000000 00:00:10.362664000
PL/SQL procedure successfully completed.

SCOTT@test> @expr.txt
Statement 1 : +000000000 00:00:18.866096000
Statement 2 : +000000000 00:00:09.923500000
Statement 3 : +000000000 00:00:11.905165000
PL/SQL procedure successfully completed.

--//没想到在10g下存在这么大的差异.可以发现使用表达式最慢,而标量子查询能缓存结果,这样看上去第2,第3执行时间相差感觉还是有点大.
--//差别在于多执行500次的SELECT SYSDATE - v_range FROM dual. 感觉这里将近2秒的差距还是有点大.
--//可以看出第2种直接使用变量快一些.
--//不过并不像作者测试那样,使用标量子查询最快,估计三种方式可能有一些使用索引.
--//建立索引重复测试:

SCOTT@test> CREATE INDEX i_payment_date ON payment(payment_date);
Index created.

SCOTT@test> alter system flush shared_pool;
System altered.

SCOTT@test> @expr.txt
Statement 1 : +000000000 00:00:15.888974000
Statement 2 : +000000000 00:00:07.082431000
Statement 3 : +000000000 00:00:04.654471000
PL/SQL procedure successfully completed.

SCOTT@test> @expr.txt
Statement 1 : +000000000 00:00:15.862751000
Statement 2 : +000000000 00:00:07.081248000
Statement 3 : +000000000 00:00:04.524595000

PL/SQL procedure successfully completed.

SCOTT@test> @expr.txt
Statement 1 : +000000000 00:00:15.854954000
Statement 2 : +000000000 00:00:07.104939000
Statement 3 : +000000000 00:00:04.487967000
PL/SQL procedure successfully completed.

--//建立索引后结果与作者测试相近,标量子查询最快.实际上作者没有注意执行计划发生了变化:

SCOTT@test> @ &r/dpc  bnf6jkmr2sh8c ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  bnf6jkmr2sh8c, child number 0
-------------------------------------
SELECT * FROM PAYMENT WHERE PAYMENT_DATE < SYSDATE - :B1
Plan hash value: 684176532
------------------------------------------------------------------------------
| Id  | Operation         | Name    | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| PAYMENT |   3002 |  1503K|   864   (1)| 00:00:11 |
------------------------------------------------------------------------------

SCOTT@test> @ &r/dpc 5tqgbsqwxsjtt ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  5tqgbsqwxsjtt, child number 0
-------------------------------------
SELECT * FROM PAYMENT WHERE PAYMENT_DATE < :B1
Plan hash value: 684176532
------------------------------------------------------------------------------
| Id  | Operation         | Name    | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| PAYMENT |   3002 |  1503K|   862   (1)| 00:00:11 |
------------------------------------------------------------------------------

SCOTT@test> @ &r/dpc bumyan15pbchp ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  bumyan15pbchp, child number 0
-------------------------------------
SELECT * FROM PAYMENT WHERE PAYMENT_DATE < (SELECT SYSDATE - :B1 FROM DUAL)
Plan hash value: 2871123539
-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
-----------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID| PAYMENT        |   2500 |  1252K|   453   (0)| 00:00:06 |
|*  2 |   INDEX RANGE SCAN          | I_PAYMENT_DATE |    450 |       |     3   (0)| 00:00:01 |
|   3 |    FAST DUAL                |                |      1 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
--//450/50000 = .009,真不知道这个比率如何确定的.

--//只有标量子查询的语句走了索引,问题在于估算的返回记录450,这样比较的次数减少许多,所以出现了使用标量子查询CPU消耗时间更
--//少,出现标量子查询更快的假象.
--//不过给承认在10g下确实存在存在表达式执行比较慢的情况.

4.继续11g下测试:
SCOTT@book> @ ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

SCOTT@book> alter session set "_serial_direct_read"=never;
Session altered.
--//避免采用直接路径读.
--//重复测试脚本略,看前面:
--//先不建立索引看看:
SCOTT@book> @ expr.txt
Statement 1 : +000000000 00:00:04.520582000
Statement 2 : +000000000 00:00:03.725714000
Statement 3 : +000000000 00:00:04.309029000
PL/SQL procedure successfully completed.

SCOTT@book> @ expr.txt
Statement 1 : +000000000 00:00:05.054053000
Statement 2 : +000000000 00:00:04.325762000
Statement 3 : +000000000 00:00:05.000122000
PL/SQL procedure successfully completed.

SCOTT@book> @ expr.txt
Statement 1 : +000000000 00:00:06.489944000
Statement 2 : +000000000 00:00:05.142213000
Statement 3 : +000000000 00:00:06.300671000
PL/SQL procedure successfully completed.

--//还是可以发现使用表达式比较慢,不过这个需要量的累积.

--//建立索引重复测试:
CREATE INDEX i_payment_date ON payment(payment_date);
Index created.

SCOTT@book> CREATE INDEX i_payment_date ON payment(payment_date);
Index created.

--//退出再登录:
SCOTT@book> alter system flush shared_pool;
System altered.

SCOTT@book> alter session set "_serial_direct_read"=never;
Session altered.

SCOTT@book> @ expr.txt
Statement 1 : +000000000 00:00:05.703140000
Statement 2 : +000000000 00:00:04.711262000
Statement 3 : +000000000 00:00:05.617969000
PL/SQL procedure successfully completed.

SCOTT@book> @ expr.txt
Statement 1 : +000000000 00:00:05.700170000
Statement 2 : +000000000 00:00:04.738476000
Statement 3 : +000000000 00:00:05.615570000
PL/SQL procedure successfully completed.

SCOTT@book> @ expr.txt
Statement 1 : +000000000 00:00:05.719636000
Statement 2 : +000000000 00:00:04.709207000
Statement 3 : +000000000 00:00:05.611807000
PL/SQL procedure successfully completed.

--//执行计划全部是全表扫描,不再贴出,不像10g那样出现标量子查询语句走索引的情况

5.测试11g使用提示使用索引的情况:
$ cat expr1.txt
SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 500;
  v_range CONSTANT NUMBER := 470;
--  v_date CONSTANT DATE := SYSDATE - v_range;
  v_date  DATE ;
BEGIN
  v_ts := SYSTIMESTAMP;

  -- Original query with inline expression
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT /*+ index(payment i_payment_date) */ *
      FROM payment
      WHERE payment_date < SYSDATE - v_range
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Statement 1 : ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;

  -- Pre-calculated PL/SQL local variable
  FOR i IN 1..v_repeat LOOP
  v_date   := SYSDATE - v_range;
    FOR rec IN (
      SELECT /*+ index(payment i_payment_date) */ *
      FROM payment
      WHERE payment_date < v_date
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Statement 2 : ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;

  -- Magical 11g scalar subquery caching
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT /*+ index(payment i_payment_date) */ *
      FROM payment
      WHERE payment_date < (SELECT SYSDATE - v_range FROM dual)
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Statement 3 : ' || (SYSTIMESTAMP - v_ts));
END;
/

SCOTT@book> alter system flush shared_pool;
System altered.

SCOTT@book> alter session set "_serial_direct_read"=never;
Session altered.

SCOTT@book> @ expr1.txt
Statement 1 : +000000000 00:00:02.385992000
Statement 2 : +000000000 00:00:02.257972000
Statement 3 : +000000000 00:00:02.249990000
PL/SQL procedure successfully completed.

SCOTT@book> @ expr1.txt
Statement 1 : +000000000 00:00:02.197627000
Statement 2 : +000000000 00:00:02.002312000
Statement 3 : +000000000 00:00:01.987825000
PL/SQL procedure successfully completed.

SCOTT@book> @ expr1.txt
Statement 1 : +000000000 00:00:02.279598000
Statement 2 : +000000000 00:00:01.991588000
Statement 3 : +000000000 00:00:01.986305000
PL/SQL procedure successfully completed.

--//你可以发现比较次数少量,3者差距很小.

6.在12c下测试如下:
SCOTT@test01p> @ver1
PORT_STRING                    VERSION        BANNER                                                                               CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0           12.1.0.1.0     Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0

SCOTT@test01p> @ expr.txt
Statement 1 : +000000000 00:00:06.851000000
Statement 2 : +000000000 00:00:06.048000000
Statement 3 : +000000000 00:00:06.812000000
PL/SQL procedure successfully completed.

SCOTT@test01p> @ expr.txt
Statement 1 : +000000000 00:00:06.803000000
Statement 2 : +000000000 00:00:06.187000000
Statement 3 : +000000000 00:00:07.083000000
PL/SQL procedure successfully completed.

--//注:建立不建立索引,在12c先全部是全表扫描.
--//3者差距很小.不过使用表达式还是有点慢,不是很明显.

7.总结:
1.可以看出在sql语句出现表达式最慢的,重点还是在于比较的数量上.不管10g,11g,12c.特别注意的情况是10g差距明显.有2倍的差距.
2.500次,大约有1秒的差距(11g的情况,12c小一些),也就是1000/500=2ms,每次2ms的差距,如果语句大量累积还是很可观的.
3.另外作者10g下测试我猜测一定是标量子查询的语句使用索引,这样比较次数明显减少.
4.给人一种感觉走索引更快,当然我仅仅一个用户执行.
5.我自己还有1个疑问,我在10g,11g的测试环境硬件环境相似.全表扫描执行时间上为什么存在这么大的差异.

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2215206/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
熟悉oracle相关技术,擅长sql优化,rman备份与恢复,熟悉linux shell编程。

注册时间:2008-01-03

  • 博文量
    2673
  • 访问量
    6430652