ITPub博客

首页 > 数据库 > Oracle > [20140209]行迁移和expdp导出.txt

[20140209]行迁移和expdp导出.txt

原创 Oracle 作者:lfree 时间:2014-02-10 11:36:41 0 删除 编辑

[20140209]行迁移和expdp导出.txt

前一阵子与别人聊天,谈到一个系统升级expdp导出很慢,我比较熟悉这个系统,当他说出导出很慢的那张表的时候,
我随口讲不会这个表存在大量的行迁移吧。我记得以前学习oracle,听别人讲课,讲过一句话,如果你看这个系统
的用户模式下所有表的pctfree设置都是10的话,那么这个系统没有dba管理。按照这样看,中国大部分数据库系统
没有dba管理。

我自己以前对行迁移还是比较重视的,当然现在变懒了。我发现许多dba也不是太重视这个问题,慢慢我对这些调整
也不重视。而且那上面提到的那张表,里面存在一个修改过程,把一个标志从'0'=>'1',同时在另外的备注字段保存
修改的日期,保存的日期实际上以字符的形式来保存的,这样内容'2012-02-02 10:10:10'占用的空间19个字符,这
样修改记录长度每条都要增加19个字节。如果原来记录长度很短的情况下,保存在一个数据块的记录数会很多,发
生行迁移的行会非常多。我自己以前对这个表也很重视,总想彻底消除行迁移情况,但是我最终放弃了这个想法,
我发现要彻底消除pctfree要接近40才基本消除行迁移,但是这样带来另外一个副作用就是导致空间的浪费。因为
记录并不总等你达到pctfree的设置才进行修改,因为插入与修改是交互进行的。另外的方法就是使用ALTER TABLE
MINIMIZE RECORDS_PER_BLOCK.参考如下链接:

http://blog.itpub.net/267265/viewspace-763315/

下面做一个测试看看出现大量行迁移时expdp的情况。

1.建立测试例子:
SCOTT@test> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

create table t (id number, name varchar2(20)) pctfree 0;
insert into t
with a as (select /*+ Materialize */ rownum id from dual connect by level<=1000)
select rownum,null from a x,a where rownum<=1e6;

commit ;
update t set name=lpad(id,20,'y');
commit;

--建立CHAINED_ROWS表。
--SQL> analyze table t list chained rows into chained_rows;
--注: 执行前要建立chained_rows表,最好不要放在system表空间.$ORACLE_HOME/rdbms/admin/utlchain.sql.
--truncate table chained_rows;

SCOTT@test> select count(*) from chained_rows;
  COUNT(*)
----------
   1000000
--存在大量的行迁移。

2.使用expdp导出测试:

--CREATE OR REPLACE DIRECTORY DATA_PUMP_DIR AS '/u01/app/oracle11g/admin/test/dpdump/';
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO EXP_FULL_DATABASE;
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO IMP_FULL_DATABASE;
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO SCOTT;

--exec dbms_workload_repository.create_snapshot();
expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
--exec dbms_workload_repository.create_snapshot();

--如果查看select * from v$session_wait where wait_class<>'Idle',可以发现大量的是db file sequential read等待事件。

$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 10:38:30 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01":  scott/******** directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp */
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T"                                28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle11g/admin/test/dpdump/t.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 10:38:55
--测试多次,大约在25秒完成。

--如果我建立一张新表tx,导出看看。
SCOTT@test> create table tx as select * from t;
Table created.

$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=tx  dumpfile=tx.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 10:50:51 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01":  scott/******** directory=DATA_PUMP_DIR tables=tx dumpfile=tx.dmp */
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 35 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."TX"                                28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle11g/admin/test/dpdump/tx.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 10:51:00

--测试多次,大约在10秒完成.

3.使用strace跟踪看看。
看到db file sequential read等待事件,如果执行多次,数据块应该都集中在内存,应该看到的db file sequential read不多,那是什么情况导
致缓慢呢?

$ strace -c -o /tmp/aa1  expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
36.82    0.006311          37       169         2 read
12.45    0.002134          14       158        71 open
10.89    0.001866          30        62           write
  8.61    0.001476          16        95           close
  4.56    0.000782           9        89           mmap
  3.99    0.000683         683         1           clone
  3.61    0.000618          12        53           munmap
  2.67    0.000457           7        64           fstat
  2.38    0.000408          11        38           brk
  1.70    0.000291           7        42           rt_sigaction
  1.64    0.000281           7        38           fcntl
  1.47    0.000252           7        36           lseek
  1.35    0.000232           9        27        18 stat
  1.02    0.000174           9        19           mprotect
  0.88    0.000151         151         1           execve
  0.65    0.000112          16         7           socket
  0.65    0.000111           8        14           gettimeofday
  0.58    0.000100           8        13           rt_sigprocmask
  0.54    0.000093          12         8         6 access
  0.46    0.000079          16         5         4 connect
  0.40    0.000069           7        10           uname
  0.40    0.000068          34         2           getdents64
  0.34    0.000058           7         8           getrlimit
  0.29    0.000050           7         7           getuid
  0.20    0.000035          12         3           recvmsg
  0.18    0.000030          30         1           readlink
  0.16    0.000027           9         3           getcwd
  0.15    0.000025           8         3           futex
  0.13    0.000023          12         2           pipe
  0.13    0.000022          22         1           sendto
  0.12    0.000021          11         2           bind
  0.08    0.000014           7         2           getsockname
  0.08    0.000014           7         2           setrlimit
  0.08    0.000013           7         2           times
  0.08    0.000013          13         1           _sysctl
  0.06    0.000010          10         1           ioctl
  0.04    0.000007           7         1           getppid
  0.04    0.000007           7         1           arch_prctl
  0.04    0.000007           7         1           gettid
  0.04    0.000007           7         1           time
  0.04    0.000007           7         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00    0.017138                   994       101 total

$ strace -c -o /tmp/aa1 expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=tx  dumpfile=tx.dmp
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
33.23    0.004392          26       169         2 read
15.27    0.002018          33        62           write
13.14    0.001737          11       158        71 open
  5.23    0.000691         691         1           clone
  4.84    0.000640           7        89           mmap
  4.72    0.000624           7        95           close
  4.35    0.000575          11        53           munmap
  2.50    0.000330           5        64           fstat
  2.38    0.000315           8        38           brk
  1.65    0.000218           5        42           rt_sigaction
  1.58    0.000209           6        38           fcntl
  1.40    0.000185           5        36           lseek
  1.31    0.000173           6        27        18 stat
  1.18    0.000156         156         1           execve
  1.01    0.000133           7        19           mprotect
  0.63    0.000083          12         7           socket
  0.62    0.000082          16         5         4 connect
  0.61    0.000081          10         8         6 access
  0.57    0.000076           5        14           gettimeofday
  0.54    0.000071           5        13           rt_sigprocmask
  0.49    0.000065          33         2           getdents64
  0.34    0.000045           5        10           uname
  0.30    0.000039           5         8           getrlimit
  0.26    0.000034           5         7           getuid
  0.25    0.000033          33         1           readlink
  0.22    0.000029          10         3           recvmsg
  0.20    0.000026          26         1           sendto
  0.17    0.000023           8         3           getcwd
  0.17    0.000022          11         2           pipe
  0.15    0.000020           7         3           futex
  0.14    0.000018           9         2           bind
  0.08    0.000011           6         2           getsockname
  0.08    0.000011          11         1           _sysctl
  0.08    0.000011           6         2           setrlimit
  0.08    0.000010           5         2           times
  0.06    0.000008           8         1           ioctl
  0.04    0.000005           5         1           arch_prctl
  0.04    0.000005           5         1           gettid
  0.04    0.000005           5         1           time
  0.04    0.000005           5         1           set_tid_address
  0.03    0.000004           4         1           getppid
------ ----------- ----------- --------- --------- ----------------
100.00    0.013218                   994       101 total

--很难看出差异在那里?

4.常见解决方法:
通过上面对比,如果表很大,行迁移很多的情况下,时间差异还是很明显的。
最根本的解决是检查应用,设置大的pctfree来消除行迁移。
move 表空间以及在线重定义表或者ctas都可以解决已经出现的情况。但是非常不合理,特别是表很大的情况下。
一些索引要重建。

如何在这种情况下加快expdp操作呢?我google许多链接查询: Row Migration expdp

http://www.ora-solutions.net/web/2012/09/12/datapump-export-suffering-from-oracle-row-migration/
http://nzdba.wordpress.com/2013/06/30/migration-vs-datapump/

--里面都提到access methods,摘要如下:
With datapump there are 2 different access methods: EXTERNAL_TABLE and DIRECT_PATH. Usually, the datapump utility
decides on it's own which method to use. It turned out that with EXTERNAL_TABLE, the table export takes only 10 minutes
and does not perform these single-block I/O. It only appears with DIRECT_PATH.

--当我使用expdp help=y时,我发现access_method参数并不存在,难道又是某个隐含秘密?我执行如下:

$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t access_method=external_table dumpfile=t_good.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 11:25:46 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01":  scott/******** directory=DATA_PUMP_DIR tables=t access_method=external_table dumpfile=t_good.dmp */
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T"                                 28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle11g/admin/test/dpdump/t_good.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 11:25:56
--测试多次,大约在10秒完成。

5.补充:
--链接http://blog.csdn.net/tianlesoftware/article/details/6090757

Important Need-To-Know's when the parameter ACCESS_METHOD is specified for a job:

    The parameter ACCESS_METHOD is an undocumented parameter and should only be used when requested by Oracle Support.
    If the parameter is not specified, then Data Pump will automatically choose the best method to load or unload the data.
    If import Data Pump cannot choose due to conflicting restrictions, an error will be reported:
    ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."EMP" using client specified AUTOMATIC method
    The parameter can only be specified when the Data Pump job is initially started (i.e. the parameter cannot be specified when the job is restarted).
    If the parameter is specified, the method of loading or unloading the data is enforced on all tables that need to be loaded or unloaded with the job.
    Enforcing a specific method may result in a slower performance of the overall Data Pump job, or errors such as:

...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."MY_TAB" using client specified DIRECT_PATH method ...

总结:
如果行迁移很大,expdp导致加入access_method=external_table可以加快导出的速度。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-1078757/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
熟悉oracle相关技术,擅长sql优化,rman备份与恢复,熟悉linux shell编程。

注册时间:2008-01-03

  • 博文量
    2485
  • 访问量
    6292277