Moving Data

Oracle | Author: thinking_succes | 2014-03-05 13:58:44

Objectives

After completing this lesson, you should be able to:
Describe ways to move data
Create and use directory objects (a setup sketch follows this list)
Use SQL*Loader to load data from a non-Oracle database (or user files)
Use external tables to move data via platform-independent files
Explain the general architecture of Oracle Data Pump
Use Data Pump Export and Import to move data between Oracle databases
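
Data Pump and external tables read and write their files through directory objects, so a minimal setup sketch may be useful here. The path /u01/app/dumpdir, the directory name dp_dir, and the hr grantee are illustrative assumptions, not part of the original lesson:

    -- Create a directory object and let a user read/write through it
    -- (path, name, and grantee are illustrative assumptions)
    CREATE DIRECTORY dp_dir AS '/u01/app/dumpdir';
    GRANT READ, WRITE ON DIRECTORY dp_dir TO hr;

The later sketches in this lesson reuse dp_dir.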


Moving Data: General Architecture

Major functional components:
DBMS_DATAPUMP: Contains the API for the high-speed export and import utilities used for bulk data and metadata movement
Direct Path API (DPAPI): Oracle Database 11g supports a Direct Path API interface that minimizes data conversion and parsing at both unload and load time.
DBMS_METADATA: Used by worker processes for all metadata unloading and loading. Database object definitions are stored using XML rather than SQL.
External Table API: With the ORACLE_DATAPUMP and ORACLE_LOADER access drivers, you can store data in external tables (that is, in platform-independent files). The SELECT statement reads external tables as though they were stored in an Oracle database.
SQL*Loader: Has been integrated with external tables, providing automatic migration of loader control files to external table access parameters
expdp and impdp: Thin layers that make calls to the DBMS_DATAPUMP package to initiate and monitor Data Pump operations
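
Because expdp and impdp are only thin clients over DBMS_DATAPUMP, a typical invocation is a single command line. A minimal sketch (user, password, and file names are illustrative assumptions):

    expdp hr/hr_password DIRECTORY=dp_dir DUMPFILE=hr_schema.dmp LOGFILE=hr_schema.log SCHEMAS=hr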


Oracle Data Pump: Overview

As a server-based facility for high-speed data and metadata movement, Oracle Data Pump:
Is callable via DBMS_DATAPUMP
Provides the following tools:
expdp
impdp
Web-based interface
Provides four data movement methods:
Data file copying
Direct path
External tables
Network link support
Detaches from and reattaches to long-running jobs
Restarts Data Pump jobs

Oracle Data Pump enables very high-speed data and metadata loading and unloading of Oracle databases.
The Data Pump infrastructure is callable via the DBMS_DATAPUMP PL/SQL package.
Thus, custom data movement utilities can be built by using Data Pump.
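
As a minimal sketch of such a custom utility, a schema-mode export can be driven entirely from PL/SQL. The job details (file name hr_custom.dmp, directory DP_DIR, and the HR schema) are illustrative assumptions; the DBMS_DATAPUMP calls themselves are the documented API:

    DECLARE
      h     NUMBER;
      state VARCHAR2(30);
    BEGIN
      -- Open a schema-mode export job
      h := DBMS_DATAPUMP.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA');
      -- Write the dump file through a directory object
      DBMS_DATAPUMP.ADD_FILE(handle => h, filename => 'hr_custom.dmp',
                             directory => 'DP_DIR');
      -- Limit the job to the HR schema
      DBMS_DATAPUMP.METADATA_FILTER(handle => h, name => 'SCHEMA_EXPR',
                                    value => 'IN (''HR'')');
      DBMS_DATAPUMP.START_JOB(h);
      -- Block until the job completes, then report its final state
      DBMS_DATAPUMP.WAIT_FOR_JOB(handle => h, job_state => state);
      DBMS_OUTPUT.PUT_LINE('Job finished with state: ' || state);
    END;
    /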
Oracle Database 11g provides the following tools:
Command-line export and import clients called expdp and impdp, respectively
A Web-based export and import interface that is accessible from Database Control
Data Pump automatically decides the data access methods to use; these can be either direct path or external tables.
Data Pump uses direct path load and unload when a table's structure allows it and when maximum single-stream performance is desired.
However, if there are clustered tables, referential integrity constraints, encrypted columns, or a number of other items, Data Pump uses external tables rather than direct path to move the data.
The ability to detach from and reattach to long-running jobs without affecting the job itself enables you to monitor jobs from multiple locations while they are running.
All stopped Data Pump jobs can be restarted without loss of data as long as the metainformation remains undisturbed.
It does not matter whether the job is stopped voluntarily or involuntarily due to a crash.
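
On the command line, detaching and reattaching is exposed through the ATTACH parameter. A minimal sketch (the job name below is the system-generated default for a schema export, shown as an assumption):

    expdp hr/hr_password ATTACH=SYS_EXPORT_SCHEMA_01

At the interactive Export prompt you can then issue commands such as STATUS, STOP_JOB, and START_JOB.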


Oracle Data Pump: Benefits

Data Pump offers many benefits and some new features over earlier data movement tools, such as:
Fine-grained object and data selection
Explicit specification of database version
Parallel execution
Estimation of export job space consumption
Network mode in a distributed environment
Remapping capabilities
Data sampling and metadata compression
Compression of data during a Data Pump export
Security through encryption
Ability to export XMLType data as CLOBs
Legacy mode to support old import and export files

The EXCLUDE, INCLUDE, and CONTENT parameters are used for fine-grained object and data selection.
You can specify the database version for objects to be moved (using the VERSION parameter) to create a dump file set that is compatible with a previous release of the Oracle database that supports Data Pump.
You can use the PARALLEL parameter to specify the maximum number of threads of active execution servers operating on behalf of the export job.
You can estimate how much space an export job would consume (without actually performing the export) by using the ESTIMATE_ONLY parameter.
Network mode enables you to export from a remote database directly to a dump file set. This can be done by using a database link to the source system.
During import, you can change the target data file names, schemas, and tablespaces.
In addition, you can specify a percentage of data to be sampled and unloaded from the source database when performing a Data Pump export. This can be done by specifying the SAMPLE parameter.
You can use the COMPRESSION parameter to indicate whether the metadata should be compressed in the export dump file so that it consumes less disk space.
If you compress the metadata, it is automatically uncompressed during import.
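
As a sketch combining several of these parameters in one export, plus a second command that only estimates space (schema, file names, and values are illustrative assumptions):

    expdp hr/hr_password DIRECTORY=dp_dir DUMPFILE=hr%U.dmp SCHEMAS=hr PARALLEL=4 SAMPLE=20 COMPRESSION=METADATA_ONLY VERSION=10.2

    expdp hr/hr_password SCHEMAS=hr ESTIMATE_ONLY=YES ESTIMATE=BLOCKS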
In Oracle Database 11g, new features have been added that enable you to:
Compress both data and metadata, only data, only metadata, or no data during an export
Specify additional encryption options in the following areas:
-You can choose to encrypt both data and metadata, only data, only metadata, no data, or only encrypted columns during an export.
-You can specify a specific encryption algorithm to use during an export.
-You can specify the type of security to use for performing encryption and decryption during an export.
For example, perhaps the dump file set will be imported into a different or remote database and it must remain secure in transit.
Or perhaps the dump file set will be imported onsite using the Oracle Encryption Wallet, but it may also need to be imported offsite where the Oracle Encryption Wallet is not available.
Perform table mode exports and imports using the transportable method; specify how partitioned tables should be handled during import operations
Overwrite existing dump files during an export operation
Rename tables during an import operation (see the sketch after this list)
Specify that a data load should proceed even if nondeferred constraint violations are encountered (This is valid only for import operations that use the external tables access method.)
Specify that XMLType columns are to be exported in uncompressed CLOB format regardless of the XMLType storage format that was defined for them
During an export, specify a remap function that takes as a source the original value of the designated column and returns a remapped value that will replace the original value in the dump file
Remap data as it is being imported into a new database
Legacy mode to support the use of original Export (exp) and Import (imp) scripts
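
A minimal sketch of two of these options: REUSE_DUMPFILES overwrites an existing dump file during export, and REMAP_SCHEMA/REMAP_TABLE rename objects during import (all names and values are illustrative assumptions):

    expdp hr/hr_password DIRECTORY=dp_dir DUMPFILE=hr.dmp SCHEMAS=hr REUSE_DUMPFILES=YES

    impdp hr/hr_password DIRECTORY=dp_dir DUMPFILE=hr.dmp REMAP_SCHEMA=hr:hr_test REMAP_TABLE=hr.employees:employees_copy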


Data Pump Export and Import Clients: Overview

Data Pump Export is a utility for unloading data and metadata into a set of operating system files called dump file sets.
Data Pump Import is used to load metadata and data stored in an export dump file set into a target system.
The Data Pump API accesses its files on the server rather than on the client.
These utilities can also be used to export from a remote database directly to a dump file set, or to load the target database directly from a source database with no intervening files.
This is known as network mode. This mode is particularly useful for exporting data from a read-only source database.
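
For instance, a network-mode import pulls rows straight over a database link with no intervening dump file (the link name source_db is an illustrative assumption):

    impdp hr/hr_password NETWORK_LINK=source_db TABLES=employees DIRECTORY=dp_dir LOGFILE=net_imp.log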
At the center of every Data Pump operation is the master table (MT), which is a table created in the schema of the user running the Data Pump job.
The MT maintains all aspects of the job.
The MT is built during a file-based export job and is written to the dump file set as the last step.
Conversely, loading the MT into the current user's schema is the first step of a file-based import operation; the MT is used to sequence the creation of all objects imported.
Note: The MT is the key to Data Pump's restart capability in the event of a planned or unplanned stopping of the job. The MT is dropped when the Data Pump job finishes normally.
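
While a job is running, you can see it (and the name of its master table, which by default matches the job name) in the job views. A quick monitoring sketch:

    SELECT owner_name, job_name, operation, state
    FROM   dba_datapump_jobs;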


SQL*Loader: Overview

SQL*Loader loads data from external files into tables of an Oracle database.
It has a powerful data parsing engine that puts little limitation on the format of the data in the data file.
SQL*Loader uses the following files:
Input data files: SQL*Loader reads data from one or more files (or operating system equivalents of files) that are specified in the control file.
From SQL*Loader's perspective, the data in the data file is organized as records.
A particular data file can be in fixed record format, variable record format, or stream record format.
The record format can be specified in the control file with the INFILE parameter.
If no record format is specified, the default is stream record format.
Control file: The control file is a text file that is written in a language that SQL*Loader understands.
The control file indicates to SQL*Loader where to find the data, how to parse and interpret the data, where to insert the data, and so on.
Although not precisely defined, a control file can be said to have three sections.
The first section contains session-wide information such as the following:
-Global options, such as the input data file name and records to be skipped
-INFILE clauses to specify where the input data is located
-Data to be loaded
The second section consists of one or more INTO TABLE blocks.
Each of these blocks contains information about the table (such as the table name and the columns of the table) into which the data is to be loaded.
The third section is optional and, if present, contains input data.
Log file: When SQL*Loader begins execution, it creates a log file. If it cannot create a log file, execution terminates. The log file contains a detailed summary of the load, including a description of any errors that occurred during the load.
Bad file: The bad file contains records that are rejected, either by SQL*Loader or by the Oracle database. Data file records are rejected by SQL*Loader when the input format is invalid. After a data file record is accepted for processing by SQL*Loader, it is sent to the Oracle database for insertion into a table as a row. If the Oracle database determines that the row is valid, the row is inserted into the table. If the row is determined to be invalid, the record is rejected and SQL*Loader puts it in the bad file.
Discard file: This file is created only when it is needed and only if you have specified that a discard file should be enabled. The discard file contains records that are filtered out of the load because they do not match any record-selection criteria specified in the control file.
For more information about SQL*Loader, see the Oracle Database Utilities guide.
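
A minimal sketch of a control file and the matching sqlldr invocation (the table, its columns, and all file names are illustrative assumptions):

    -- employees.ctl: load comma-separated records, appending to the target table
    LOAD DATA
    INFILE 'employees.dat'
    APPEND
    INTO TABLE employees
    FIELDS TERMINATED BY ','
    (employee_id, first_name, last_name)

    sqlldr hr/hr_password CONTROL=employees.ctl LOG=employees.log BAD=employees.bad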



External Table Population with ORACLE_DATAPUMP
This example shows you how the external table population operation can help to export a selective set of records resulting from the join of the EMPLOYEES and DEPARTMENTS tables.
Because the external table can be large, you can use a parallel populate operation to unload your data to an external table. As opposed to a parallel query from an external table, the degree of parallelism of a parallel populate operation is constrained by the number of concurrent files that can be written to by the access driver. There is never more than one parallel execution server writing into one file at a particular point in time.
The number of files in the LOCATION clause must match the specified degree of parallelism because each input/output (I/O) server process requires its own file. Any extra files that are specified are ignored. If there are not enough files for the specified degree of parallelism, the degree of parallelization is lowered to match the number of files in the LOCATION clause.
The external table is read-only after it has been populated. The SELECT command can be very complex, allowing specific information to be populated in the external table. The external table, having the same file structure as binary data pump files, can then be migrated to another system, and imported with the impdp utility or read as an external table.
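
A hedged reconstruction of the statement this example describes (the directory dp_dir and the dump file names are illustrative assumptions; the two LOCATION files match the degree PARALLEL 2):

    -- Unload the join result into two ORACLE_DATAPUMP files, one per I/O server
    CREATE TABLE emp_dept_ext
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_DATAPUMP
      DEFAULT DIRECTORY dp_dir
      LOCATION ('emp_dept_1.dmp', 'emp_dept_2.dmp'))
    PARALLEL 2
    AS
    SELECT e.employee_id, e.last_name, d.department_name
    FROM   employees e
    JOIN   departments d ON e.department_id = d.department_id;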
Note: For more information about the ORACLE_DATAPUMP access driver parameters, see the Oracle Database Utilities 11g Release 2 (11.2) guide.