ITPub博客

首页 > Linux操作系统 > Linux操作系统 > Export/Import and NLS considerations

Export/Import and NLS considerations

原创 Linux操作系统 作者:xhailiang 时间:2007-02-02 00:00:00 0 删除 编辑
Export/Import and NLS Considerations
Doc ID: Note:15095.1Type: FAQ
Last Revision Date: 07-JUL-2006Status: PUBLISHED

Introduction
------------
This note provides an in-depth overview of exp/imp and NLS, you may not need to
know all this if you just want to take a export and make sure you don't lose any
data. Please read this first: 
Note 227332.1 NLS considerations in Import/Export - Frequently Asked Questions


Export/Import and NLS considerations
------------------------------------
If you have exported/imported a database or table(s) and are now encountering
character set conversion problems, use the following information to confirm 
whether the export/import procedure was performed correctly.

+ In Oracle9 and higher, most data is automaticaly exported in the character set
  of the database that is exported. The only exception to that, is that 
  data-dictionary information is exported in the character set that is set in 
  the NLS_LANG environment variable when making the export.
  In pre-Oracle9 export, ALL data is exported in the character set that is set
  in the NLS_LANG environment variable when making the export.
  This means that if the character set part of the NLS_LANG environment variable
  during export is different than the database character set, there will be a 
  automatic conversion between those (for 9i and higher only for data-dictionary
  data since "normal" data is exported in the database character set, regardless
  of the NLS_LANG, as described above).

+ The export file contains the character set that the data was exported in, in
  the first few bytes of the export file. Export stores the character set ID 
  (not the text string) of the "exporting character set". See Note 48644.1.  
  This is relevant in pre-Oracle9 export files because all data will be exported
  in the same character set and this way we can "see" what that character set 
  is, and we can even "update" these fields to work around certain problems 
  (only when advised by Oracle Support!). 
  For Oracle9 and higher export files this is less relevant because the 
  character set is stored in many more places in the export file, and can be 
  different depending on the data. So for Oracle9 and higher we can never 
  "update" the export file manualy.

+ An import session runs in the character set that is set in the NLS_LANG 
  environment variable for import session.

+ Import will check the character set of the data in the dump file and compare 
  it with the session's character set as defined in NLS_LANG.

+ No conversion occurs if the export data character set and the import session 
  character set are the same. If they are not the same, conversion is performed
  from the export data character set to the import session character set prior 
  to the data being inserted into the database. 
  However, import can only perform this conversion for single-byte character 
  sets !

+ For imports into multibyte character set databases (i.e. UTF8) the character 
  set part of NLS_LANG should be set identical to that of the export session. 
  Otherwise an 
  IMP-16 "Required character set conversion (type %lu to %lu) not supported" 
  error will come up during import. See also Bug 896407.
  
+ The import session character set should be a the same, or a superset of the 
  export data character set, otherwise special characters will not be correctly
  converted.

+ If you import pre-oracle7 export files include the parameter 'CHARSET' when 
  defining the import parameter set. CHARSET identifies the character set of the
  export file. The CHARSET option was developed to import older export files 
  which did not have stored character set ID information.
  If you define CHARSET but the export file does contain character set 
  information (export version 7 and higher) then the value in CHARSET should
  match the export file character set. If they do not match, IMP-42 will 
  result (so it's better not to use this parameter at all, unless you import
  pre-oracle7 data).

+ After the data has been converted to the import session character set, it is 
  then converted to the database character set if they differ. The database 
  character set should be a superset (or the same) of the import's session 
  character set otherwise special characters will not be correctly converted.

So this means that there is a potential for 3 seperate conversions when moving
data from between databases, as described in the following graph:

   -------------------                       for all data pre-oracle9 and data-
   | db in character |     export            dictionary data in oracle9 and up:
   | set A           |---------------------> exp session is in character
   -------------------                       set B as defined by NLS_LANG.
        source                               Therefore the dump file is in
                                             character set B. Character set
                                             conversion may occur.
                                                       |
                                                       |
                                                       | move file over to
                                                       | another machine
                                                       |
                                                       V
        destination
   -------------------
   | db in character |     import            imp session is in character
   | set D           | <-------------------  set C as defined by NLS_LANG.
   -------------------                       The dump file is still in
                                             character set B. Character set
                                             conversion may occur.
   During the import process
   character set conversion
   may occur between character
   set C and the db's character
   set D if they differ.


If you have any problems related to character set conversion in export/import 
then you need to identify the following:
-- What is (was) the database character set in the source database (character 
   set A in the above)?
-- What was the client character set specified in NLS_LANG when the data was 
   exported (character set B in the above)?
-- What was the client character set specified in NLS_LANG when the data was 
   imported (character set C in the above) ?
-- What is the database character set of the destination database (character 
   set D in the above)?


Minimizing character set conversions during export/import
---------------------------------------------------------
As described, it is important to note import will do up to 3 character set 
conversions depending on:

(a) character set of exported database 
(b) NLS_LANG of export session (-> this is the "export file character set")
(c) NLS_LANG of import session 
(d) character set of taget database.

Obviously there really is only the need for 1 conversion only (from original
database character set to target database character set). Minimizing the number
of conversions means that you minimize the potential of "loosing" data. 

To minimize the number of conversions you are advised to follow these rules:
+ Set the NLS_LANG during export to the same as the character set of the 
exported database -> this means no conversion takes place, all data is still 
stored in the export file as it was stored in the database.
+ Set the NLS_LANG during import to the same value as during the export -> this
means that no conversion takes place in the import session.
+ If the character set of the target database is different, then the data will
be automatically converted when import inserts the data into the database, you 
do not have to "set" anything for this, it's automatic.

Of course you do not have to follow these rules, but complying with these rules
does minimize the risk of loosing data during export/import.


Regarding the National Characterset:
------------------------------------
NCLOBs are always exported/imported in UCS-2/AL16UTF16. 
NCHAR/NVARCHAR2s are always exported in the database's national character set.
This is something you can't influence by setting any parameters.


Further reading
---------------
usefull notes:
Note 158577.1 NLS_LANG Explained (How does Client-Server Character Conversion Work?)
Note 66320.1 Changing the Database Character Set or the Database National Character Set

For further NLS / Globalization information you may start here:
Note 267942.1 - Globalization Technology (NLS) Knowledge Browser Product Page

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/308563/viewspace-171969/,如需转载,请注明出处,否则将追究法律责任。

下一篇: LogExplore简介
请登录后发表评论 登录
全部评论

注册时间:2008-02-17

  • 博文量
    270
  • 访问量
    425188