ITPub博客

首页 > Linux操作系统 > Linux操作系统 > oracle的字符集检查工具CSSCAN(一)

oracle的字符集检查工具CSSCAN(一)

原创 Linux操作系统 作者:杨奇龙 时间:2011-03-14 13:30:20 0 删除 编辑
使用CSSCAN 工具在源库检查数据。
指定扫描的schame,只能指定一个schame
$ csscan \"sys/@ as sysdba\" LOG=/tmp/expcheck USER=scott CAPTURE=Y TOCHAR=AL32UTF8 ARRAY=1024000 PROCESS=3
指定扫描的表,在unix环境下当指定多个表时,' 是必须的,否则会报错:0403-057 Syntax error at line 21 : `(' is not expected. ):
$ csscan \"sys/@ as sysdba\" LOG=/tmp/expcheck TABLE='(SCOTT.DEPT,SCOTT.EMP)' CAPTURE=Y TOCHAR=AL32UTF8 ARRAY=1024000 PROCESS=2
在windows环境下,' 可以省略!
c:\>csscan \"sys/@ as sysdba\" LOG=c:\temp\expcheck TABLE=(SCOTT.DEPT,SCOTT.EMP) CAPTURE=Y TOCHAR=AL32UTF8 ARRAY=1024000 PROCESS=2
如果你有很多表要扫描,或许要避免不同os环境下语法的麻烦。就可以使用参数文件。
Example contents of csscan.par:
LOG=/tmp/expcheck
TABLE=(SCOTT.DEPT,SCOTT.EMP)
FULL=N
CAPTURE=Y
TOCHAR=AL32UTF8
  PROCESS=6
ARRAY=1024000

使用参数文件执行:
$ csscan \"sys/@ as sysdba\" PARFILE=/tmp/csscan.par
or
C:\>csscan \"sys/@ as sysdba\" PARFILE=c:\temp\csscan.par

Csscan will create 3 files :
expcheck.out csscan 输出的日志 a log of the output of csscan
expcheck.txt 数据库扫描概要the Database Scan Summary Report
expcheck.err 扫描时出错的记录,包含了出错的表,字段 (contains the rowid's of the Convertible , Truncation and Lossy rows reported in expcheck.txt)
下面是一些注意点:
You can only specify on user at the time, if you are exporting the majority of a database then use FULL=Y
$ csscan \"sys/@ as sysdba\" FULL=Y TOCHAR=AL32UTF8 LOG=expcheck CAPTURE=Y ARRAY=1000000 PROCESS=2
* Always run Csscan connecting with a 'sysdba' connection/user, do not use the "system" or "csmig" user.
* The PROCESS= parameter influences the load on your system, the higher this is (6 or 8 for example) the faster Csscan will be done, the lower this is the less impact it will have on your system. Adapt if needed.
* Do not specify the TONCHAR or FROMNCHAR csscan parameters , those are to change the NLS_NCHAR_CHARACTERSET. Again they are not needed and should not be specified.
* The csscan SUPPRESS parameter limits the size of the .err file by limiting the amount of information logged / table. Using SUPPRESS=1000 will log max 1000 rows for each table in the .err file. It will not affect the information in the .txt file. It WILL affect the data logged in the .err file. This is mainly useful for the first scan of big databases, if you have no idea how much "Convertible" or "Lossy" there is in a database then this will avoid that the .err file becomes 100's of MB big and it limits also the space used by the csscan tables under the Csmig schema.
note that to have correct result of the following select you should NOT use the Csscan  SUPPRESS option.

Once Csscan has been run you then need to check the .txt file

If there is any "Lossy" data - this is data that CANNOT be converted to the new NLS_CHARACTERSET
- you need to check further and see why this data is "Lossy", discussing "Lossy" is outside the scope of this note。
This select will give all the lossy objects found in the last Cssan run:
conn / AS sysdba
SELECT DISTINCT z.owner_name
|| '.'
|| z.table_name
|| '('
|| z.column_name
|| ') - '
|| z.column_type
|| ' ' LossyColumns
FROM csmig.csmv$errors z
WHERE z.error_type ='DATA_LOSS'
ORDER BY LossyColumns
/


For solving the ORA-01401 / ORA-12899 error you need to know which columns are logged as "Truncation" and what the new size of the data will be after import.

You can find that in the expcheck.err file as "Max Post Conversion Data Size"
For example, check in the expcheck.txt file wich table has "Truncation", let's assume you have there a row that say's:

-- snip from expcheck.txt
[Distribution of Convertible, Truncated and Lossy Data by Table]

USER.TABLE Convertible Truncation Lossy
--------------------- ---------------- ---------------- ----------------
...
SCOTT.TESTUTF8 69 6 0
...

then look in the expcheck.err file for "TESTUTF8" until the "Max Post Conversion Data Size" is bigger then the column size for that table.

-- snip from expcheck.err
User : SCOTT
Table : TESTUTF8
Column: ITEM_NAME
Type : VARCHAR2(80)
Number of Exceptions : 6
Max Post Conversion Data Size: 81
the max size after going to AL32UTF8 will be 81 bytes for this column.

Or you can use this select to have a list of the columns that have "Truncation" and the new size in bytes that is minimally needed:

conn / AS sysdba
SET serveroutput ON
DECLARE
newmaxsz NUMBER;
BEGIN
FOR rec IN
( SELECT DISTINCT u.owner_name,
u.table_name,
u.column_name ,
u.column_type,
u.owner_id,
u.table_id,
u.column_id,
u.column_intid
FROM csmv$errors u
WHERE u.error_type='EXCEED_SIZE'
ORDER BY u.owner_name,
u.table_name,
u.column_name
)
LOOP
SELECT MAX(cnvsize)
INTO newmaxsz
FROM csm$errors
WHERE usr# =rec.owner_id
AND obj# =rec.table_id
AND col# =rec.column_id
AND intcol#=rec.column_intid;

DBMS_OUTPUT.PUT_LINE(rec.owner_name ||'.'|| rec.table_name||' ('|| rec.column_name ||') - '|| rec.column_type ||' - '|| newmaxsz || ' Bytes');
END LOOP;
END;
/

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/22664653/viewspace-689381/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
MySQL DBA NoSQL DEVOPS

注册时间:2009-10-07

  • 博文量
    1026
  • 访问量
    7701922