ITPub博客

首页 > Linux操作系统 > Linux操作系统 > ORA-600(17069)错误的解决过程

ORA-600(17069)错误的解决过程

原创 Linux操作系统 作者:yangtingkun 时间:2009-02-20 23:51:23 0 删除 编辑

今天在一个报表数据库后台发现了这个错误。简单描述一下问题的解决过程。

 

 

详细的错误信息为:

Fri Feb 20 08:16:44 2009
Errors in file /u1/oracle/admin/repdb01/bdump/repdb01_j015_5099.trc:
ORA-00600: internal error code, arguments: [17069], [0x6A5DEE1E0], [], [], [], [], [], []
Fri Feb 20 08:16:47 2009
Errors in file /u1/oracle/admin/repdb01/bdump/repdb01_j015_5099.trc:
ORA-00600: internal error code, arguments: [17069], [0x6A5DEE1E0], [], [], [], [], [], []

进一步检查对应的trace文件:

bash-2.03$ more /u1/oracle/admin/repdb01/bdump/repdb01_j015_5099.trc
/u1/oracle/admin/repdb01/bdump/repdb01_j015_5099.trc
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /data/oracle/product/920
System name:    SunOS
Node name:      newreport
Release:        5.8
Version:        Generic_117350-26
Machine:        sun4u
Instance name: repdb01
Redo thread mounted by this instance: 1
Oracle process number: 35
Unix process pid: 5099, image: oracle@newreport (J015)

*** SESSION ID:(12.28191) 2009-02-20 08:16:44.060
*** 2009-02-20 08:16:44.060
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [17069], [0x6A5DEE1E0], [], [], [], [], [], []
Current SQL statement for this session:
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;  broken BOOLEAN := FALSE; BEGIN P_GENERATE_REPDATA('FR20T000002000000
0000032'); :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END;
----- Call Stack Trace -----
calling              call     entry                argument values in hex     
location             type     point                (? means dubious value)    
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()+0           FFFFFFFF7FFF6430 ?
                                                   000000000 ? 000000000 ?
                                                   00000003E ?
                                                   FFFFFFFF7FFF6CC8 ?
                                                   1031D56C8 ?
kgeriv()+208         PTR_CALL 0000000000000000     000000000 ? 000103400 ?
                                                   0001035D9 ? 000102C00 ?
                                                   1035D9000 ? 1035D9C28 ?
kgesiv()+108         CALL     kgeriv()+0           1035D9E88 ? 1036C7148 ?
                                                   000000258 ? 0000013C8 ?
                                                   FFFFFFFF7FFF7608 ?
                                                   1035DB258 ?
kgesic1()+32         CALL     kgesiv()+0           1035D9E88 ? 1036C7148 ?
                                                   0000042AD ? 000000001 ?
                                                   FFFFFFFF7FFF7608 ?
                                                   004000000 ?
kglgob()+1972        CALL     kgesic1()+0          1035D9E88 ? 1036C7148 ?
                                                   0000042AD ? 000000002 ?
                                                   6A5DEE1E0 ? 0000010A0 ?
kgldpo()+524         CALL     kglgob()+0           000000000 ? 000000000 ?
                                                   6A5DEE1E0 ?
                                                   FFFFFFFF7FFF77A8 ?
                                                   000080000 ? 000000010 ?
kgldon()+248         CALL     kgldpo()+0           000000000 ? 000000000 ?
                                                   69A79DA60 ? 000000001 ?
                                                   000000002 ?
                                                   FFFFFFFF7FFF7BEE ?
pkldon()+108         CALL     kgldon()+0           1035D9E88 ?
                                                   FFFFFFFF7FFF7DE0 ?
                                                   69A79DA60 ? 000000001 ?
                                                   000000000 ?
                                                   FFFFFFFF7FFF7D8E ?
pkloud()+204         CALL     pkldon()+0           FFFFFFFF7FFFA1A0 ?
                                                   FFFFFFFF7FFF7DE0 ?
                                                   69A79DA60 ? 000000001 ?
                                                   000000000 ?
                                                   FFFFFFFF7FFF7D8E ?
phnnrl_name_resolve  CALL     pkloud()+0           1033FCA90 ?
_by_loading()+280                                  FFFFFFFF7FFF7E3C ?
                                                   000000000 ? 000000000 ?
                                                   000030000 ? 6A475CE18 ?
phngdl_get_defining  CALL     phnnrl_name_resolve  000000000 ? 000020015 ?
_libunit()+124                _by_loading()+0      FFFFFFFF7FFF9830 ?
                                                   FFFFFFFF7FFF8160 ?
                                                   000020015 ? 000000000 ?
phnrpls_resolve_pre  CALL     phngdl_get_defining  FFFFFFFF7FFF9830 ?
fix_libscope()+12             _libunit()+0         FFFFFFFF7FFF85A8 ?
                                                   FFFFFFFF7FFF843C ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?

无论是从trace文件对应的名称,还是从trace文件中对应的语句都可以确定,引起问题的是一个JOB

检查metalinkOracle在文档Doc ID:  39616.1中对这个错误的已知bug,进行了汇总,不过这些bug的描述似乎没有和当前十分相符的。

查看文档的描述,发现ORA-600错误的第二个参数,这里是0x6A5DEE1E0,代表Library Cache Object Handle

看来问题可能和LATCH有关。

但是根据信息在V$LATCHV$LATCH_CHILDREN视图中,没有找到有价值的信息。

这个JOB由于失败会自动再次执行,检查JOB运行时的V$LOCK信息:

SQL> SELECT ADDR, TYPE, ID1, ID2, LMODE, REQUEST, BLOCK
  2  FROM V$LOCK
  3  WHERE SID = 75;

ADDR             TY        ID1        ID2      LMODE    REQUEST      BLOCK
---------------- -- ---------- ---------- ---------- ---------- ----------
0000000690342780 CU -1.703E+09          6          6          0          0
00000006903426F8 JQ          0         63          6          0          0

V$LOCK中看不到什么特别有价值的信息,接着检查V$SESSION_WAIT,看看这个JOB在等待什么:

SQL> SELECT EVENT, P1TEXT, P1RAW, P2TEXT, P2RAW, STATE
  2  FROM V$SESSION_WAIT  
  3  WHERE SID = 75;

EVENT             P1TEXT          P1RAW            P2TEXT       P2RAW            STATE
----------------- --------------- ---------------- ------------ ---------------- -------
library cache pin handle address  00000006A5DEE1E0 pin address  00000006B1A971A8 WAITING

这次的信息就明显了,ORA-600错误的第二个参数就是V$SESSION_WAIT视图的P1RAW的值,而且从等待事件上也可以看到,问题就是出现在LIBRARY CACHE PIN的过程中。

重新查看METALINK的信息,这个错误可能发生在一个长时间运行的进程,在其运行过程中,所依赖的对象被编译或者删除了。

检查JOB调用的过程的状态:

SQL> SELECT OWNER, OBJECT_NAME, OBJECT_TYPE, STATUS
  2  FROM DBA_OBJECTS
  3  WHERE WNER = 'FUJIANREP'
  4  AND OBJECT_NAME = 'P_GENERATE_REPDATA';

OWNER                          OBJECT_NAME                    OBJECT_TYPE        STATUS
------------------------------ ------------------------------ ------------------ -------
FUJIANREP                      P_GENERATE_REPDATA             PROCEDURE          INVALID

果然问题过程处于不正常的状态。

JOB至于BROKEN状态,避免JOB再次运行:

SQL> EXEC DBMS_JOB.BROKEN(63, TRUE)

PL/SQL procedure successfully completed.

SQL> COMMIT;

Commit complete.

杀掉JOB对应的PROCESS

SQL> SELECT SPID FROM V$PROCESS WHERE ADDR IN (SELECT PADDR FROM V$SESSION WHERE SID = 75);

SPID
------------
14927

SQL> HOST kill -9 14927

下面用重新编译该过程:

SQL> ALTER PROCEDURE P_GENERATE_REPDATA COMPILE;
ALTER PROCEDURE P_GENERATE_REPDATA COMPILE
*
ERROR at line 1:
ORA-04021: timeout occurred while waiting to lock object FUJIANREP.P_GENERATE_REPDATA

由于从V$LOCKV$LATCH无法得到信息,只能看看有没有其他人当前在访问P_GENERATE_REPDATA所依赖的对象:

SQL> SELECT * FROM V$ACCESS
  2  WHERE (OWNER, OBJECT) IN
  3  (SELECT REFERENCED_OWNER, REFERENCED_NAME
  4  FROM DBA_DEPENDENCIES
  5  WHERE WNER = 'FUJIANREP'
  6  AND NAME = 'P_GENERATE_REPDATA');

       SID OWNER                          OBJECT                         TYPE
---------- ------------------------------ ------------------------------ ------------------------
        54 FUJIANREP                      CAT_BUYER                      SYNONYM
        54 FUJIANREP                      CAT_CATEGORY                   SYNONYM
        54 FUJIANREP                      CAT_DOSEAGE_FORM               SYNONYM
        54 FUJIANREP                      CAT_DRUG                       SYNONYM
        54 FUJIANREP                      CAT_ENTERPRISE                 SYNONYM
        54 FUJIANREP                      CAT_METRIC                     SYNONYM
        54 FUJIANREP                      CAT_ORG                        SYNONYM
        54 FUJIANREP                      CAT_PRODUCT                    SYNONYM
        54 FUJIANREP                      CAT_QUALITY_DEFINE             SYNONYM
        54 FUJIANREP                      GOV_CAT_BUYER                  TABLE
        54 FUJIANREP                      GOV_CAT_ENTERPRISE             TABLE
        54 FUJIANREP                      GOV_S_MO_BU                    TABLE
        54 FUJIANREP                      GOV_S_MO_BU_EN                 TABLE
        54 FUJIANREP                      GOV_S_MO_BU_PR                 TABLE
        54 FUJIANREP                      GOV_S_MO_EN                    TABLE
        54 FUJIANREP                      GOV_S_MO_ME                    TABLE
        54 FUJIANREP                      GOV_S_MO_ME_CA                 TABLE
        54 FUJIANREP                      GOV_S_MO_ME_PR                 TABLE
        54 FUJIANREP                      GOV_S_MO_ORDER                 TABLE
        54 FUJIANREP                      GOV_S_YE_ORDER                 TABLE
        54 FUJIANREP                      GRP_HOSPITAL                   TABLE
        54 FUJIANREP                      GRP_LEVEL                      TABLE
        54 FUJIANREP                      ORD_ORDER                      TABLE
        54 FUJIANREP                      ORD_ORDER_ITEM                 TABLE
        54 FUJIANREP                      ORD_ORDER_ITEM_REP             CURSOR
        54 FUJIANREP                      ORD_ORDER_RECEIVE              TABLE
        54 FUJIANREP                      ORD_ORDER_RECEIVE_REP          SYNONYM
        54 FUJIANREP                      ORD_ORDER_REP                  CURSOR
        54 FUJIANREP                      ORD_ORDER_RETURN               TABLE
        54 FUJIANREP                      ORD_ORDER_RETURN_REP           CURSOR
        54 FUJIANREP                      PLT_PLAT                       CURSOR
        54 FUJIANREP                      USER_TAB_PARTITIONS            CURSOR
        54 NDMAIN                         CAT_BUYER                      TABLE
        54 NDMAIN                         CAT_CATEGORY                   TABLE
        54 NDMAIN                         CAT_DOSEAGE_FORM               TABLE
        54 NDMAIN                         CAT_DRUG                       TABLE
        54 NDMAIN                         CAT_ENTERPRISE                 TABLE
        54 NDMAIN                         CAT_METRIC                     TABLE
        54 NDMAIN                         CAT_ORG                        TABLE
        54 NDMAIN                         CAT_PRODUCT                    TABLE
        54 NDMAIN                         CAT_QUALITY_DEFINE             TABLE
        54 NDMAIN                         ORD_ORDER                      VIEW
        54 NDMAIN                         ORD_ORDER_ITEM                 VIEW
        54 NDMAIN                         ORD_ORDER_RECEIVE              VIEW
        54 NDMAIN                         ORD_ORDER_RETURN               VIEW
        54 NDMAIN                         PLT_PLAT                       TABLE
        54 PUBLIC                         USER_TAB_PARTITIONS            SYNONYM
        54 SYS                            STANDARD                       PACKAGE
       145 SYS                            STANDARD                       PACKAGE
        54 SYS                            SYS_STUB_FOR_PURITY_ANALYSIS   PACKAGE
        54 SYS                            USER_TAB_PARTITIONS            VIEW

51 rows selected.

对象果然被其他人所访问,看看这个会话在做什么:

SQL> SELECT SID, SERIAL#, USERNAME, PROGRAM, TERMINAL
  2  FROM V$SESSION
  3  WHERE SID = 54;

       SID    SERIAL# USERNAME                       PROGRAM      TERMINAL
---------- ---------- ------------------------------ ------------ ----------
        54      26216 FUJIANREP                      PlSqlDev.exe LIBY

没想到是同事的连接的会话,看看他在干什么:

SQL> SELECT SQL_TEXT FROM V$SQL
  2  WHERE ADDRESS IN
  3  (SELECT SQL_ADDRESS FROM V$SESSION
  4  WHERE SID = 54);

SQL_TEXT
---------------------------------------------------------------------
ALTER TABLE GOV_S_MO_EN TRUNCATE PARTITION P200901

居然是TRUNCATE分区操作,难怪会导致过程处于INVALID状态,不过这个操作应该不会持续很长时间的,难道这个操作一直没有完成吗:

SQL> SELECT EVENT, P1TEXT, P1, P2TEXT, P2, P3TEXT, P3, SECONDS_IN_WAIT
  2  FROM V$SESSION_WAIT WHERE SID = 54;

EVENT                     P1TEXT    P1 P2TEXT         P2 P3TEXT     P3 SECONDS_IN_WAIT
------------------------- ------- ---- -------- -------- -------- ---- ---------------
db file sequential read   file#      1 block#     170158 blocks      1         3995643

这个等待已经发生了几十天了,显然这是一个僵死的会话。

从后台kill掉对应的进程:

SQL> SELECT SPID FROM V$PROCESS
  2  WHERE ADDR IN (SELECT PADDR FROM V$SESSION WHERE SID = 54);

SPID
------------
12974

SQL> HOST kill -9 12974

切换为FUJIANREP用户,再次编译过程:

SQL> ALTER PROCEDURE P_GENERATE_REPDATA COMPILE;

Procedure altered.

至此,问题解决。将JOB重新设置BROKEN即可。

SQL> EXEC DBMS_JOB.BROKEN(63, FALSE)

PL/SQL procedure successfully completed.

SQL> COMMIT;

Commit complete.

 

 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-557248/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
暂无介绍

注册时间:2007-12-29

  • 博文量
    1955
  • 访问量
    10455284