ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 10g ORACLE_HOME空间满导致SYSAUX表空间脱机

10g ORACLE_HOME空间满导致SYSAUX表空间脱机

原创 Linux操作系统 作者:yangtingkun 时间:2010-12-08 23:57:30 0 删除 编辑

客户服务器上数据库的SYSAUX表空间自动脱机,帮客户诊断导致问题的原因。

 

 

问题是SYSAUX表空间自动脱机并需要恢复,有归档的存在,表空间的恢复很简单,需要找出是什么问题导致了表空间脱机。

Wed Dec 01 16:49:59 2010
LGWR: Standby redo logfile selected for thread 1 sequence 66443 for destination LOG_ARCHIVE_DEST_2
Thread 1 advanced to log sequence 66443
  Current log# 6 seq# 66443 mem# 0: +DG00/ora00/onlinelog/group_6.5108.697636779
Wed Dec 01 17:14:11 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_ora_4262.trc:
ORA-03137: TTC
协议内部错误: [12333] [5] [113] [105] [] [] [] []
Wed Dec 01 17:14:12 2010
Trace dumping is performing id=[cdmp_20101201171412]
Wed Dec 01 17:27:39 2010
LGWR: Standby redo logfile selected for thread 1 sequence 66444 for destination LOG_ARCHIVE_DEST_2
Thread 1 advanced to log sequence 66444
  Current log# 2 seq# 66444 mem# 0: +DG00/ora00/onlinelog/group_2.262.696608025
Wed Dec 01 18:01:15 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_ora_5879.trc  (incident=409507):
ORA-00600:
内部错误代码, 参数: [kkocxj : pjpCtx], [], [], [], [], [], [], []
Incident details in: /db/oracle/app/oracle/diag/rdbms/ora00/ora001/incident/incdir_409507/ora001_ora_5879_i409507.trc
Wed Dec 01 18:01:38 2010
Trace dumping is performing id=[cdmp_20101201180138]
Wed Dec 01 18:01:40 2010
Sweep Incident[409507]: completed
Wed Dec 01 18:02:07 2010
Trace dumping is performing id=[cdmp_20101201180100]
Wed Dec 01 18:21:00 2010
LGWR: Standby redo logfile selected for thread 1 sequence 66445 for destination LOG_ARCHIVE_DEST_2
Thread 1 advanced to log sequence 66445
  Current log# 1 seq# 66445 mem# 0: +DG00/ora00/onlinelog/group_1.261.696608025
Wed Dec 01 20:04:34 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012:
自动执行作业 29 出错
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 53
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 1157
ORA-06512:
line 1
Wed Dec 01 20:04:41 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q000_22441.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:05:12 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q003_27619.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:05:34 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012:
自动执行作业 29 出错
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 53
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 1157
ORA-06512:
line 1
Wed Dec 01 20:05:44 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q002_22446.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:06:14 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012: error on auto execute of job 4002
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512: at "FLOWS_030000.WWV_FLOW_PLATFORM", line 9
ORA-06512: at line 1
Wed Dec 01 20:06:14 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q001_4013.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:06:34 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012:
自动执行作业 29 出错
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 53
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 1157
ORA-06512:
line 1
Wed Dec 01 20:06:45 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q000_6265.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:07:16 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q003_6575.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:07:34 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012:
自动执行作业 29 出错
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 53
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 1157
ORA-06512:
line 1
Wed Dec 01 20:07:48 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q002_8014.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:08:14 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012: error on auto execute of job 4002
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512: at "FLOWS_030000.WWV_FLOW_PLATFORM", line 9
ORA-06512: at line 1
Wed Dec 01 20:08:18 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_q001_9342.trc:
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
Wed Dec 01 20:08:34 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_j000_2866.trc:
ORA-12012:
自动执行作业 29 出错
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 53
ORA-06512:
"SYSMAN.EMD_MAINTENANCE", line 1157
ORA-06512:
line 1

后面类似的错误还有很多,不过后面的所有错误都是表现,导致错误的原因是数据文件2脱机造成的。而在错误第一次发生之前,存在一个ORA-03137和一个ORA-00600[kkocxj : pjpCtx]错误。

这两个错误都是Oraclebug,但是显然都不是造成表空间脱机的主要原因。

我不是直接从客户处获取信息,因此所有的信息都来自日志文件,虽然获取的日志文件只有最后两三天的,但是从中已经足够找到一些有用的内容了。

从上面贴出的部分内容就可以看到,这个数据库配置了STANDBY环境,从归档的路径可以看出,这个数据库使用了ASM磁盘组。

一般来说,使用ASM的数据库99%RAC环境,确实ASM支持单实例环境,但是正式环境没有人会为单实例数据库去部署ASM

而且在错误日志中还看到这样的信息:

Wed Dec 01 15:12:05 2010
Global Enqueue Services Deadlock detected. More info in file
 /db/oracle/app/oracle/diag/rdbms/ora00/ora001/trace/ora001_lmd0_21641.trc.

显然这是一个RAC环境数据库,那么导致问题的多半是其他的实例,联系客户获得实例2上的告警日志,导致错误的原因这次一目了然了:

Wed Dec 01 17:26:26 2010
LGWR: Standby redo logfile selected for thread 2 sequence 25580 for destination LOG_ARCHIVE_DEST_2
Thread 2 advanced to log sequence 25580
  Current log# 15 seq# 25580 mem# 0: +DG00/ora00/onlinelog/group_15.774.729989389
Wed Dec 01 18:00:32 2010
Trace dumping is performing id=[cdmp_20101201180138]
Wed Dec 01 18:00:40 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora002/trace/ora002_ora_28318.trc  (incident=217991):
ORA-00600:
内部错误代码, 参数: [kkocxj : pjpCtx], [], [], [], [], [], [], []
Incident details in: /db/oracle/app/oracle/diag/rdbms/ora00/ora002/incident/incdir_217991/ora002_ora_28318_i217991.trc
Wed Dec 01 18:01:00 2010
Trace dumping is performing id=[cdmp_20101201180100]
Wed Dec 01 18:01:01 2010
Sweep Incident[217991]: completed
Wed Dec 01 18:42:10 2010
Thread 2 advanced to log sequence 25581
  Current log# 16 seq# 25581 mem# 0: +DG00/ora00/onlinelog/group_16.775.729989389
Wed Dec 01 19:15:51 2010
Non critical error ORA-00001 caught while writing to trace file "/db/oracle/app/oracle/diag/rdbms/ora00/ora002/trace/ora002_dia0_1187.trc"
Error message: SVR4 Error: 28: No space left on device
Additional information: 1
Writing to the above trace file is disabled for now on...
Wed Dec 01 19:59:15 2010
Non critical error ORA-00001 caught while writing to trace file "/db/oracle/app/oracle/diag/rdbms/ora00/ora002/trace/ora002_m000_23401.trc"
Error message: SVR4 Error: 28: No space left on device
Additional information: 1
Writing to the above trace file is disabled for now on...
Wed Dec 01 20:03:17 2010
Non critical error ORA-00001 caught while writing to trace file "/db/oracle/app/oracle/diag/rdbms/ora00/ora002/trace/ora002_dbw0_1221.trc"
Error message: SVR4 Error: 28: No space left on device
Additional information: 1
Writing to the above trace file is disabled for now on...
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora002/trace/ora002_dbw0_1221.trc:
ORA-01148:
无法刷新数据文件 2 的文件大小
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'
ORA-09817: Write to audit file failed.
SVR4 Error: 28: No space left on device
Automatic datafile offline due to media error on
file 2: +DG00/ora00/datafile/sysaux.257.696607929
Wed Dec 01 20:08:10 2010
Non critical error ORA-0Thu Dec 02 08:59:07 2010
Errors in file /db/oracle/app/oracle/diag/rdbms/ora00/ora002/trace/ora002_smon_1235.trc:
ORA-00604:
递归 SQL 级别 1 出现错误
ORA-00376:
此时无法读取文件 2
ORA-01110:
数据文件 2: '+DG00/ora00/datafile/sysaux.257.696607929'

导致错误的原因很简单,节点2上的ORACLE_BASE/ORACLE_HOME目录没有空间了。在错误的最开始也产生了一个ORA-600错误,同样这个错误不是导致问题的真正原因,但是这是诱因,由于这个ORA-600产生的trace文件,使得ORACLE_BASE目录没有空闲空间,当随后的内部事件发生,Oracle将事件记录到SYSAUX表空间的时候,发现出现了问题,由于目录已经没有空间,Oracle无法向ORACLE_BASE写入信息,使得写SYSAUX表空间的内部事务无法正确完成,于是Oracle将数据文件置于了OFFLINE状态。

这个文件本身并没有任何的问题,而导致问题的原因仅仅是ORACLE_BASE磁盘空间不足,而不是数据文件所在的ASM磁盘组空间不足,Oraclemetalink文档ID 759486.1描述了这个问题。

 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-681316/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
暂无介绍

注册时间:2007-12-29

  • 博文量
    1955
  • 访问量
    10454140