ITPub博客

首页 > Linux操作系统 > Linux操作系统 > OGG-01031错误处理一例

OGG-01031错误处理一例

原创 Linux操作系统 作者:skuary 时间:2012-05-24 10:49:54 0 删除 编辑
刚刚观察测试库gg运行状况发现,pump进程异常abend,err log报错日志如下:
 
2012-05-24 10:45:54  ERROR   OGG-01031  Oracle GoldenGate Capture for Oracle, dpump.prm:  There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Reply received is Unable to open file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable)).
2012-05-24 10:45:54  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, dpump.prm:  PROCESS ABENDING.

 
该错误官方说明如下:
 
OGG-01031: There is a problem in network communication, a remote file problem,
encryption keys for target and source do not match (if using ENCRYPT) or an
unknown error. (Reply received is {0})
Cause: Extract was not able to send data to the target.
Action: Examine the network for an outage between the source and target system.
Make certain that MGRPORT in RMTHOST matches the one in the parameter file
of the remote Manager. Check for errors on the target system that indicate the
Manager or Collector process is not running, or that Collector does not have
privileges to write to the remote trail. Check for encryption errors if the ENCRYPT
option is specified with the RMTHOST or RMTHOSTOPTIONS parameter: This
might indicate that the encryption key that was sent from the source does not
match the one in the ENCKEYS file on the target, or that the key or ENCKEYS file
on the target is missing.
 
貌似是网络原因造成的,排查源库和目标库的相关参数发现,都是一致的,也没什么问题,再到目标库查看相关进程都是running状态,但是出现了一个比较明显的告警信息,如下:
 
2012-05-24 10:57:50  WARNING OGG-01223  Oracle GoldenGate Collector for Oracle:  Unable to lock file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable).  Lock currently held by process id (PID) 1731.
2012-05-24 10:57:50  WARNING OGG-01223  Oracle GoldenGate Collector for Oracle:  Unable to open file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable).
2012-05-24 10:57:50  INFO    OGG-01676  Oracle GoldenGate Collector for Oracle:  Terminating after client disconnect.
2012-05-24 10:59:22  INFO    OGG-00963  Oracle GoldenGate Manager for Oracle, mgr.prm:  Command received from EXTRACT on host (START SERVER CPU -1 PRI -1  TIMEOUT 300 PARAMS ).
2012-05-24 10:59:22  INFO    OGG-01677  Oracle GoldenGate Collector for Oracle:  Waiting for connection (started dynamically).
2012-05-24 10:59:22  INFO    OGG-00963  Oracle GoldenGate Manager for Oracle, mgr.prm:  Command received from SERVER on host localhost.localdomain (REPORT 10710 7810).
2012-05-24 10:59:22  INFO    OGG-00974  Oracle GoldenGate Manager for Oracle, mgr.prm:  Manager started collector process (Port 7810).
2012-05-24 10:59:22  INFO    OGG-01228  Oracle GoldenGate Collector for Oracle:  Timeout in 300 seconds.

 
仔细观察发现,源和目标库的相关操作都是对目标库的trail文件k1005466进行操作的,推测大致的过程是这样的:
启动pump进程后,该进程要往目标库的第5466号trail文件写数据,而此时目标库的该文件被进程编号为1731的进程所锁定,pump无法获得该文件上的锁,造成等待,延时,最终超时而abend掉(这种情况表面看来很像是由于网络原因造成的,其实不然),之后在目标库查看1731号进程的具体情况如下:
 
[oracle@trptdb ggs]$ ps -ef | grep 1731
oracle    1731     1  0 May23 ?        00:00:13 ./server -w 300 -p 7819-12818 -m 7809 -k -l /home/oracle/ggs/ggserr.log
oracle   10699 10451  0 10:58 pts/1    00:00:00 grep 1731

 
发现是ogg的server进程锁住了第5466号文件,推测应该是读取5466号文件写入ggserr log所造成吧,具体有待查证,之后干脆直接kill掉该进程,然后再次启动源库的pump进程,2端开始正常应用:
 
源库ggserr.log:
 
2012-05-24 10:52:26  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): start dpump.
2012-05-24 10:52:26  INFO    OGG-00963  Oracle GoldenGate Manager for Oracle, mgr.prm:  Command received from GGSCI on host 127.0.0.1 (START EXTRACT DPUMP ).
2012-05-24 10:52:26  INFO    OGG-00975  Oracle GoldenGate Manager for Oracle, mgr.prm:  EXTRACT DPUMP starting.
2012-05-24 10:52:26  INFO    OGG-00992  Oracle GoldenGate Capture for Oracle, dpump.prm:  EXTRACT DPUMP starting.
2012-05-24 10:52:26  INFO    OGG-00993  Oracle GoldenGate Capture for Oracle, dpump.prm:  EXTRACT DPUMP started.
2012-05-24 10:52:31  INFO    OGG-01226  Oracle GoldenGate Capture for Oracle, dpump.prm:  Socket buffer size set to 27985 (flush size 27985).
2012-05-24 10:52:32  INFO    OGG-01056  Oracle GoldenGate Capture for Oracle, dpump.prm:  Recovery initialization completed for target file /home/oracle/ggs/dirdat/k1005466, at RBA 5777739, CSN 76400322526.
2012-05-24 10:52:32  INFO    OGG-01478  Oracle GoldenGate Capture for Oracle, dpump.prm:  Output file /home/oracle/ggs/dirdat/k1 is using format RELEASE 10.4/11.1.
2012-05-24 10:52:32  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, dpump.prm:  Rolling over remote file /home/oracle/ggs/dirdat/k1005467.
2012-05-24 10:52:32  INFO    OGG-01054  Oracle GoldenGate Capture for Oracle, dpump.prm:  Recovery completed for target file /home/oracle/ggs/dirdat/k1005467, at RBA 1393, CSN 76400322526.
2012-05-24 10:52:32  INFO    OGG-01057  Oracle GoldenGate Capture for Oracle, dpump.prm:  Recovery completed for all targets.
2012-05-24 10:52:33  WARNING OGG-00869  Oracle GoldenGate Capture for Oracle, dpump.prm:  No unique key is defined for table M_GOODS_STOCK2. All viable columns will be used to represent the key, but may not guarantee uniqueness.  KEYCOLS may be used to define the key.
2012-05-24 10:52:33  WARNING OGG-00869  Oracle GoldenGate Capture for Oracle, dpump.prm:  No unique key is defined for table M_GOODS_STOCK. All viable columns will be used to represent the key, but may not guarantee uniqueness.  KEYCOLS may be used to define the key.
2012-05-24 10:52:34  WARNING OGG-00869  Oracle GoldenGate Capture for Oracle, dpump.prm:  No unique key is defined for table M_PAGE_CLICK. All viable columns will be used to represent the key, but may not guarantee uniqueness.  KEYCOLS may be used to define the key.

 
目标库ggserr.log:
 
2012-05-24 10:59:27  INFO    OGG-01229  Oracle GoldenGate Collector for Oracle:  Connected to :22983.
2012-05-24 10:59:27  INFO    OGG-01669  Oracle GoldenGate Collector for Oracle:  Opening /home/oracle/ggs/dirdat/k1005466 (byte -1, current EOF 5777739).
2012-05-24 10:59:27  INFO    OGG-01670  Oracle GoldenGate Collector for Oracle:  Closing /home/oracle/ggs/dirdat/k1005466.
2012-05-24 10:59:27  INFO    OGG-01669  Oracle GoldenGate Collector for Oracle:  Opening /home/oracle/ggs/dirdat/k1005466 (byte 5777739, current EOF 5777739).
2012-05-24 10:59:27  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005466 to disk.
2012-05-24 10:59:27  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005466 to disk.
2012-05-24 10:59:27  INFO    OGG-01670  Oracle GoldenGate Collector for Oracle:  Closing /home/oracle/ggs/dirdat/k1005466.
2012-05-24 10:59:27  INFO    OGG-01669  Oracle GoldenGate Collector for Oracle:  Opening /home/oracle/ggs/dirdat/k1005467 (byte -1, current EOF 0).
2012-05-24 10:59:27  INFO    OGG-01020  Oracle GoldenGate Delivery for Oracle, rep1.prm:  Processed extract process RESTART_ABEND record at seq 5467, rba 1333 (aborted 0 records).
2012-05-24 10:59:30  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005467 to disk.
2012-05-24 10:59:30  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005467 to disk.
2012-05-24 10:59:30  INFO    OGG-01670  Oracle GoldenGate Collector for Oracle:  Closing /home/oracle/ggs/dirdat/k1005467.
2012-05-24 10:59:30  INFO    OGG-01669  Oracle GoldenGate Collector for Oracle:  Opening /home/oracle/ggs/dirdat/k1005468 (byte -1, current EOF 0).
2012-05-24 10:59:31  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005468 to disk.
2012-05-24 10:59:31  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005468 to disk.
2012-05-24 10:59:31  INFO    OGG-01670  Oracle GoldenGate Collector for Oracle:  Closing /home/oracle/ggs/dirdat/k1005468.
2012-05-24 10:59:31  INFO    OGG-01669  Oracle GoldenGate Collector for Oracle:  Opening /home/oracle/ggs/dirdat/k1005469 (byte -1, current EOF 0).
2012-05-24 10:59:33  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005469 to disk.
2012-05-24 10:59:33  INFO    OGG-01735  Oracle GoldenGate Collector for Oracle:  Synchronizing /home/oracle/ggs/dirdat/k1005469 to disk.
2012-05-24 10:59:33  INFO    OGG-01670  Oracle GoldenGate Collector for Oracle:  Closing /home/oracle/ggs/dirdat/k1005469.
2012-05-24 10:59:33  INFO    OGG-01669  Oracle GoldenGate Collector for Oracle:  Opening /home/oracle/ggs/dirdat/k1005470 (byte -1, current EOF 0).
2012-05-24 10:59:41  WARNING OGG-00869  Oracle GoldenGate Delivery for Oracle, rep1.prm:  No unique key is defined for table 'M_GOODS_STOCK2'. All viable columns will be used to represent the key, but may not guarantee uniqueness.  KEYCOLS may be used to define the key.

 
由此看来遇到问题的时候,errlog里面会提供大量有用的排错信息,这一点也需要细心和耐心,同时可以查阅官方对错误和警告的详细说明,对解决问题很有帮助。
 
记录一下~~
 
 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/25618347/viewspace-730797/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2011-03-31

  • 博文量
    88
  • 访问量
    315805