ITPub博客

首页 > Linux操作系统 > Linux操作系统 > DB2RELOCATEDB does not update the eventmonitor information

DB2RELOCATEDB does not update the eventmonitor information

原创 Linux操作系统 作者:fengjin821 时间:2009-09-18 10:58:50 0 删除 编辑
Problem summary
USERS AFFECTED:  ALL
PROBLEM DESCRIPTION:  The DB2RELOCATEDB tool does not update the
necessary meta. information for an associated event monitor when
a database is renamed or the database path is changed.  So the
event monitor does not start after moving a database with the
db2relocateDB tool.
Problem conclusion
Temporary fix 
Since the database path for the event monitor is not updated
when using the DB2RELOCATEDB tool to modify the database path
for a database.  And it is not possible to manually edit the
database path for the event monitor, you will need to drop and
recreate the event monitor.
 
 
自从看了wang很多的经典案例后,很是佩服,觉得db2trc是个好东东,
一直想尝试用db2trc解决问题,不过一来这玩意比较高深,许多内部函数,
返回值都不知道,另外也没有合适案例,所以一直未能得逞。
今天总算逮着一个机会搞了一把db2trc,虽然问题很小,不如wang的那么精彩纷呈,
但总算是自己的一点小收获,因此把过程贴出来与大家分享一下。

场景如下:

两个实例db2inst2, db2inst3
数据库mytestdb,
某一天,我用db2relocatedb把 mytestdb 从db2inst2迁到了db2inst3下。
(db2 catalog 目录的内容是tar过去的)。
然后平安度过好多天,

再某一天,我需要抓一些数据库的deadlock情况
我想起mytestdb中有个deadlock monitor DB2DETAILDEADLOCK,
但当我试图打开的时候报如下错:

$db2 "set event monitor DB2DETAILDEADLOCK state 1"                                                                                                        
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL1618N  The target path of the specified event monitor is in use by another
event monitor.  SQLSTATE=51026

db2diag.log里的错误如下:

2009-09-17-17.53.53.411328+480 I710483C334        LEVEL: Warning
PID     : 344552               TID  : 1           PROC : db2event (DB2DETAILDEADLOCK) 0
INSTANCE: db2inst3             NODE : 000
FUNCTION: DB2 UDB, database monitor, sqm___sqlmeiot, probe:70
MESSAGE : target type 0 failed to open DB2DETAILDEADLOCK with rc -2146631636

2009-09-17-17.53.53.411948+480 I710818C407        LEVEL: Severe
PID     : 451068               TID  : 1           PROC : db2agent (MYTESTDB) 0
INSTANCE: db2inst3             NODE : 000         DB   : MYTESTDB
APPHDL  : 0-48                 APPID: *LOCAL.db2inst3.090917093706
FUNCTION: DB2 UDB, database monitor, sqm___start_evmon, probe:40
RETCODE : ZRC=0x800D002C=-2146631636=SQLM_RC_EVPATH "path in use"

2009-09-17-17.53.53.412376+480 I711226C464        LEVEL: Severe
PID     : 451068               TID  : 1           PROC : db2agent (MYTESTDB) 0
INSTANCE: db2inst3             NODE : 000         DB   : MYTESTDB
APPHDL  : 0-48                 APPID: *LOCAL.db2inst3.090917093706
MESSAGE : event monitor name
DATA #1 : Hexdump, 17 bytes
0x2005AA08 : 4442 3244 4554 4149 4C44 4541 444C 4F43    DB2DETAILDEADLOC
0x2005AA18 : 4B                                         K

2009-09-17-17.53.53.412570+480 I711691C688        LEVEL: Severe
PID     : 451068               TID  : 1           PROC : db2agent (MYTESTDB) 0
INSTANCE: db2inst3             NODE : 000         DB   : MYTESTDB
APPHDL  : 0-48                 APPID: *LOCAL.db2inst3.090917093706
MESSAGE : event monitor target
DATA #1 : Hexdump, 68 bytes
0x2005AA30 : 2F68 6F6D 652F 6462 3269 6E73 7433 2F64    /home/db2inst3/d
0x2005AA40 : 6232 696E 7374 332F 4E4F 4445 3030 3030    b2inst3/NODE0000
0x2005AA50 : 2F53 514C 3030 3030 312F 6462 3265 7665    /SQL00001/db2eve
0x2005AA60 : 6E74 2F64 6232 6465 7461 696C 6465 6164    nt/db2detaildead
0x2005AA70 : 6C6F 636B                                  lock


本来想直接drop 再create,不过解决问题不能不求甚解,因此决定用db2trc看一下啥问题。
于是:

db2trc on -t -f db2trc.dump;db2 "set event monitor DB2DETAILDEADLOCK state 1";db2trc off
db2trc flw db2trc.dump db2trc.flw;
db2trc fmt db2trc.dump db2trc.fmt;

然后打开db2diag.log
直接拿sqm___sqlmeiot去db2trc.flw去找,找到如下几行:
sqm___sqlmeiot entry
1859        | sqloGetEnvInternal entry
1860        | sqloGetEnvInternal exit [rc = 0x870F0104 = -2029059836 = RC_ENV_NOT_FOUND]
1861        | sqm___iot::que_init entry
1862        | | sqloCSemCreate entry
1863        | | sqloCSemCreate exit
1864        | sqm___iot::que_init exit
1865        | sqlogmblkEx entry
1866        | sqlogmblkEx mbt [Marker:PD_OSS_ALLOCATED_MEMORY ]
1867        | sqlogmblkEx exit
1868        | sqm___sqm_evmon_ftarget entry
1869        | | sqlogmblkEx entry
1870        | | sqlogmblkEx mbt [Marker:PD_OSS_ALLOCATED_MEMORY ]
1871        | | sqlogmblkEx exit
1872        | | sqm___sqm_evmon_ctrlf entry
1873        | | sqm___sqm_evmon_ctrlf data [probe 1]
1874        | | | sqloopenp entry
1875        | | | sqloopenp data [probe 30]
1876        | | | sqloopenp exit [rc = 0x070F00CB = 118423755 = SQLO_FREV]
1877        | | | sqloread entry
1878        | | | sqloread exit
1879        | | sqm___sqm_evmon_ctrlf data [probe 25]
1880        | | | sqloclose entry
1881        | | | sqloclose exit
1882        | | sqm___sqm_evmon_ctrlf data [probe 30]
1883        | | sqm___sqm_evmon_ctrlf exit [rc = 0x800D002C = -2146631636 = SQLM_RC_EVPATH]
1884        | sqm___sqm_evmon_ftarget exit [rc = 0x800D002C = -2146631636 = SQLM_RC_EVPATH]
1885        | sqlt_logerr_str entry
1886        | sqlt_logerr_str data [probe 10]
1887        | | pdLog entry
1888        | | | _pdlogInt entry
1889        | | | | sqltGetDiagPath entry
1890        | | | | sqltGetDiagPath data [probe 10]
1891        | | | | sqltGetDiagPath exit
1892        | | | _pdlogInt data [probe 130]
1893        | | | | sqltGetDiagPath entry
1894        | | | | sqltGetDiagPath data [probe 10]
1895        | | | | sqltGetDiagPath exit
1896        | | | | ossPathExists entry

在这里看到了rc = 0x800D002C = -2146631636,跟diag.log一致,估计应该就是这里有问题了,
1874        | | | sqloopenp entry
1875        | | | sqloopenp data [probe 30]
1876        | | | sqloopenp exit [rc = 0x070F00CB = 118423755 = SQLO_FREV]

看上去像打开什么文件出错,看看fmt文件里到底是什么文件?
fmt内容如下:

1874    entry DB2 UDB oper system services sqloopenp cei (1.3.15.811.2)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99027637
        bytes 113

        Data1   (PD_TYPE_FILE_NAME,81) File name:
        /home/db2inst3/db2inst3/NODE0000/SQL00001/db2event/db2detaildeadlock/db2event.ctl
        Data2   (PD_TYPE_UINT,4) unsigned integer:
        95
        Data3   (PD_TYPE_UINT,4) unsigned integer:
        416

1875    data DB2 UDB oper system services sqloopenp cei (3.3.15.811.2.30)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99076076 probe 30
        bytes 24

        Data1   (PD_TYPE_HEXINT,4) Hex integer:
        0x00401102
        Data2   (PD_TYPE_HEXINT,4) Hex integer:
        0x00000000

1876    exit DB2 UDB oper system services sqloopenp cei (2.3.15.811.2)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99097253
        rc = 0x070F00CB = 118423755 = SQLO_FREV
        bytes 16

        Data1   (PD_TYPE_SQO_FILE_HDL,8) File handle:
        0000 0003 0000 0000                        ........

哦,原来是这个文件/home/db2inst3/db2inst3/NODE0000/SQL00001/db2event/db2detaildeadlock/db2event.ctl,
难道这个文件不存在?
ls -l /home/db2inst3/db2inst3/NODE0000/SQL00001/db2event/db2detaildeadlock/db2event.ctl看一下
-rw-r-----   1 db2inst3 staff           300 Aug 31 16:02 /home/db2inst3/db2inst3/NODE0000/SQL00001/db2event/db2detaildeadlock/db2event.ctl
文件好好的在这里啊,
怎么回事,
再看看错误返回值是 SQLO_FREV,这个东西不知道是啥意思,google了一下,有个链接里说是file aready exsits,不过不敢确定。
想起wang的很多例子,这个时候应该继续往下看,
1879        | | sqm___sqm_evmon_ctrlf data [probe 25]
1880        | | | sqloclose entry
1881        | | | sqloclose exit
1882        | | sqm___sqm_evmon_ctrlf data [probe 30]
1883        | | sqm___sqm_evmon_ctrlf exit [rc = 0x800D002C = -2146631636 = SQLM_RC_EVPATH]
再去对应的fmt里看
1879    data DB2 UDB database monitor sqm___sqm_evmon_ctrlf fnc (3.3.13.215.0.25)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99112551 probe 25
        bytes 308

        Data1   (PD_TYPE_DEFAULT,300) Hexdump:
      
0000 0000 4442 3244 4554 4149 4C44 4541    ....DB2DETAILDEA
        444C 4F43 4B20 2020 4D59 5445 5354 4442    DLOCK   MYTESTDB
        2020 2020 2020 2020 2020 2020 2F68 6F6D                /hom
        652F 6462 3269 6E73 7432 2F64 6232 696E    e/db2inst2/db2in
        7374 322F 4E4F 4445 3030 3030 2F53 514C    st2/NODE0000/SQL
        3030 3030 312F 2020 2020 2020 2020 2020    00001/

        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020 2020 2020
        2020 2020 2020 2020 2020 2020


1880    entry DB2 UDB oper system services sqloclose cei (1.3.15.674.2)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99114186
        bytes 16

        Data1   (PD_TYPE_SQO_FILE_HDL,8) File handle:
        0000 0003 0000 0000                        ........


1881    exit DB2 UDB oper system services sqloclose cei (2.3.15.674.2)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99117110
        rc = 0

1882    data DB2 UDB database monitor sqm___sqm_evmon_ctrlf fnc (3.3.13.215.0.30)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99117388 probe 30
        bytes 12

        Data1   (PD_TYPE_DEFAULT,4) Hexdump:
        800D 002C                                  ...,


1883    exit DB2 UDB database monitor sqm___sqm_evmon_ctrlf fnc (2.3.13.215.0)

1884    exit DB2 UDB database monitor sqm___sqm_evmon_ftarget fnc (2.3.13.79.0)
        pid 680346 tid 1 cpid -1 node 0 sec 0 nsec 99118476
        rc = 0x800D002C = -2146631636 = SQLM_RC_EVPATH
终于发现了点有用的东西
....DB2DETAILDEADLOCK   MYTESTDB
/home/db2inst2/db2inst2/NODE0000/SQL00001/
db2去db2inst2里干嘛?再看看错误信息      
The target path of the specified event monitor is in use by another
event monitor,错误函数 sqm___sqm_evmon_ftarget,错误码 SQLO_FREV
难道是想从db2inst2把db2event.ctl拷到db2inst3来,而文件又已经存在了?
不管,试一下,
mv /home/db2inst3/db2inst3/NODE0000/SQL00001/db2event/db2detaildeadlock/db2event.ctl  mv /home/db2inst3/db2inst3/NODE0000/SQL00001/db2event/db2detaildeadlock/db2event.ctl.bak
再执行一下
db2 set event monitor DB2DETAILDEADLOCK state 1
果然成功了。

带来问题,
1、SQLO_FREV可能就是file aready exsits的意思。
2、不知道为什么db2要去原来的instance里拷db2event.ctl文件。
这样的话,如果relocate到另一台机器上去,原先create的event岂不是都不能用了
 
 
one comment:

1858        sqm___sqlmeiot entry
1859        | sqloGetEnvInternal entry
1860        | sqloGetEnvInternal exit [rc = 0x870F0104 = -2029059836 = RC_ENV_NOT_FOUND]
1861        | sqm___iot::que_init entry
1862        | | sqloCSemCreate entry
1863        | | sqloCSemCreate exit
1864        | sqm___iot::que_init exit
1865        | sqlogmblkEx entry
1866        | sqlogmblkEx mbt [Marker:PD_OSS_ALLOCATED_MEMORY ]
1867        | sqlogmblkEx exit
1868        | sqm___sqm_evmon_ftarget entry
1869        | | sqlogmblkEx entry
1870        | | sqlogmblkEx mbt [Marker:PD_OSS_ALLOCATED_MEMORY ]
1871        | | sqlogmblkEx exit
1872        | | sqm___sqm_evmon_ctrlf entry
1873        | | sqm___sqm_evmon_ctrlf data [probe 1] 
 ---> used to open the current control file
1874        | | | sqloopenp entry     -----------------> try to open the file
1875        | | | sqloopenp data [probe 30]
1876        | | | sqloopenp exit [rc = 0x070F00CB = 118423755 = SQLO_FREV] 
 ---------> file already exist
1877        | | | sqloread entry   
--------------> read the file
1878        | | | sqloread exit
1879        | | sqm___sqm_evmon_ctrlf data [probe 25]   
-----------> show the information being read
1880        | | | sqloclose entry      
------------> close the file
1881        | | | sqloclose exit
1882        | | sqm___sqm_evmon_ctrlf data [probe 30]
-----------> if data path in the control file not match the current db path, something wrong
1883        | | sqm___sqm_evmon_ctrlf exit [rc = 0x800D002C = -2146631636 = SQLM_RC_EVPATH]
1884        | sqm___sqm_evmon_ftarget exit [rc = 0x800D002C = -2146631636 = SQLM_RC_EVPATH]


So the problem is that after doing db2relocatedb, the database path in the control file was not changed, and still point to inst2

that means, when db2 trying to open the deadlock control file it still found the path pointing to inst2, so it confused and didn't allow monitor to start-up...
 

 
 
 
 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/13165828/viewspace-614971/,如需转载,请注明出处,否则将追究法律责任。

下一篇: db2 创建工具目录
请登录后发表评论 登录
全部评论

注册时间:2009-04-29

  • 博文量
    191
  • 访问量
    505113