ITPub博客

首页 > 数据库 > Oracle > 记一次RAC节点连接共享存储失败重启问题

记一次RAC节点连接共享存储失败重启问题

原创 Oracle 作者:SmartWilson 时间:2016-01-20 16:16:54 0 删除 编辑

    

问题背景:

    RAC环境
   DB版本:11.2.0.4

    因公司搬机柜原因,需要暂时关闭RAC节点,采取的方式是关闭节点一RAC服务和关机,待完成搬动后,启动节点一RAC服务;再关闭节点二RAC服务和关机,待完成搬动后,启动节点二RAC服务;
发生的问题是,节点一完成搬动后成功启动节点一相关服务,此时进行关闭节点二RAC服务和关机操作,待节点二成功关机后,客户可在节点一继续操作,但就在节点二关机不到5分钟,节点一居然自动关闭RAC服务并且重启机器了(此时时间约为2016年1月20日10:15);询问过搬机柜的人员,他们说没有碰到节点一的线,也不清楚为什么发生这样的情况,待节点一重启后,我们开始寻找原因;

    一开始个人认为是votedisk将节点一踢出,进行IO隔离从而重启节点一服务器,但是经过查看log,跟votedisk非直接原因,少废话,先看log;

节点一 grid alert-rac1.log

  1. 2016-01-20 09:38:39.292: 
    [ctssd(11368)]CRS-2408:The clock on host rac1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
    2016-01-20 09:38:57.565: 
    [crsd(11498)]CRS-1012:The OCR service started on node rac1.
    2016-01-20 09:38:57.580: 
    [evmd(11389)]CRS-1401:EVMD started on node rac1.
    2016-01-20 09:38:58.721: 
    [crsd(11498)]CRS-1201:CRSD started on node rac1.
    2016-01-20 10:08:07.231: 
    [client(16966)]CRS-4743:File /u01/11.2.0/grid/oc4j/j2ee/home/OC4J_DBWLM_config/system-jazn-data.xml was updated from OCR(Size: 13365(New), 13378(Old) bytes)
    2016-01-20 10:08:37.992: 
    [ctssd(11368)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac1.
    2016-01-20 10:08:40.775: 
    [cssd(11233)]CRS-1625:Node rac2, number 2, was manually shut down
    2016-01-20 10:08:40.784: 
    [cssd(11233)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 .
    2016-01-20 10:08:40.786: 
    [crsd(11498)]CRS-5504:Node down event reported for node 'rac2'.
    2016-01-20 10:08:40.787: 
    [crsd(11498)]CRS-2773:Server 'rac2' has been removed from pool 'Generic'.
    2016-01-20 10:08:40.787: 
    [crsd(11498)]CRS-2773:Server 'rac2' has been removed from pool 'ora.msr'.
    2016-01-20 10:15:42.490: 
    [/u01/11.2.0/grid/bin/orarootagent.bin(11616)]CRS-5822:Agent '/u01/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:5:9} in /u01/11.2.0/grid/log/rac1/agent/crsd/orarootagent_root/orarootagent_root.log.
    2016-01-20 10:15:43.416: 
    [mdnsd(11155)]CRS-5602:mDNS service stopping by request.
    2016-01-20 10:15:43.419: 
    [ctssd(11368)]CRS-2405:The Cluster Time Synchronization Service on host rac1 is shutdown by user
    2016-01-20 10:15:53.300: 
    [cssd(11233)]CRS-1603:CSSD on node rac1 shutdown by user.
    2016-01-20 10:15:53.406: 
    [ohasd(10794)]CRS-2767:Resource state recovery not attempted for 'ora.cssdmonitor' as its target state is OFFLINE
    2016-01-20 10:15:53.406: 
    [ohasd(10794)]CRS-2769:Unable to failover resource 'ora.cssdmonitor'.
    2016-01-20 10:15:55.653: 
    [gpnpd(11165)]CRS-2329:GPNPD on node rac1 shutdown. 
    2016-01-20 10:21:06.907: 
    [ohasd(10802)]CRS-2112:The OLR service started on node rac1.
    2016-01-20 10:21:07.000: 
    [ohasd(10802)]CRS-1301:Oracle High Availability Service started on node rac1.
    2016-01-20 10:21:07.037: 
    [ohasd(10802)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
    2016-01-20 10:21:11.030: 
    [/u01/11.2.0/grid/bin/orarootagent.bin(10937)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running). 
    2016-01-20 10:21:15.360: 
    [gpnpd(11195)]CRS-2328:GPNPD started on node rac1. 
    2016-01-20 10:21:17.731: 
    [cssd(11270)]CRS-1713:CSSD daemon is started in clustered mode
    2016-01-20 10:21:19.543: 
    [ohasd(10802)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
    2016-01-20 10:21:19.551: 
    [ohasd(10802)]CRS-2769:Unable to failover resource 'ora.diskmon'.
    2016-01-20 10:21:37.425: 
    [cssd(11270)]CRS-1707:Lease acquisition for node rac1 number 1 completed
    2016-01-20 10:21:38.766: 
    [cssd(11270)]CRS-1605:CSSD voting file is online: ORCL:OCR_VOTE03; details in /u01/11.2.0/grid/log/rac1/cssd/ocssd.log.
    2016-01-20 10:21:38.773: 
    [cssd(11270)]CRS-1605:CSSD voting file is online: ORCL:OCR_VOTE02; details in /u01/11.2.0/grid/log/rac1/cssd/ocssd.log.
    2016-01-20 10:21:38.784: 
    [cssd(11270)]CRS-1605:CSSD voting file is online: ORCL:OCR_VOTE01; details in /u01/11.2.0/grid/log/rac1/cssd/ocssd.log.
    2016-01-20 10:21:47.897: 
    [cssd(11270)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 .
    2016-01-20 10:21:49.968: 
    [ctssd(11535)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac1.
    ......

    可以看到在2016-01-20 10:15:43的时候节点一RAC服务开始停止,但是这并非人为操作;

节点一 DB实例 log

  1. Reconfiguration complete
    Wed Jan 20 10:08:27 2016
    Decreasing number of real time LMS from 2 to 0
    Wed Jan 20 10:12:13 2016
    SKGXP: ospid 11859: network interface with IP address ** no longer running (check cable)
    Wed Jan 20 10:15:31 2016
    Shutting down instance (abort)
    License high water mark = 83
    USER (ospid: 19115): terminating the instance
    Instance terminated by USER, pid = 19115
    Wed Jan 20 10:15:33 2016
    Instance shutdown complete
    Wed Jan 20 10:22:28 2016
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Initial number of CPU is 4
    Number of processor cores in the system is 4
    Number of processor sockets in the system is 1
    Private Interface 'bond1:1' configured from GPnP for use as a private interconnect.
      [name='bond1:1', type=1, ip=**, mac=**, net=**, mask=255.255.0.0, use=haip:cluster_interconnect/62]
    Public Interface 'bond0' configured from GPnP for use as a public interface.
      [name='bond0', type=1, ip=**, mac=**, net=**, mask=255.255.255.224, use=public/1]
    Public Interface 'bond0:1' configured from GPnP for use as a public interface.
      [name='bond0:1', type=1, ip=**, mac=**, net=**, mask=255.255.255.224, use=public/1]
    Public Interface 'bond0:2' configured from GPnP for use as a public interface.
      [name='bond0:2', type=1, ip=**, mac=**, net=**, mask=255.255.255.224, use=public/1]
    Public Interface 'bond0:3' configured from GPnP for use as a public interface.
      [name='bond0:3', type=1, ip=**, mac=**, net=**, mask=255.255.255.224, use=public/1]
    Public Interface 'bond0:4' configured from GPnP for use as a public interface.
      [name='bond0:4', type=1, ip=**, mac=**, net=**, mask=255.255.255.224, use=public/1]
    Public Interface 'bond0:5' configured from GPnP for use as a public interface.
      [name='bond0:5', type=1, ip=**, mac=**, net=**, mask=255.255.255.224, use=public/1]
    CELL communication is configured to use 0 interface(s):
    CELL IP affinity details:
        NUMA status: non-NUMA system
        cellaffinity.ora status: N/A
    CELL communication will use 1 IP group(s):
        Grp 0: 
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
    WARNING: db_recovery_file_dest is same as db_create_online_log_dest_2
    Autotune of undo retention is turned on. 
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    Starting up:
    Oracle Database 11g Release 11.2.0.4.0 - 64bit Production
    With the Real Application Clusters option.
    ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
    System name: Linux
    ......

    在2016-01-20 10:15:31 节点一DB实例被关闭;此时觉得是否真的有人人为关闭DB实例并重启?经查,在这段时间里,只有我这台机器的IP登录过节点一的服务器,但实际上我在这个时刻并没有对节点一做任何操作,此时我们再去看节点一的ASM log;

节点一 +ASM1实例 log


  1. Reconfiguration complete
    Wed Jan 20 10:12:35 2016
    SKGXP: ospid 11426: network interface with IP address ** no longer running (check cable)
    Wed Jan 20 10:15:31 2016
    SQL> ALTER DISKGROUP OCR_VOTE DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    Wed Jan 20 10:15:33 2016
    SQL> ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    Wed Jan 20 10:15:33 2016
    NOTE: ASM client RACDB1:RACDB disconnected unexpectedly.
    NOTE: check client alert log.
    Wed Jan 20 10:15:33 2016
    SQL> ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    NOTE: Trace records dumped in trace file /u01/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_11891.trc
    NOTE: cache dismounting (clean) group 2/0x82B9443F (DBFRA) 
    NOTE: cache dismounting (clean) group 1/0x82B9443E (DBDATA) 
    NOTE: messaging CKPT to quiesce pins Unix process pid: 11666, image: oracle@rac1 (TNS V1-V3)
    NOTE: messaging CKPT to quiesce pins Unix process pid: 11668, image: oracle@rac1 (TNS V1-V3)
    Wed Jan 20 10:15:33 2016
    NOTE: LGWR doing clean dismount of group 2 (DBFRA)
    NOTE: LGWR closing thread 1 of diskgroup 2 (DBFRA) at ABA 27.10488
    NOTE: LGWR released thread recovery enqueue
    NOTE: LGWR doing clean dismount of group 1 (DBDATA)
    NOTE: LGWR closing thread 1 of diskgroup 1 (DBDATA) at ABA 16.3387
    NOTE: LGWR released thread recovery enqueue
    Wed Jan 20 10:15:33 2016
    freeing rdom 2
    Wed Jan 20 10:15:34 2016
    NOTE: detached from domain 2
    NOTE: cache dismounted group 2/0x82B9443F (DBFRA) 
    Wed Jan 20 10:15:34 2016
    GMON dismounting group 2 at 10 for pid 30, osid 11666
    NOTE: Disk ASM_FRA01 in mode 0x7f marked for de-assignment
    SUCCESS: diskgroup DBFRA was dismounted
    NOTE: cache deleting context for group DBFRA 2/0x82b9443f
    SUCCESS: ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBFRA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBFRA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBFRA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBFRA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBFRA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBFRA DISMOUNT  /* asm agent *//* {1:57967:201} */
    freeing rdom 1
    NOTE: detached from domain 1
    NOTE: cache dismounted group 1/0x82B9443E (DBDATA) 
    GMON dismounting group 1 at 11 for pid 31, osid 11668
    NOTE: Disk ASM_DATA01 in mode 0x7f marked for de-assignment
    SUCCESS: diskgroup DBDATA was dismounted
    NOTE: cache deleting context for group DBDATA 1/0x82b9443e
    SUCCESS: ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBDATA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBDATA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBDATA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBDATA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */
    SQL> ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */ 
    ORA-15032: not all alterations performed
    ORA-15001: diskgroup "DBDATA" does not exist or is not mounted
    ERROR: ALTER DISKGROUP DBDATA DISMOUNT  /* asm agent *//* {1:57967:201} */
    Wed Jan 20 10:15:41 2016
    NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 3
    Wed Jan 20 10:15:41 2016
    ORA-15032: not all alterations performed
    ORA-15027: active use of diskgroup "OCR_VOTE" precludes its dismount
    ERROR: ALTER DISKGROUP OCR_VOTE DISMOUNT  /* asm agent *//* {1:57967:201} */
    Wed Jan 20 10:15:42 2016
    NOTE: client exited [11498]
    Wed Jan 20 10:15:43 2016
    NOTE: ASMB process exiting due ASM instance shutdown (inactive for 1 seconds)
    Wed Jan 20 10:15:43 2016
    Shutting down instance (immediate)
    Shutting down instance: further logons disabled
    Stopping background process MMNL
    Stopping background process MMON
    License high water mark = 12
    SQL> ALTER DISKGROUP ALL DISMOUNT /* asm agent *//* {0:0:169} */ 
    NOTE: cache dismounting (clean) group 3/0x82C94440 (OCR_VOTE) 
    NOTE: messaging CKPT to quiesce pins Unix process pid: 11495, image: oracle@rac1 (TNS V1-V3)
    Wed Jan 20 10:15:45 2016
    NOTE: LGWR doing clean dismount of group 3 (OCR_VOTE)
    NOTE: LGWR closing thread 1 of diskgroup 3 (OCR_VOTE) at ABA 9.73
    NOTE: LGWR released thread recovery enqueue
    Wed Jan 20 10:15:45 2016
    freeing rdom 3
    Wed Jan 20 10:15:45 2016
    NOTE: detached from domain 3
    NOTE: cache dismounted group 3/0x82C94440 (OCR_VOTE) 
    Wed Jan 20 10:15:45 2016
    GMON dismounting group 3 at 13 for pid 23, osid 11495
    NOTE: Disk OCR_VOTE01 in mode 0x7f marked for de-assignment
    NOTE: Disk OCR_VOTE02 in mode 0x7f marked for de-assignment
    NOTE: Disk OCR_VOTE03 in mode 0x7f marked for de-assignment
    SUCCESS: diskgroup OCR_VOTE was dismounted
    NOTE: cache deleting context for group OCR_VOTE 3/0x82c94440
    SUCCESS: ALTER DISKGROUP ALL DISMOUNT /* asm agent *//* {0:0:169} */
    Wed Jan 20 10:15:46 2016
    Stopping background process VKTM
    Wed Jan 20 10:15:48 2016
    freeing rdom 0
    Wed Jan 20 10:15:52 2016
    Instance shutdown complete
    ......

在2016-01-20 10:15:33 节点一ASM实例出现disconnected unexpectedly,猜测共享存储是否出现问题,导致节点重启?????此时我们再看OS层面上的一份log;

 /var/log/messages

节点一 /var/log/messages

  1. Jan 20 10:10:39 rac1 kernel: pcieport 0000:00:02.0: eth2: Link is down
    Jan 20 10:10:40 rac1 kernel: bonding: bond1: link status definitely down for interface eth2, disabling it
    Jan 20 10:10:40 rac1 kernel: bonding: bond1: making interface eth3 the new active one.
    Jan 20 10:10:40 rac1 kernel: pcieport 0000:00:02.0: eth3: Link is down
    Jan 20 10:10:40 rac1 kernel: bonding: bond1: link status definitely down for interface eth3, disabling it
    Jan 20 10:10:40 rac1 kernel: bonding: bond1: now running without any active interface !
    Jan 20 10:10:42 rac1 kernel: pcieport 0000:00:02.0: eth2: Link is up at 1000 Mbps, full duplex
    Jan 20 10:10:42 rac1 kernel: pcieport 0000:00:02.0: eth2: Flow control is on for TX and on for RX
    Jan 20 10:10:42 rac1 kernel: pcieport 0000:00:02.0: eth2: EEE is disabled
    Jan 20 10:10:43 rac1 kernel: bonding: bond1: link status definitely up for interface eth2, 1000 Mbps full duplex.
    Jan 20 10:10:43 rac1 kernel: bonding: bond1: making interface eth2 the new active one.
    Jan 20 10:10:43 rac1 kernel: bonding: bond1: first active interface up!
    Jan 20 10:10:43 rac1 kernel: pcieport 0000:00:02.0: eth3: Link is up at 1000 Mbps, full duplex
    Jan 20 10:10:43 rac1 kernel: pcieport 0000:00:02.0: eth3: Flow control is on for TX and on for RX
    Jan 20 10:10:43 rac1 kernel: pcieport 0000:00:02.0: eth3: EEE is disabled
    Jan 20 10:10:43 rac1 kernel: bonding: bond1: link status definitely up for interface eth3, 1000 Mbps full duplex.
    Jan 20 10:10:45 rac1 kernel: pcieport 0000:00:02.0: eth3: Link is down
    Jan 20 10:10:46 rac1 kernel: bonding: bond1: link status definitely down for interface eth3, disabling it
    Jan 20 10:10:46 rac1 kernel: pcieport 0000:00:02.0: eth2: Link is down
    Jan 20 10:10:46 rac1 kernel: bonding: bond1: link status definitely down for interface eth2, disabling it
    Jan 20 10:10:46 rac1 kernel: bonding: bond1: now running without any active interface !
    Jan 20 10:10:48 rac1 kernel: pcieport 0000:00:02.0: eth3: Link is up at 100 Mbps, full duplex
    Jan 20 10:10:48 rac1 kernel: pcieport 0000:00:02.0: eth3: Flow control is on for TX and on for RX
    Jan 20 10:10:48 rac1 kernel: pcieport 0000:00:02.0: eth3: EEE is disabled
    Jan 20 10:10:48 rac1 kernel: bonding: bond1: link status definitely up for interface eth3, 100 Mbps full duplex.
    Jan 20 10:10:48 rac1 kernel: bonding: bond1: making interface eth3 the new active one.
    Jan 20 10:10:48 rac1 kernel: bonding: bond1: first active interface up!
    Jan 20 10:10:48 rac1 kernel: pcieport 0000:00:02.0: eth2: Link is up at 100 Mbps, full duplex
    Jan 20 10:10:48 rac1 kernel: pcieport 0000:00:02.0: eth2: Flow control is on for TX and on for RX
    Jan 20 10:10:48 rac1 kernel: pcieport 0000:00:02.0: eth2: EEE is disabled
    Jan 20 10:10:48 rac1 kernel: bonding: bond1: link status definitely up for interface eth2, 100 Mbps full duplex.
    Jan 20 10:10:48 rac1 kernel: bonding: bond1: making interface eth2 the new active one.
    Jan 20 10:11:39 rac1 kernel: pcieport 0000:00:02.0: eth3: Link is down
    Jan 20 10:11:39 rac1 kernel: pcieport 0000:00:02.0: eth2: Link is down
    Jan 20 10:11:39 rac1 kernel: bonding: bond1: link status definitely down for interface eth2, disabling it
    Jan 20 10:11:39 rac1 kernel: bonding: bond1: now running without any active interface !
    Jan 20 10:11:39 rac1 kernel: bonding: bond1: link status definitely down for interface eth3, disabling it
    Jan 20 10:15:31 rac1 init: oracle-tfa main process (9074) killed by TERM signal
    Jan 20 10:15:31 rac1 init: oracle-ohasd main process (9075) killed by TERM signal
    Jan 20 10:15:31 rac1 init: tty (/dev/tty2) main process (10827) killed by TERM signal
    Jan 20 10:15:31 rac1 init: tty (/dev/tty3) main process (10829) killed by TERM signal
    Jan 20 10:15:31 rac1 init: tty (/dev/tty4) main process (10831) killed by TERM signal
    Jan 20 10:15:31 rac1 init: tty (/dev/tty5) main process (10833) killed by TERM signal
    Jan 20 10:15:31 rac1 init: tty (/dev/tty6) main process (10835) killed by TERM signal
    Jan 20 10:15:58 rac1 abrtd: Got signal 15, exiting
    Jan 20 10:16:01 rac1 acpid: exiting
    Jan 20 10:16:02 rac1 multipathd: dm-7: remove map (uevent)
    Jan 20 10:16:02 rac1 multipathd: dm-7: devmap not registered, can't remove
    Jan 20 10:16:02 rac1 multipathd: dm-7: remove map (uevent)
    Jan 20 10:16:02 rac1 multipathd: dm-7: devmap not registered, can't remove
    Jan 20 10:16:02 rac1 multipathd: MEG1_OracleOS_Lun0: event checker exit
    Jan 20 10:16:02 rac1 multipathd: dm-2: remove map (uevent)
    Jan 20 10:16:02 rac1 multipathd: dm-2: devmap not registered, can't remove
    ......

    问题已经很明显了,是因为光纤网线连接交换机、共享存储时出现问题;
    可能因为搬机柜时人员不小心起飞脚碰到网线,导致网线松动原因,从而使节点一无法连接到共享存储导致节点重启;



来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/30220976/viewspace-1979652/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2015-04-24

  • 博文量
    34
  • 访问量
    85639