ITPub博客

首页 > 数据库 > Oracle > 5分钟,彻底精通Oracle DG切换

5分钟,彻底精通Oracle DG切换

原创 Oracle 作者:chenoracle 时间:2022-09-11 09:24:24 0 删除 编辑

由于博客排版问题,详细信息见我的微信公众号"IT小Chen"

没错,就是标题党,5分钟就想精通Oracle DG切换了吗?恐怕原理知识都掌握不了,更别说实操部分了,不要相信这些鬼话,太容易得到的知识你也不会去珍惜。


先看几个思考题?


1.主库切换为备库,成功执行如下命令后,原主库实例是什么状态?


alter database commit to switchover to physical standby with session shutdown;


A.shutdown B.nomount C.mount D.read only 


2.备库切换为主库,成功执行如下命令后,原备库实例是什么状态?


alter database commit to switchover to primary with session shutdown;


A.shutdown B.nomount C.mount D.oread write 


3.配置了dg broker以后,没有开启FSFO,是否还可以正常通过之前的SQL语句进行dg切换呢?


A.可以 B.不可以


4.配置了dg broker并且启用了FSFO,是否还可以正常通过之前的SQL语句进行dg切换呢?


A.可以 B.不可以


5.配置了dg broker并且启用了FSFO,主库执行shutdown immediate后,会自动进行主备切换吗?


A.会 B.不会


本篇文章主要讲解如下内容:


一:通过sqlplus进行switchover
二:通过dg broker进行switchover
三:通过dg broker进行自动failover
四:通过keepalived进行vip自动切换

一:通过sqlplus工具进行switchover


主机信息


172.16.6.137 cjc-db-01 

172.16.6.138 cjc-db-02

OS:Redhat 7.9

DB:Oracle 11.2.0.4.0 单机

切换前检查


1.检查数据库基本信息


set lin 200 pages 100
col FLASHBACK_ON for a10
col current_scn for 99999999999999
col open_mode for a10
col SWITCHOVER_STATUS for a20
col PROTECTION_MODE for a20
col name for a10
select name,current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


主库:


NAME       CURRENT_SCN PROTECTION_MODEDATABASE_ROLE FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

---------- --------------- -------------------- ---------------- --- ---------- ---------- --------------------

CJC    978784 MAXIMUM PERFORMANCEPRIMARY  YES NO READ WRITE TO STANDBY

从库:


NAME       CURRENT_SCN PROTECTION_MODEDATABASE_ROLE FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

---------- --------------- -------------------- ---------------- --- ---------- ---------- --------------------

CJC    971317 MAXIMUM PERFORMANCEPHYSICAL STANDBY YES NO READ ONLY  NOT ALLOWED

从库的OPEN_MODE为READ ONLY,切换前需要开启MRP。


ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;


SWITCHOVER_STATUS说明:


NOT ALLOWED             当前的数据库不是带有备用数据库的主数据库

PREPARING DICTIONARY    该逻辑备用数据库正在向一个主数据库和其他备用数据库发送它的重做数据,以便为切换做准备

PREPARING SWITCHOVER    接受用于切换的重做数据时,逻辑备用配置会使用它

RECOVERY NEEDED         备用数据库还没有接收到切换请求

SESSIONS ACTIVE         在主数据库中存在活动的SQL会话;在继续执行之前必须断开这些会话

SWITCHOVER PENDING      适用于那些已收到主数据库切换请求但是还没有处理该请求的备用数据库

SWITCHOVER LATENT       切换没有完成并返回到主数据库

TO LOGICAL STANDBY      主数据库已经收到了来自逻辑备用数据库的完整的字典

TO PRIMARY              该备用数据库可以转换为主数据库

TO STANDBY              该主数据库可以转换为备用数据库

2.检查DG参数


set linesize 500 pages 100
col value for a70 
col name for a30 
select name, value 
from v$parameter 
where name in ('db_name','db_unique_name', 
'log_archive_config', 
'log_archive_dest_1','log_archive_dest_2', 
'log_archive_dest_state_1', 'log_archive_dest_3', 
'log_archive_dest_state_3', 
'log_archive_dest_state_2', 
'remote_login_passwordfile', 
'log_archive_format', 
'log_archive_max_processes', 
'fal_server','db_file_name_convert', 
'log_file_name_convert', 
'standby_file_management');


主库:


NAME       VALUE

------------------------------ ----------------------------------------------------------------------

db_file_name_convert       cjc2, cjc1

log_file_name_convert       cjc2, cjc1

log_archive_dest_1       LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc1

log_archive_dest_2       SERVICE=cjc2 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_

               NAME=cjc2



log_archive_dest_3

log_archive_dest_state_1       ENABLE

log_archive_dest_state_2       ENABLE

log_archive_dest_state_3       enable

fal_server               cjc2

log_archive_config       DG_CONFIG=(cjc1,cjc2)

log_archive_format       cjc_%t_%s_%r.arc

log_archive_max_processes      4

standby_file_management        AUTO

remote_login_passwordfile      EXCLUSIVE

db_name            cjc

db_unique_name       cjc1


16 rows selected.

从库:


NAME       VALUE

------------------------------ ----------------------------------------------------------------------

db_file_name_convert       cjc1, cjc2

log_file_name_convert       cjc1, cjc2

log_archive_dest_1       LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc2

log_archive_dest_2       SERVICE=cjc1 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_

       NAME=cjc1



log_archive_dest_3

log_archive_dest_state_1       ENABLE

log_archive_dest_state_2       ENABLE

log_archive_dest_state_3       enable

fal_server           cjc1

log_archive_config       DG_CONFIG=(cjc2,cjc1)

log_archive_format       cjc_%t_%s_%r.arc

log_archive_max_processes      4

standby_file_management        AUTO

remote_login_passwordfile      EXCLUSIVE

db_name            cjc

db_unique_name       cjc2


16 rows selected.

3.检查DG进程状态


col dest_name for a30
col error for a20
set lin 200 pages 100
col applied_scn for 9999999999999
select dest_id,error,status,log_sequence,applied_scn,MAX_CONNECTIONS,NET_TIMEOUT,COMPRESSION from v$archive_dest where dest_id<5;


主库:


 DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID    18    0    10 DISABLE

 2VALID    19       9713171   30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

从库:


  DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID    18    0    10 DISABLE

 2VALID     0    0    1   30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

4.检查进程状态


select INST_ID,process,status,thread#,sequence#,block#,blocks from gv$managed_standby order by INST_ID ;


主库:


   INST_ID PROCESS   STATUS     THREAD#  SEQUENCE#     BLOCK#     BLOCKS

---------- --------- ------------ ---------- ---------- ---------- ----------

 1 ARCH      CLOSING   1     19      20480  831

 1 ARCH      CLOSING   1     17  1   75

 1 LNS     WRITING   1     20        168    1

 1 ARCH      CLOSING   1     18  1   22

 1 ARCH      CLOSING   1     14  1 2404

从库:


   INST_ID PROCESS   STATUS     THREAD#  SEQUENCE#     BLOCK#     BLOCKS

---------- --------- ------------ ---------- ---------- ---------- ----------

 1 ARCH      CONNECTED   0      0  0    0

 1 ARCH      CONNECTED   0      0  0    0

 1 ARCH      CLOSING   1     18  1   22

 1 ARCH      CLOSING   1     19      20480  831

 1 MRP0      APPLYING_LOG   1     20        152     102400

 1 RFS     IDLE   1     20        152    1

 1 RFS     IDLE   0      0  0    0

 1 RFS     IDLE   0      0  0    0

 1 RFS     IDLE   0      0  0    0


9 rows selected.

5.检查从库应用状态


set lines 200
col dest_name for a30
select DEST_ID,DEST_NAME,RECOVERY_MODE from gv$archive_dest_status where RECOVERY_MODE <>'IDLE';


  DEST_ID DEST_NAME  RECOVERY_MODE

---------- ------------------------------ -----------------------

 1 LOG_ARCHIVE_DEST_1  MANAGED REAL TIME APPLY

6.gap检查


主库:


SELECT LOG_ARCHIVED-LOG_APPLIED+1 LOGGAP FROM (SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES' AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#)
FROM V$ARCHIVED_LOG)), (SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES'
AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#) FROM V$ARCHIVED_LOG)) ;


    LOGGAP

----------

 1

7.主库创建测试数据,检查数据是否正常同步


SQL> col name for a50

SQL> select name from v$dbfile;

NAME

--------------------------------------------------

/oradata/cjc/users01.dbf

/oradata/cjc/undotbs01.dbf

/oradata/cjc/sysaux01.dbf

/oradata/cjc/system01.dbf


SQL> select * from cjc.t1;

ID

----------

 1

8.主切备


sqlplus / as sysdba
SELECT trim(DATABASE_ROLE) DBROLE FROM v$DATABASE;
DBROLE
----------------
PRIMARY
alter system switch logfile;
alter system flush SHARED_POOL;
ALTER SYSTEM SET log_archive_trace=8191 sid='*';


执行命令后,原主库自动关闭实例


alter database commit to switchover to physical standby with session shutdown;


对应切换日志如下:


Tue Sep 06 14:15:08 2022
alter database commit to switchover to physical standby with session shutdown
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 21460] (cjc1)
krss_find_arc: Selecting ARC2 to receive message as last resort
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Waiting for all FAL entries to be archived...
All FAL entries have been archived.
Waiting for potential Physical Standby switchover target to become synchronized...
Tue Sep 06 14:15:08 2022
OCISessionBegin with PasswordVerifier succeeded
Client pid [10698] attached to RFS pid [21661] at remote instance number [1] at dest 'cjc2'
Active, synchronized Physical Standby switchover target has been identified
Tue Sep 06 14:15:10 2022
Process (ospid 10671) is suspended due to switchover to physical standby operation.
Switchover End-Of-Redo Log thread 1 sequence 21 has been fixed
Switchover: Primary highest seen SCN set to 0x0.0xf62e8
ARCH: Noswitch archival of thread 1, sequence 21
ARCH: End-Of-Redo Branch archival of thread 1 sequence 21
ARCH: Evaluating archive   log 3 thread 1 sequence 21
ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
ARCH: Beginning to archive thread 1 sequence 21 (988339-Infinity) (cjc1)
ARCH: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'cjc2' (thread 1 sequence 21) (cjc1)
ARCH: Transmitting activation ID 0xdfebe7b0
OCISessionBegin with PasswordVerifier succeeded
Client pid [21460] attached to RFS pid [21665] at remote instance number [1] at dest 'cjc2'
ARCH: Standby redo logfile selected for thread 1 sequence 21 for destination LOG_ARCHIVE_DEST_2
ARCH: Standby redo logfile selected for thread 1 sequence 21 for destination LOG_ARCHIVE_DEST_2
ARCH: Creating local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_21_1114613364.arc' (thread 1 sequence 21) (cjc1)
ARCH: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'cjc2' (cjc1)
ARCH: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_21_1114613364.arc' (cjc1)
Committing creation of archivelog '/arch/cjc_1_21_1114613364.arc'
Archived Log entry 25 added for thread 1 sequence 21 ID 0xdfebe7b0 dest 1:
Archived Log entry 26 added for thread 1 sequence 21 ID 0xdfebe7b0 dest 2:
ARCH: Completed archiving thread 1 sequence 21 (988339-1008360) (cjc1)
ARCH: Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received alls redo
Final check for a synchronized target standby. Check will be made once.
ARCH: Transmitting activation ID 0xdfebe7b0
LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target
Active, synchronized target has been identified
Target has also received all redo
-----------------------------------------------------------
|          Target Standby Status                           |
| LOG_ARCHIVE_DEST_1  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_2  : HAS RECEIVED ALL DATA          |
| LOG_ARCHIVE_DEST_3  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_4  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_5  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_6  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_7  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_8  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_9  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_10  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_11  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_12  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_13  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_14  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_15  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_16  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_17  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_18  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_19  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_20  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_21  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_22  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_23  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_24  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_25  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_26  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_27  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_28  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_29  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_30  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_31  : NOT ACTIVE                     |
------------------------------------------------------------
Backup controlfile written to trace file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_ora_21460.trc
Clearing standby activation ID 3756779440 (0xdfebe7b0)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
Archivelog for thread 1 sequence 21 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
Switchover: Complete - Database shutdown required
USER (ospid: 21460): terminating the instance
Instance terminated by USER, pid = 21460
Completed: alter database commit to switchover to physical standby with session shutdown
Shutting down instance (abort)
License high water mark = 6
Tue Sep 06 14:15:11 2022
Instance shutdown complete


启动实例


startup mount


9.备切主


su - oracle

sqlplus / as sysdba

show parameter instance_name

SELECT trim(DATABASE_ROLE) DBROLE FROM v$DATABASE;

DBROLE

----------------

PHYSICAL STANDBY

ALTER SYSTEM SET log_archive_trace=8191 sid='*';

执行命令后,原备库自动启动到mount


alter database commit to switchover to primary with session shutdown;

切换日志如下:


Tue Sep 06 14:18:05 2022
alter database commit to switchover to primary with session shutdown
ALTER DATABASE SWITCHOVER TO PRIMARY (cjc2)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Tue Sep 06 14:18:06 2022
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_pr00_20372.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Errors in file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_pr00_20372.trc:
ORA-16037: user requested cancel of managed recovery operation
Tue Sep 06 14:18:07 2022
Errors in file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_mrp0_20358.trc:
ORA-10877: error signaled in parallel recovery slave 
MRP0: Background Media Recovery process shutdown (cjc2)
Role Change: Canceled MRP
All dispatchers and shared servers shutdown
CLOSE: killing server sessions.
CLOSE: all sessions shutdown successfully.
Tue Sep 06 14:18:08 2022
SMON: disabling cache recovery
Backup controlfile written to trace file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_ora_20276.trc
SwitchOver after complete recovery through change 1008360
Online log /oradata/cjc/redo01.log: Thread 1 Group 1 was previously cleared
Online log /oradata/cjc/redo02.log: Thread 1 Group 2 was previously cleared
Online log /oradata/cjc/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 1008358
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary with session shutdown
Tue Sep 06 14:18:29 2022
ARC2: Becoming the 'no SRL' ARCH
Tue Sep 06 14:18:48 2022
idle dispatcher 'D000' terminated, pid = (17, 1)


重启实例


shutdown immediate;

startup;

9.切换后检查


set lin 200 pages 100
set lin 200 pages 100
col FLASHBACK_ON for a10
col current_scn for 99999999999999
col open_mode for a20
col SWITCHOVER_STATUS for a20
col PROTECTION_MODE for a20
select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- ---------- --------------------

1008736 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE RESOLVABLE GAP

ALTER SYSTEM SET log_archive_trace=0 sid='*';

10.新备库启动mrp


sqlplus / as sysdba
alter database open;
ALTER SYSTEM SET log_archive_trace=0 sid='*';
###recover managed standby database using current logfile disconnect from session;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;


从库


  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- ---------- --------------------

1009025 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES NO     READ ONLYNOT ALLOWED

     WITH APPLY

主库:


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- ---------- --------------------

1008800 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE TO STANDBY

主库:


SQL> insert into cjc.t1 values(2);

SQL> insert into cjc.t1 values(3);

SQL> commit;

从库:


SQL> select * from cjc.t1;

ID

----------

 1

 2

 3

回切:


主切备


alter database commit to switchover to physical standby with session shutdown; 

startup mount

备切主


alter database commit to switchover to primary with session shutdown;

shutdown immediate;

startup;

备库启动MRP


alter database open;

###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

图片


二:通过dg broker进行switchover


手动进行DG switchover,步骤有些麻烦,是否有更简单的方式呢,可以试试dg broker。


Oracle DataGuard Broker分为Client Side和Server Side。


Client Side可以通过EM和DGMGRL两种工具对服务端进行管理和维护。


Server side会有一个配置文件和一个后台进程叫Data Guard Broker monitor process(DMON)。


DMON:它是一个用来管理Broker的后台进程,这个进程负责本地数据库与standby数据库的DMON进程进行通讯,当主库上接收到一个请求的时候,它会协调其他数据库上的DMON进程处理相应的请求,比如switchover。


同时会更新本地系统中的配置文件,并与standby数据库上的DMON进程进行通信,更新Standby上的配置文件。


1.启用dg broker


主库、从库:


配置DG_BROKER_START参数


show parameter dg_broker_start;

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

dg_broker_start      boolean FALSE

启用dg_broker_start,启用后oracle会自动启动一个dmon进程


alter system set dg_broker_start = true;

对应日志:


Tue Sep 06 15:04:40 2022

ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH;

Tue Sep 06 15:04:40 2022

DMON started with pid=31, OS id=3206 

Starting Data Guard Broker (DMON)

Tue Sep 06 15:04:48 2022

INSV started with pid=32, OS id=3215 

2.调整监听文件


在监听文件中加入DGMGRL静态监听


主库


SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))

    )

  )

从库  


SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc2) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc2) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc2_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc2) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.138)(PORT = 1521))

    )

  )

监听状态


lsnrctl reload

lsnrctl status

主库:


Services Summary...

Service "cjc1" has 2 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGMGRL" has 1 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

The command completed successfully

从库:


Services Summary...

Service "cjc2" has 2 instance(s).

  Instance "cjc2", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc2", status READY, has 1 handler(s) for this service...

Service "cjc2_DGMGRL" has 1 instance(s).

  Instance "cjc2", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc2", status READY, has 1 handler(s) for this service...

The command completed successfully

3.配置broker


[oracle@cjc-db-01 ~]$ dgmgrl sys/oracle

DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.

Connected.

DGMGRL>

显示配置


DGMGRL> show configuration

ORA-16532: Data Guard broker configuration does not exist


Configuration details cannot be determined by DGMGRL

添加配置


DGMGRL> create configuration 'cjcdgbroker' as primary database is 'cjc1' connect identifier is cjc1;

Configuration "cjcdgbroker" created with primary database "cjc1"

显示配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database


Fast-Start Failover: DISABLED


Configuration Status:

DISABLED

新增备库配置


DGMGRL> add database 'cjc2' as connect identifier is 'cjc2' maintained as physical;

Database "cjc2" added

显示配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

DISABLED

启用配置


DGMGRL> enable configuration

Enabled.

显示配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

4.测试switchover


DGMGRL> switchover to cjc2

Performing switchover NOW, please wait...

Operation requires a connection to instance "cjc2" on database "cjc2"

Connecting to instance "cjc2"...

Connected.

New primary database "cjc2" is opening...

Operation requires startup of instance "cjc1" on database "cjc1"

Starting instance "cjc1"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc2"

检查配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

SQL检查


set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

新备:


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1052426 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES NO     READ ONLY WITH APPLY NOT ALLOWED

新主


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1052498 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE   SESSIONS ACTIVE

5.回切


DGMGRL> switchover to cjc1;

Performing switchover NOW, please wait...

Operation requires a connection to instance "cjc1" on database "cjc1"

Connecting to instance "cjc1"...

Connected.

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1"

检查配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

思考:配置了dg broker以后,是否还可以通过之前的SQL语句进行dg切换呢?


此时,通过SQL命令仍然可以进行切换


主切备


alter database commit to switchover to physical standby with session shutdown; 

startup mount

备切主


alter database commit to switchover to primary with session shutdown;

shutdown immediate;

startup;

备库启动MRP


alter database open;

###recover managed standby database using current logfile disconnect from session;

###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

可以使用,既然已经启动了dg broker,建议通过broker进行切换。


图片


三:通过dg broker进行自动failover


启动自动故障转移FSFO


Fast-Start Failove


FSFO允许代理在主库故障的情况下自动故障转移到先前选择的备库,无需手动执行任何步骤,以便快速可靠地恢复业务。


FSFO只能在代理配置中使用,并且只能通过DGMGRL或OEM进行配置。


FSFO支持在最高可用性与最高性能模式下使用。


最大可用性模式保证不会丢失任何数据,最高性能模式可保证丢失的数据量不超过FastStartFailoverLagLimit属性指定的数据量(单位为秒)。


思考:何时会进行FSFO?


配置完成后,FSFO将在以下情况下起作用:


1.主库与 Observer和目标备库 失联时间均超过FastStartFailoverThreshold属性配置阈值(单位为秒)

2.单实例数据库中主实例崩溃

3.RAC中所有主实例崩溃

4.shutdown abort主库

5.应用程序通过调用DBMS_DG.INITIATE_FS_FAILOVER函数启动FSFO

6.Oracle 错误:可以指定启动 FSFO 故障切换的 ORA 错误列表(默认为空)

Broker 可配置为遇到以下任一条件时启动FSFO


1.Datafile Offline    数据文件由于写错误脱机Yes

2.Corrupted Dictionary关键数据库对象的字典损坏,该状态可在数据库open时被检测到Yes

3.Corrupted Controlfile控制文件由于写入错误永久损坏Yes

4.Inaccessible Logfile由于IO错误,LGWR无法写入日志组的任何成员No

5.Stuck Archiver        由于设备已满或不可用,归档进程无法归档redo log

设置保护模式与LogXptMode


LOGXPTMODE属性在最大可用性模式下应为SYNC,在最大性能模式下应为ASYNC,主备库LOGXPTMODE设置必须相同


###DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;

DGMGRL> edit database cjc1 set property 'LogXptMode'='ASYNC'; 

Property "LogXptMode" updated

DGMGRL> edit database cjc2 set property 'LogXptMode'='ASYNC';

Property "LogXptMode" updated

设置FSFO阈值


如果主库与 Observer和目标备库 失联时间均超过FastStartFailoverThreshold属性配置阈值(单位为秒),将启动FSFO。


换句话说,FastStartFailoverThreshold表示,Observer和目标备库 在检测到主库不可用后、希望等多少秒才启动FSFO。再此期间,它们会尝试重连主库。


默认值为30秒,最小6秒。


DGMGRL> edit configuration set property faststartfailoverthreshold=20;

Property "faststartfailoverthreshold" updated

设置最大可接受延迟时间


FastStartFailoverLagLimit表示,自动故障转移允许的最大数据丢失量(单位为秒),仅在最高性能保护模式时才可使用。


FastStartFailoverLagLimit作为DG可接受的最大延迟时间,备库的apply lag只有在此限制之内,才允许FSFO。


如果无法维持已配置的数据丢失保证,主库上的redo生成将停止。


为了避免长时间停顿,Observer或者目标备库可能会在第一次记录到无法发生FSFO之后,允许主库继续生成redo。


此时若主库故障,将无法发生FSFO。


默认值为30秒,最小值为10秒。


DGMGRL> edit configuration set property faststartfailoverlaglimit=60;

设置自动恢复数据库属性


如果FastStartFailoverAutoReinstate属性设置为TRUE,在原主库故障修复后,会自动尝试将其恢复为新主库的备库。


DGMGRL> EDIT CONFIGURATION SET PROPERTY FASTSTARTFAILOVERAUTOREINSTATE=TRUE;

Property "faststartfailoverautoreinstate" updated

启动Observer


创建脚本


[oracle@cjc-db-02 dg]$ pwd

/home/oracle/dg

[oracle@cjc-db-02 dg]$ mkdir observer

[oracle@cjc-db-02 dg]$ cd observer/

vi observer.sh

nohup dgmgrl sys/oracle@cjc2 "start observer file=FSFO.dat">>fsfo.log 2>&1 &

执行脚本


[oracle@cjc-db-02 observer]$ sh observer.sh

启动FSFO,报错ORA-16651


DGMGRL> enable fast_start failover;

Error: ORA-16651: requirements not met for enabling fast-start failover

检查报错信息,发现主从库必须开启闪回


[oracle@cjc-db-02 observer]$ oerr ora 16651
16651, 0000, "requirements not met for enabling fast-start failover"
// *Cause:  The attempt to enable fast-start failover could not be completed
//          because one or more requirements were not met:
//          - The Data Guard configuration must be in either MaxAvailability
//            or MaxPerformance protection mode.
//          - The LogXptMode property for both the primary database and
//            the fast-start failover target standby database must be
//            set to SYNC if the configuration protection mode is set to
//            MaxAvailability mode.
//          - The LogXptMode property for both the primary database and
//            the fast-start failover target standby database must be
//            set to ASYNC if the configuration protection mode is set to
//            MaxPerformance mode.
//          - The primary database and the fast-start failover target standby
//            database must both have flashback enabled.
//          - No valid target standby database was specified in the primary
//            database FastStartFailoverTarget property prior to the attempt
//            to enable fast-start failover, and more than one standby
//            database exists in the Data Guard configuration.
// *Action: Retry the command after correcting the issue:
//          - Set the Data Guard configuration to either MaxAvailability
//            or MaxPerformance protection mode.
//          - Ensure that the LogXptMode property for both the primary
//            database and the fast-start failover target standby database
//            are set to SYNC if the configuration protection mode is set to
//            MaxAvailability.
//          - Ensure that the LogXptMode property for both the primary
//            database and the fast-start failover target standby database
//            are set to ASYNC if the configuration protection mode is set to
//            MaxPerformance.
//          - Ensure that both the primary database and the fast-start failover
//            target standby database have flashback enabled.
//          - Set the primary database FastStartFailoverTarget property to
//            the DB_UNIQUE_NAME value of the desired target standby database
//            and the desired target standby database FastStartFailoverTarget
//            property to the DB_UNIQUE_NAME value of the primary database.


启动闪回


alter database flashback on;

再次开启FSFO


DGMGRL> enable fast_start failover;

Enabled.

对应日志:


Tue Sep 06 16:31:43 2022

Fast-Start Failover (FSFO) has been enabled between:

  Primary = "cjc1"

  Standby = "cjc2"

Tue Sep 06 16:31:43 2022

FSFP started with pid=38, OS id=8869 

OCISessionBegin with PasswordVerifier succeeded

查看配置,Fast-Start Failover状态变成ENABLED了。


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

测试自动切换


主库


SQL> shutdown abort    

ORACLE instance shut down.

查看fsfo日志


[oracle@cjc-db-02 observer]$ tail -10f fsfo.log

16:38:41.35  Tuesday, September 06, 2022

Initiating Fast-Start Failover to database "cjc2"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "cjc2"

16:38:48.09  Tuesday, September 06, 2022

完成了自动切换


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

      Warning: ORA-16829: fast-start failover configuration is lagging


    cjc1 - (*) Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: ENABLED


Configuration Status:

WARNING

已完成自动切换


  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1159699 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   NOT ALLOWED

启动原主库


SQL> startup

ORACLE instance started.


Total System Global Area 1135747072 bytes

Fixed Size    2252544 bytes

Variable Size  754974976 bytes

Database Buffers  369098752 bytes

Redo Buffers    9420800 bytes

Database mounted.

ORA-16649: possible failover to another database prevents this database from being opened

检查错误


[oracle@cjc-db-01 ~]$ oerr ora 16449

16449, 00000, "incomplete redo thread enable operation"

// *Cause:  The switchover operation could not continue because it failed to

//          disable a thread that was left in an incomplete thread enable

//          state.

// *Action: Check alert log for more details.

启动数据库时报错ORA-16649,不需要处理,后台会自动执行闪回数据库,自动open实例。


SQL> set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


    CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1160947 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY SWITCHOVER PENDING

查看配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

原主库自动变为备库


告警日志


FLASHBACK DATABASE TO SCN 1160354
Flashback Restore Start
Flashback Restore Complete
Flashback Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 2 slaves
Flashback Media Recovery Log /arch/cjc_1_2_1114706326.arc
Flashback Media Recovery Log /arch/cjc_1_3_1114706326.arc
Tue Sep 06 16:53:25 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
  Mem# 0: /oradata/cjc/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0: /oradata/cjc/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0
  Mem# 0: /oradata/cjc/redo03.log
Incomplete Recovery applied until change 1160355 time 09/06/2022 16:50:59
Flashback Media Recovery Complete
Completed: FLASHBACK DATABASE TO SCN 1160354
alter database convert to physical standby
ALTER DATABASE CONVERT TO PHYSICAL STANDBY (cjc2)
Flush standby redo logfile failed:1649
Clearing standby activation ID 3756857555 (0xdfed18d3)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Clearing online redo logfile 1 /oradata/cjc/redo01.log
Clearing online log 1 of thread 1 sequence number 4
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /oradata/cjc/redo02.log
Clearing online log 2 of thread 1 sequence number 5
Tue Sep 06 16:53:28 2022
RFS[2]: Assigned to RFS process 10506
RFS[2]: Opened log for thread 1 sequence 6 dbid -538205520 branch 1114706326
Clearing online redo logfile 2 complete
Archived Log entry 87 added for thread 1 sequence 6 rlc 1114706326 ID 0xdfed18d3 dest 2:
Clearing online redo logfile 3 /oradata/cjc/redo03.log
Clearing online log 3 of thread 1 sequence number 6
Tue Sep 06 16:53:31 2022
RFS[3]: Assigned to RFS process 10503
RFS[3]: Opened log for thread 1 sequence 1 dbid -538205520 branch 1114707085
RFS[2]: Opened log for thread 1 sequence 2 dbid -538205520 branch 1114707085
Tue Sep 06 16:53:31 2022
Clearing online redo logfile 3 complete
A new recovery destination branch has been registered
RFS[3]: New Archival REDO Branch(resetlogs_id): 1114707085  Prior: 1114706326
RFS[3]: Archival Activation ID: 0xdfeddf20 Current: 0x0
RFS[3]: Effect of primary database OPEN RESETLOGS
RFS[3]: Incarnation entry added for Branch(resetlogs_id): 1114707085 (cjc2)
Tue Sep 06 16:53:31 2022
Setting recovery target incarnation to 4
Completed: alter database convert to physical standby
Archived Log entry 88 added for thread 1 sequence 1 rlc 1114707085 ID 0xdfeddf20 dest 2:
Killing 2 processes with pids 10503,10506 (all RFS) in order to disallow current and future RFS connections. Requested by OS process 10491
Archived Log entry 89 added for thread 1 sequence 2 rlc 1114707085 ID 0xdfeddf20 dest 2:
Tue Sep 06 16:53:33 2022
RFS[4]: Assigned to RFS process 10515
RFS[4]: Opened log for thread 1 sequence 3 dbid -538205520 branch 1114707085
Tue Sep 06 16:53:33 2022
Shutting down instance (immediate)
Shutting down instance: further logons disabled
Stopping background process MMNL
Archived Log entry 89 added for thread 1 sequence 3 rlc 1114707085 ID 0xdfeddf20 dest 2:
Stopping background process MMON
License high water mark = 5
All dispatchers and shared servers shutdown
alter database CLOSE NORMAL
ORA-1109 signalled during: alter database CLOSE NORMAL...
alter database DISMOUNT
Shutting down archive processes
Archiving is disabled
Completed: alter database DISMOUNT
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Shutting down Data Guard Broker processes
Tue Sep 06 16:53:39 2022
Completed: Data Guard Broker shutdown
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Tue Sep 06 16:53:41 2022
Stopping background process VKTM
Archiving is disabled
Tue Sep 06 16:53:43 2022
Instance shutdown complete
Tue Sep 06 16:53:44 2022
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 2
Number of processor cores in the system is 2
Number of processor sockets in the system is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
    NUMA status: non-NUMA system
    cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
    Grp 0: 
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
ORACLE_HOME = /oracle/app/oracle/product/11.2/db
System name:Linux
Node name:cjc-db-02
Release:5.4.17-2102.201.3.el7uek.x86_64
Version:#2 SMP Fri Apr 23 09:05:55 PDT 2021
Machine:x86_64
Using parameter settings in server-side spfile /oracle/app/oracle/product/11.2/db/dbs/spfilecjc2.ora
System parameters with non-default values:
  processes                = 150
  event                    = "28401 trace name context forever,level 1"
  event                    = "10949 trace name context forever,level 1"
  memory_target            = 1088M
  control_files            = "/oradata/cjc/control01.ctl"
  control_files            = "/oracle/app/oracle/fast_recovery_area/cjc/control02.ctl"
  db_file_name_convert     = "cjc1"
  db_file_name_convert     = "cjc2"
  log_file_name_convert    = "cjc1"
  log_file_name_convert    = "cjc2"
  control_file_record_keep_time= 31
  db_block_size            = 8192
  compatible               = "11.2.0.4.0"
  log_archive_dest_1       = "LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc2"
  log_archive_dest_2       = "service="cjc1""
  log_archive_dest_2       = "LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="cjc1" net_timeout=30"
  log_archive_dest_2       = "valid_for=(all_logfiles,primary_role)"
  log_archive_dest_state_1 = "ENABLE"
  log_archive_dest_state_2 = "ENABLE"
  log_archive_min_succeed_dest= 1
  fal_server               = "cjc1"
  log_archive_trace        = 0
  log_archive_config       = "DG_CONFIG=(cjc2,cjc1)"
  log_archive_format       = "cjc_%t_%s_%r.arc"
  log_archive_max_processes= 4
  archive_lag_target       = 0
  _use_adaptive_log_file_sync= "FALSE"
  db_files                 = 2048
  db_recovery_file_dest    = "/home/oracle/dg/recover"
  db_recovery_file_dest_size= 10G
  standby_file_management  = "AUTO"
  _cleanup_rollback_entries= 10000
  undo_tablespace          = "UNDOTBS1"
  _partition_large_extents = "FALSE"
  remote_login_passwordfile= "EXCLUSIVE"
  audit_sys_operations     = FALSE
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=cjcXDB)"
  parallel_execution_message_size= 32768
  _PX_use_large_pool       = TRUE
  result_cache_max_size    = 0
  audit_file_dest          = "/oracle/app/oracle/admin/cjc/adump"
  audit_trail              = "NONE"
  cell_offload_processing  = FALSE
  db_name                  = "cjc"
  db_unique_name           = "cjc2"
  open_cursors             = 300
  _optimizer_null_aware_antijoin= FALSE
  _b_tree_bitmap_plans     = FALSE
  _optimizer_extended_cursor_sharing= "NONE"
  _optimizer_extended_cursor_sharing_rel= "NONE"
  _optimizer_adaptive_cursor_sharing= FALSE
  deferred_segment_creation= FALSE
  _optimizer_use_feedback  = FALSE
  dg_broker_start          = TRUE
  diagnostic_dest          = "/oracle/app/oracle"
  max_dump_file_size       = "1024M"
Tue Sep 06 16:53:45 2022
PMON started with pid=2, OS id=10533 
Tue Sep 06 16:53:45 2022
PSP0 started with pid=3, OS id=10535 
Tue Sep 06 16:53:46 2022
VKTM started with pid=4, OS id=10538 at elevated priority
VKTM running at (1)millisec precision with DBRM quantum (100)ms
Tue Sep 06 16:53:46 2022
GEN0 started with pid=5, OS id=10543 
Tue Sep 06 16:53:46 2022
DIAG started with pid=6, OS id=10545 
Tue Sep 06 16:53:46 2022
DBRM started with pid=7, OS id=10547 
Tue Sep 06 16:53:46 2022
DIA0 started with pid=8, OS id=10549 
Tue Sep 06 16:53:46 2022
MMAN started with pid=9, OS id=10551 
Tue Sep 06 16:53:46 2022
DBW0 started with pid=10, OS id=10553 
Tue Sep 06 16:53:46 2022
LGWR started with pid=11, OS id=10555 
Tue Sep 06 16:53:46 2022
CKPT started with pid=12, OS id=10557 
Tue Sep 06 16:53:46 2022
SMON started with pid=13, OS id=10559 
Tue Sep 06 16:53:46 2022
RECO started with pid=14, OS id=10561 
Tue Sep 06 16:53:46 2022
MMON started with pid=15, OS id=10563 
Tue Sep 06 16:53:46 2022
MMNL started with pid=16, OS id=10565 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
Tue Sep 06 16:53:46 2022
DMON started with pid=19, OS id=10571 
ORACLE_BASE from environment = /oracle/app/oracle
Tue Sep 06 16:53:46 2022
alter database  mount
ARCH: STARTING ARCH PROCESSES
Tue Sep 06 16:53:51 2022
ARC0 started with pid=21, OS id=10582 
ARC0: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Tue Sep 06 16:53:52 2022
ARC1 started with pid=22, OS id=10585 
Tue Sep 06 16:53:52 2022
ARC2 started with pid=23, OS id=10587 
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC2: Thread not mounted
Tue Sep 06 16:53:52 2022
ARC3 started with pid=24, OS id=10589 
ARC1: Becoming the heartbeat ARCH
ARC1: Thread not mounted
Successful mount of redo thread 1, with mount id 3756855130
Allocated 8388608 bytes in shared pool for flashback generation buffer
Starting background process RVWR
Tue Sep 06 16:53:52 2022
RVWR started with pid=25, OS id=10591 
Physical Standby Database mounted.
Lost write protection disabled
ARC1: Becoming the active heartbeat ARCH
Completed: alter database  mount
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Starting Data Guard Broker (DMON)
Tue Sep 06 16:53:55 2022
INSV started with pid=26, OS id=10597 
Tue Sep 06 16:54:01 2022
NSV0 started with pid=27, OS id=10607 
Tue Sep 06 16:54:04 2022
Using STANDBY_ARCHIVE_DEST parameter default value as /arch
Tue Sep 06 16:54:04 2022
RSM0 started with pid=29, OS id=10615 
Tue Sep 06 16:54:04 2022
RFS[1]: Assigned to RFS process 10617
RFS[1]: Opened log for thread 1 sequence 2 dbid -538205520 branch 1114707085
Archived Log entry 90 added for thread 1 sequence 2 rlc 1114707085 ID 0xdfeddf20 dest 2:
RFS[1]: Opened log for thread 1 sequence 4 dbid -538205520 branch 1114707085
Archived Log entry 91 added for thread 1 sequence 4 rlc 1114707085 ID 0xdfeddf20 dest 2:
RFS[1]: Selected log 4 for thread 1 sequence 5 dbid -538205520 branch 1114707085
Tue Sep 06 16:54:06 2022
Primary database is in MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process 10621
RFS[2]: Selected log 5 for thread 1 sequence 6 dbid -538205520 branch 1114707085
Tue Sep 06 16:54:06 2022
Archived Log entry 92 added for thread 1 sequence 5 ID 0xdfeddf20 dest 1:
Data Guard: Failover target was a Real Time Query standby; attempting to open this standby after reinstatement ...
ALTER DATABASE OPEN READ ONLY
Data Guard Broker initializing...
Data Guard Broker initialization complete
Tue Sep 06 16:54:07 2022
SMON: enabling cache recovery
Dictionary check beginning
Dictionary check complete
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Physical standby database opened for read only access.
Completed: ALTER DATABASE OPEN READ ONLY
Tue Sep 06 16:54:08 2022
db_recovery_file_dest_size of 10240 MB is 0.49% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='cjc2';
ALTER SYSTEM SET log_archive_format='cjc_%t_%s_%r.arc' SCOPE=SPFILE SID='cjc2';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='cjc1','cjc2' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='cjc1','cjc2' SCOPE=SPFILE;
ALTER SYSTEM SET fal_server='cjc1' SCOPE=BOTH;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (cjc2)
Tue Sep 06 16:54:09 2022
MRP0 started with pid=33, OS id=10629 
MRP0: Background Managed Standby Recovery process started (cjc2)
 started logmerger process
Tue Sep 06 16:54:15 2022
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 1160356
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /arch/cjc_1_6_1114706326.arc
Identified End-Of-Redo (failover) for thread 1 sequence 6 at SCN 0x0.11b4a4
Tue Sep 06 16:54:15 2022
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Resetting standby activation ID 3756857555 (0xdfed18d3)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Media Recovery Log /arch/cjc_1_1_1114707085.arc
Archived Log entry 93 added for thread 1 sequence 6 ID 0xdfeddf20 dest 1:
Tue Sep 06 16:54:16 2022
Primary database is in MAXIMUM PERFORMANCE mode
Media Recovery Log /arch/cjc_1_2_1114707085.arc
RFS[3]: Assigned to RFS process 10651
RFS[3]: Selected log 4 for thread 1 sequence 7 dbid -538205520 branch 1114707085
Media Recovery Log /arch/cjc_1_3_1114707085.arc
Media Recovery Log /arch/cjc_1_4_1114707085.arc
Media Recovery Log /arch/cjc_1_5_1114707085.arc
Media Recovery Log /arch/cjc_1_6_1114707085.arc
Media Recovery Waiting for thread 1 sequence 7 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 7 Reading mem 0
  Mem# 0: /oradata/cjc/standby_redo04.log


思考:


主库shutdown immediate会自动切换吗?


主库:


SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

没有自动切换


过10min后启动数据库


startup


检查当前配置信息


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

SQL>

 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1164197 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   TO STANDBY

思考:


启动FSFO后还可以通过之前的SQL命令进行主备切换吗?


主切备


alter database commit to switchover to physical standby with session shutdown; 

startup mount

备切主,报错ORA-16109


alter database commit to switchover to primary with session shutdown;

ORA-16109: failed to apply log data from previous primary

[oracle@cjc-db-02 ~]$ oerr ora 16109

16109, 00000, "failed to apply log data from previous primary"

// *Cause:  Log data from previous primary could not be completely applied.

// *Action: Check DBA_LOGSTDBY_EVENTS for failures and take corrective

//          action.  Then, reissue command

检查fsfo日志


[oracle@cjc-db-02 observer]$ tail -10f fsfo.log

17:41:12.95  Tuesday, September 06, 2022

Initiating Fast-Start Failover to database "cjc2"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "cjc2"

17:41:22.02  Tuesday, September 06, 2022

备库自动切成主库了


set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1185108 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   NOT ALLOWED

备库需要手动启动MRP


alter database open;

###recover managed standby database using current logfile disconnect from session;

###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

检查当前配置,显示需要重建备库


[oracle@cjc-db-01 ~]$ dgmgrl sys/oracle

DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production


Copyright (c) 2000, 2009, Oracle. All rights reserved.


Welcome to DGMGRL, type "help" for information.

Connected.

DGMGRL> show configuration;

ORA-16795: the standby database needs to be re-created


Configuration details cannot be determined by DGMGRL

尝试回切


DGMGRL> switchover to cjc1;

ORA-16795: the standby database needs to be re-created


Configuration details cannot be determined by DGMGRL



DGMGRL> show configuration;


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

      Warning: ORA-16829: fast-start failover configuration is lagging


    cjc1 - (*) Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: ENABLED


Configuration Status:

WARNING

备库提示


DGMGRL> show configuration;

ORA-16795: the standby database needs to be re-created


Configuration details cannot be determined by DGMGRL

检查cjc1

DGMGRL> show database cjc1;


Database - cjc1


  Role:            PHYSICAL STANDBY

  Intended State:  APPLY-ON

  Transport Lag:   (unknown)

  Apply Lag:       (unknown)

  Apply Rate:      (unknown)

  Real Time Query: OFF

  Instance(s):

    cjc1


Database Status:

ORA-16661: the standby database needs to be reinstated


检查cjc2

DGMGRL> show database cjc2;


Database - cjc2


  Role:            PRIMARY

  Intended State:  TRANSPORT-ON

  Instance(s):

    cjc2


  Database Warning(s):

    ORA-16829: fast-start failover configuration is lagging


Database Status:

WARNING

查看参数


备库


DGMGRL> show database verbose cjc1;
Database - cjc1
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   (unknown)
  Apply Lag:       (unknown)
  Apply Rate:      (unknown)
  Real Time Query: OFF
  Instance(s):
    cjc1
  Properties:
    DGConnectIdentifier             = 'cjc1'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'async'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'cjc2, cjc1'
    LogFileNameConvert              = 'cjc2, cjc1'
    FastStartFailoverTarget         = 'cjc2'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    SidName                         = 'cjc1'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=cjc-db-01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=cjc1_DGMGRL)(INSTANCE_NAME=cjc1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = '/arch'
    AlternateLocation               = ''
    LogArchiveTrace                 = '8191'
    LogArchiveFormat                = 'cjc_%t_%s_%r.arc'
    TopWaitEvents                   = '(monitor)'
Database Status:
ORA-16661: the standby database needs to be reinstated


主库


DGMGRL> show database verbose cjc2;
Database - cjc2
  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    cjc2
  Database Warning(s):
    ORA-16829: fast-start failover configuration is lagging
  Properties:
    DGConnectIdentifier             = 'cjc2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'async'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'cjc1, cjc2'
    LogFileNameConvert              = 'cjc1, cjc2'
    FastStartFailoverTarget         = 'cjc1'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    SidName                         = 'cjc2'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=cjc-db-02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=cjc2_DGMGRL)(INSTANCE_NAME=cjc2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = '/arch'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = 'cjc_%t_%s_%r.arc'
    TopWaitEvents                   = '(monitor)'
Database Status:
WARNING
主从不同步了,归档收不到



 主库

 DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID    12    0    10 DISABLE

 2DEFERRED     0    0    1       30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

 

从库

SQL> col dest_name for a30

col error for a20

set lin 200 pages 100

col applied_scn for 9999999999999

select dest_id,error,status,log_sequence,applied_scn,MAX_CONNECTIONS,NET_TIMEOUT,COMPRESSION from v$archive_dest where dest_id<5;SQL> SQL> SQL> SQL> 


   DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID     0    0    10 DISABLE

 2VALID     0    0    1       30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

主库 


SQL> show parameter log_archive_dest_state_2


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

log_archive_dest_state_2     string RESET

重新激活


SQL> alter system set log_archive_dest_state_2=defer;

System altered.


SQL> alter system set log_archive_dest_state_2=enable;

System altered.

归档可以正常接收,主库新增数据可以正常同步到从库。


但是dgmgr显示还是不对


ORA-16829: fast-start failover configuration is lagging


ORA-16661: the standby database needs to be reinstated


重建备库


DGMGRL> reinstate database cjc1;  

Reinstating database "cjc1", please wait...

Reinstatement of database "cjc1" succeeded

查看备库对应告警日志


Wed Sep 07 14:04:45 2022
NSV1 started with pid=38, OS id=3216 
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:04:49 2022
RSM0 started with pid=39, OS id=3223 
Killing 4 processes with pids 2641,2710,2652,2654 (all RFS) in order to reinstate the database after a failover. Requested by OS process 3223
Data Guard: Stopping apply to check viability of standby
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Wed Sep 07 14:04:54 2022
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_pr00_2725.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 1186859
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_pr00_2725.trc:
ORA-16037: user requested cancel of managed recovery operation
Wed Sep 07 14:04:54 2022
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_mrp0_2719.trc:
ORA-10877: error signaled in parallel recovery slave 
MRP0: Background Media Recovery process shutdown (cjc1)
Managed Standby Recovery Canceled (cjc1)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
krsk_ler_purge_scn: Purged kccle 4 (next SCN 65535:-1)
ALTER SYSTEM SET log_archive_trace=8191 SCOPE=BOTH SID='cjc1';
ALTER SYSTEM SET log_archive_format='cjc_%t_%s_%r.arc' SCOPE=SPFILE SID='cjc1';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='cjc2','cjc1' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='cjc2','cjc1' SCOPE=SPFILE;
ALTER SYSTEM SET fal_server='cjc2' SCOPE=BOTH;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (cjc1)
Wed Sep 07 14:04:56 2022
MRP0 started with pid=22, OS id=3233 
MRP0: Background Managed Standby Recovery process started (cjc1)
 started logmerger process
Wed Sep 07 14:05:02 2022
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
krss_find_arc: Selecting ARC2 to receive message as last resort
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Waiting for thread 1 sequence 17
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:02 2022
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Wed Sep 07 14:05:03 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[6]: Assigned to RFS process 3248
RFS[6]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 2512
Primary database is in MAXIMUM PERFORMANCE mode
RFS[6]: Begin archive primary thread 1 sequence 18 (cjc1)
Primary database is in MAXIMUM PERFORMANCE mode
RFS[6]: Successfully opened standby log 4: '/oradata/cjc/standby_redo04.log'
RFS[6]: Selected log 4 for thread 1 sequence 18 dbid -538205520 branch 1114707085
Wed Sep 07 14:05:12 2022
Fetching gap sequence in thread 1, gap sequence 17-17
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:23 2022
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:33 2022
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:43 2022
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:44 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[7]: Assigned to RFS process 3288
RFS[7]: Identified database type as 'physical standby': Client is ARCH pid 2506
Wed Sep 07 14:05:44 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[8]: Assigned to RFS process 3290
RFS[8]: Identified database type as 'physical standby': Client is ARCH pid 2503
RFS[8]: Begin archive primary thread 1 sequence 17 (cjc1)
RFS[8]: Successfully opened standby log 5: '/oradata/cjc/standby_redo05.log'
RFS[8]: Selected log 5 for thread 1 sequence 17 dbid -538205520 branch 1114707085
RFS[8]: Completed archive log 0 thread 1 sequence 17 (cjc1)
Wed Sep 07 14:05:44 2022
ARC3: Evaluating archive   log 5 thread 1 sequence 17
ARC3: Beginning to archive thread 1 sequence 17 (1186140-1186874) (cjc1)
ARC3: Creating local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_17_1114707085.arc' (thread 1 sequence 17) (cjc1)
ARC3: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_17_1114707085.arc' (cjc1)
Committing creation of archivelog '/arch/cjc_1_17_1114707085.arc'
Archived Log entry 127 added for thread 1 sequence 17 ID 0xdfed0f5a dest 1:
ARC3: Completed archiving thread 1 sequence 17 (0-0) (cjc1)
Wed Sep 07 14:05:45 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[9]: Assigned to RFS process 3294
RFS[9]: Identified database type as 'physical standby': Client is ARCH pid 2510
Media Recovery Log /arch/cjc_1_17_1114707085.arc
RFS[9]: Begin archive primary thread 1 sequence 17 (cjc1)
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_rfs_3294.trc:
ORA-16401: archive log rejected by Remote File Server (RFS)
Media Recovery Waiting for thread 1 sequence 18 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 18 Reading mem 0
  Mem# 0: /oradata/cjc/standby_redo04.log


再次检查,恢复正常


DGMGRL> show configuration;


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS


回切

DGMGRL> switchover to cjc1;

Performing switchover NOW, please wait...

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1

总结:


在使用了DGMGRL来管理Dataguard时,不要再使用SQLPLUS命令行来管理了,所有的操作做好在DGMGRL下进行修改配置,否则会造成参数冲突,状态异常。


图片


四:通过keepalived进行vip自动切换


既然DG可以自动切换了,如何保障主库出现异常时,应用中断时间最短呢,或者尽量避免人为干预?


这需要VIP或一个域名,应用连接数据库连接的是VIP或域名,当主库异常时,通过 FSFO 进行自动主备切换,VIP地址通过keepalived自动切换,切换完成后,应用仍然可以通过VIP进行访问数据库。


1.安装keepalived


[root@cjc-db-02 ~]# cd /soft/

[root@cjc-db-02 soft]# tar -zxvf keepalived-2.0.15.tar.gz 

[root@cjc-db-02 soft]# cd keepalived-2.0.15/

[root@cjc-db-02 keepalived-2.0.15]# ./configure  --prefix=/usr/local/keepalived

报错


configure: error: 

  !!! OpenSSL is not properly installed on your system. !!!

  !!! Can not include OpenSSL headers files.            !!!

解决


[root@cjc-db-01 keepalived-2.0.15]# yum install openssl openssl-devel

编译


make && make install

echo $?

将命令复制到/usr/sbin里:


cp /usr/local/keepalived/sbin/keepalived /usr/sbin/

cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/

配置keepalived,VIP地址为172.16.6.150


节点1  172.16.6.137


vi /etc/keepalived/keepalived.conf

! Configuration File for keepalived



vrrp_script chk_dg_stats {  

    script "/etc/keepalived/check_dataguard.sh" 

    interval 2

    weight -5

    fall 2  

    rise 1  

}

       

vrrp_instance VI_1 {

    state MASTER    

    interface enp0s3 

    mcast_src_ip 172.16.6.137

    virtual_router_id 100 

    priority 100

    inopreempt

    advert_int 1         

    authentication {   

        auth_type PASS 

        auth_pass 888888   

    }

    virtual_ipaddress {    

        172.16.6.150

    }

      

    track_script {               

       chk_dg_stats             

    }

}

节点2 172.16.6.138


[root@test05 ~]# cat /etc/keepalived/keepalived.conf

! Configuration File for keepalived



vrrp_script chk_dg_stats {  

    script "/etc/keepalived/check_dataguard.sh" 

    interval 2

    weight -5

    fall 2  

    rise 1  

}

       

vrrp_instance VI_1 {

    state BACKUP

    interface enp0s3 

    mcast_src_ip 172.16.6.138

    virtual_router_id 100 

    priority 100 

    inopreempt

    advert_int 1         

    authentication {   

        auth_type PASS 

        auth_pass 888888   

    }

    virtual_ipaddress {    

        172.16.6.150

    }

      

    track_script {               

       chk_dg_stats             

    }

}

检查脚本


vi /etc/keepalived/check_dataguard.sh 

#!/bin/bash

dbstats=`ps -ef | grep ora_smon | grep -v grep | wc -l`

dgstats=`ps -ef | grep ora_mrp | grep -v grep | wc -l`


if [ "${dbstats}" -eq 0  ]; then

systemctl stop keepalived.service

#elif [[ "${dbstats}" -gt 0 ]] && [[ "${dgstats}" -gt 0 ]]; then

#systemctl stop keepalived.service

fi 

添加执行权限


chmod a+x /etc/keepalived/check_dataguard.sh

启动keepalived


[root@cjc-db-01 keepalived]# systemctl start keepalived.service 

[root@cjc-db-01 keepalived]# systemctl status keepalived.service 

?.keepalived.service - LVS and VRRP High Availability Monitor

   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)

   Active: active (running) since Wed 2022-09-07 15:28:01 CST; 6s ago

  Process: 16728 ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)

 Main PID: 16730 (keepalived)

    Tasks: 2

   CGroup: /system.slice/keepalived.service

           ?..16730 /usr/local/keepalived/sbin/keepalived -D

           ?..16731 /usr/local/keepalived/sbin/keepalived -D


Sep 07 15:28:01 cjc-db-01 Keepalived_vrrp[16731]: VRRP_Script(chk_dg_stats) succeeded

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Receive advertisement timeout

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Entering MASTER STATE

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) setting VIPs.

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Sending/queueing gratuitous ARPs on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

检查vip


ip a

2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:29:44:a2 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.137/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

测试VIP漂移


[root@cjc-db-01 keepalived]# systemctl stop keepalived.service 

[root@cjc-db-02 keepalived]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

    inet6 ::1/128 scope host 

       valid_lft forever preferred_lft forever

2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed 

       valid_lft forever preferred_lft forever

    inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

禁用自动切换


DGMGRL> disable fast_start failover;

Disabled.

DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

配置监听


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc1


SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 15:33:18 2022


Copyright (c) 1982, 2013, Oracle.  All rights reserved.


ERROR:

ORA-12541: TNS:no listener

[oracle@cjc-db-01 admin]$ cat listener.ora

SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))

    )

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.150)(PORT = 1521))

    )

  )

配置服务


SQL> show parameter name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

cell_offloadgroup_name     string

db_file_name_convert     string cjc2, cjc1

db_name      string cjc

db_unique_name     string cjc1

global_names     boolean FALSE

instance_name     string cjc1

lock_name_space      string

log_file_name_convert     string cjc2, cjc1

processor_group_name     string

service_names     string cjc1


SQL> alter system set service_names='cjc1,cjc' scope=both;


System altered.


SQL> show parameter service_name


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

service_names     string cjc1,cjc


从库

alter system set service_names='cjc2,cjc' scope=both;

重启监听


lsnrctl reload

lsnrctl stop

lsnrctl start

查看监听状态


[oracle@cjc-db-01 admin]$ lsnrctl status


LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 07-SEP-2022 15:45:15


Copyright (c) 1991, 2013, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.6.137)(PORT=1521)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production

Start Date                07-SEP-2022 15:44:08

Uptime                    0 days 0 hr. 1 min. 7 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/app/oracle/product/11.2/db/network/admin/listener.ora

Listener Log File         /oracle/app/oracle/diag/tnslsnr/cjc-db-01/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.137)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.150)(PORT=1521)))

Services Summary...

Service "cjc1" has 2 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGMGRL" has 1 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

The command completed successfully

连接测试


SQL> conn system/oracle@172.16.6.137:1521/cjc1

Connected.

SQL> conn system/oracle@172.16.6.150:1521/cjc1

Connected.

监听文件


[oracle@cjc-db-01 admin]$ cat listener.ora

SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))

    )

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.150)(PORT = 1521))

    )

  )

  

  

测试

SQL> conn system/oracle@172.16.6.137:1521/cjc1

Connected.

SQL> conn system/oracle@172.16.6.137:1521/cjc

Connected.

SQL> conn system/oracle@172.16.6.150:1521/cjc1

Connected.

SQL> conn system/oracle@172.16.6.150:1521/cjc

Connected.

SQL> conn system/oracle@172.16.6.138:1521/cjc2

Connected.

SQL> conn system/oracle@172.16.6.138:1521/cjc

Connected.


监听状态

SQL> ho lsnrctl status


LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 07-SEP-2022 16:05:22


Copyright (c) 1991, 2013, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.6.137)(PORT=1521)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production

Start Date                07-SEP-2022 15:44:08

Uptime                    0 days 0 hr. 21 min. 13 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/app/oracle/product/11.2/db/network/admin/listener.ora

Listener Log File         /oracle/app/oracle/diag/tnslsnr/cjc-db-01/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.137)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.150)(PORT=1521)))

Services Summary...

Service "cjc" has 2 instance(s).

  Instance "cjc", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1" has 2 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGMGRL" has 1 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

The command completed successfully

测试VIP切换


停主库


SQL> shutdown immediate

查看VIP已经飘到从库


ip a

: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

连接测试


SQL> conn system/oracle@172.16.6.150:1521/cjc

Connected.


sqlplus system/oracle@172.16.6.150:1521/cjc

set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1219061 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY NOT ALLOWED

只是VIP飘了,数据库角色没有变


需要执行手动切换 或 配置自动切换后才能进行使用


通过dg broker进行手动切换


DGMGRL> switchover to cjc2;

Performing switchover NOW, please wait...

Operation requires a connection to instance "cjc2" on database "cjc2"

Connecting to instance "cjc2"...

Connected.

New primary database "cjc2" is opening...

Operation requires startup of instance "cjc1" on database "cjc1"

Starting instance "cjc1"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc2"

检查,切换成功


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc

SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 16:28:14 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> show parameter name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

cell_offloadgroup_name     string

db_file_name_convert     string cjc1, cjc2

db_name      string cjc

db_unique_name     string cjc2

global_names     boolean FALSE

instance_name     string cjc2

lock_name_space      string

log_file_name_convert     string cjc1, cjc2

processor_group_name     string

service_names     string cjc2,cjc

通过VIP连接已经是cjc2了


CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1240051 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

回切


回切前需要先启动从库的keepalived服务


[root@cjc-db-01 keepalived]# systemctl start keepalived.service

开始回切


DGMGRL> switchover to cjc1

Performing switchover NOW, please wait...

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1"

通过VIP连接数据库


SQL> show parameter instance_name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc1


sqlplus system/oracle@172.16.6.150:1521/cjc


    CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1260596 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

测试 FSFO+keepalived自动切换


启动自动切换


DGMGRL> enable fast_start failover;

DGMGRL> show configuration


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

主库


enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:29:44:a2 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.137/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

主库执行非正常关库操作


SQL> shutdown abort

ORACLE instance shut down.

查看fsfo.log日志,成功切换到cjc2


16:38:20.23  Wednesday, September 07, 2022

Initiating Fast-Start Failover to database "cjc2"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "cjc2"

16:38:24.18  Wednesday, September 07, 2022

vip 已经切换到138上


enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed 

       valid_lft forever preferred_lft forever

    inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

主库也是切换到138 cjc2上


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

      Warning: ORA-16829: fast-start failover configuration is lagging


    cjc1 - (*) Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: ENABLED


Configuration Status:

WARNING

恢复原主库137 cjc1


SQL> startup

ORACLE instance started.


Total System Global Area 1135747072 bytes

Fixed Size    2252544 bytes

Variable Size  754974976 bytes

Database Buffers  369098752 bytes

Redo Buffers    9420800 bytes

Database mounted.

ORA-16649: possible failover to another database prevents this database from

being opened

报错可以忽略,后台自动执行闪回数据库,重新open数据库


可以看到已经加会到备库


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

普通关闭数据库,理论上主备角色是不会自动切换的,VIP会发生漂移


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc

SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 16:44:40 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options



SQL> show parameter instance_name


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc2

启动keepalived


[root@cjc-db-01 keepalived]# systemctl start keepalived.service 

主库关闭实例


查看IP


enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed 

       valid_lft forever preferred_lft forever

    inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

主从角色并没有自动切换


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

ORA-01034: ORACLE not available

ORA-16625: cannot reach database "cjc2"

DGM-17017: unable to determine configuration status

但是VIP进行切换了


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc

SQL> show parameter instance_name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc1


set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1261913 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY SWITCHOVER PENDING

手动切换


启动主库实例


startup

启动主库keepalived


systemctl start keepalived.service

重启从库keepalived,目的是将VIP切换到主库


[root@cjc-db-01 keepalived]# systemctl stop keepalived.service 

[root@cjc-db-01 keepalived]# systemctl start keepalived.service 

关闭自动切换


DGMGRL> disable fast_start failover;

Disabled.

DGMGRL>  show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCES

回切


DGMGRL> switchover to cjc1

Performing switchover NOW, please wait...

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1"

DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc


SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 17:29:02 2022


Copyright (c) 1982, 2013, Oracle.  All rights reserved.



Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options



SQL> show parameter instance_name


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc1



    CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1304918 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

图片


来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29785807/viewspace-2914268/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
Oracle ACE、OCMU 用户组成员、Oracle 11g OCM、微信公众号"IT小Chen"

注册时间:2014-08-05

  • 博文量
    686
  • 访问量
    1903350