Thursday, January 1, 2015

Standby - Switchover Failover

Switchover Work Plan

 
·         Make sure Primary database is  good and standby nodes are up and recovering (see SQL below)
·         Check Backups on Both Primary/Standby
·         Run check Configuration( data guard) on primary or standby ( use Grid)
·         Manually check Data guard Configurations ( use DGMGRL on all nodes )
·         Open all  Node’s alert logs in a Linux vi editor and keep tailing
·         Make sure you have all ip addresses handy ( use only –v  ie virtual ips)
·         Make sure Web Admin has stopped all pools
·         Make sure data Admin  has stopped all Jobs
·         Either Restart Primary or  Kill all users and Lock if needed ( Only application not Oracle related)
·         Turnoff One Node from Each Cluster ( use srvctl stop instance -i [instance name] -d [db unique Name]
·         Perform switchover ( use DGMGRL ) from Standby Host
·         Watch all  alert logs  & Data Guard  logs
·         Once switchover is complete  ask Network Team to change the DNS
·         Once Switchover is done check that all services are good and you can connect
·         Unlock uses that you have locked earlier (If Any)
·         Ask  Web Admin to start testing
·         Once testing is done web Admin can now enable pools
·         Once all pools are active and live  check sessions in pweb you should have web machines showing up in the oracle session( select * from gv$session; )
·         Turn on Monitoring
·         Check Streams , Data Replication, Failed jobs etc
·         Check Backup scripts and Grid Targets etc
 
 
Task: Switchover to a Physical Standby Database –  Using Broker
 
Step1: Connect to ALL Nodes Both Primary and Standby and perform the following
 
 
Example
oramo51-pweb1 /home/oracle> dgmgrl
DGMGRL for Linux: Version 11.2.0.2.0 - 64bit Production
 
Copyright (c) 2000, 2009, Oracle. All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected.
 
 
DGMGRL> show configuration verbose
 
Configuration - pwebb.oracle.cfx
 
  Protection Mode: MaxPerformance
  Databases:
    pwebb - Primary database
    pweb  - Physical standby database
    pwebc - Physical standby database
 
  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS
 
Step2: Go to Nodes& shutdown 2nd node from each
 
 
n  As Oracle Run this from oramo52
srvctl stop instance -i pweb2 -d pwebb
 
n  As Oracle run this from orac02
srvctl stop instance -i pweb2 -d pwebc
 
 
Step3: Go to Standby  Instance & Issue Switchover
 
DGMGRL > switchover to  “USE DB UniqueNM example pwebc”


Step3:  Use SQLPLUS Restart the  new Standby in MOUNT Mode and Watch alertlog
 
Sqlplus  / as sysdba
Shutdown immediate
Startup mount
 
 
 Task: Switchover to a Physical Standby Database – Not using  Broker
 
n  First Identify which Node is running primary and which is running Standby Database
 
set linesize 100
select DBID,NAME, substr(DB_UNIQUE_NAME,1,10) unq_name, OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, DATAGUARD_BROKER
from v$database;
 
n  Check Standby Status First
 
select max(sequence#) from gv$log_history;
 
select SEQUENCE# LAST_SEQ,(sysdate - FIRST_TIME)*24*60  Min_behind,
       to_char(FIRST_TIME,'dd/mm/yy HH:MI:ss am') STANDBY_CURRENT_AS_OF,
       COMPLETION_TIME RECEIVED_ON,applied
from V$ARCHIVED_LOG where sequence# = ( select max(SEQUENCE#)
                                        from V$ARCHIVED_LOG where APPLIED='YES'
                                      );
 
-- Stby
----- ---------
SELECT THREAD#,MAX(SEQUENCE#),to_char(max(first_time),'dd-Mon-yy HH:mi-AM') DateWhen,applied
FROM V$ARCHIVED_LOG
GROUP BY THREAD#,applied;
 
 
Step 1: Verify that the primary database can be switched to the standby role.
 
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the primary database.
For example:
 
Primary SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
 
SWITCHOVER_STATUS
-----------------
TO STANDBY
1 row selected
 
A value of TO STANDBY or SESSIONS ACTIVE indicates that the primary database can be switched to the standby role. If neither of these values is returned, a switchover is not possible because redo transport is either misconfigured or is not functioning
properly.
 
Primary SQL> ALTER SYSTEM SRCHIVE LOG ALL;
 
Now Shutdown the All Node on Primary Database (shutdown immediate) and Restart Just One Node in Open Mode
 
Step 2: Verify that the Standby database can be switched to the Primary role.
            Note: Its ok if it says Not Allowed because later steps we will transfer arch logs to stby in stp5/6
 
 Primary SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
 
SWITCHOVER_STATUS
-----------------
TO PRIMARY
1 row selected
 
If  above returns “NOT ALLOWED” means last redo may not have reached to Standby
 
Or
 
The NOT ALLOWED means :--- Either this is a standby database and the primary database has not been switched first or this is a primary database and there are no standby databases.
 
 
Ensure that Redo Apply is active at the target standby database.  
 
Primary SQL> ALTER SYSTEM FLUSH REDO TO ‘Standby_database_db_unique_name’;
 
 
Check the Standby node again for the switchover status.
 
 
Step 3:  Turn Off All  RAC Instances – and Restart Just One.
 
   srvrsctl stop database <>
   Srvctl start instance …
 
 
Step 4 :  Turn Off Jobs and Job Queue processes
 
 
Step 5: Initiate the switchover on the primary database.
 
Issue the following SQL statement on the primary database to switch it to the standby role:
 
Primary SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
 
This statement converts the primary database into a physical standby database. The current control file is backed up to the current SQL session trace file before the switchover. This makes it possible to reconstruct a current control file, if necessary.
 
Step 6: Shut down and then mount the former primary database.
 
Primary SQL> SHUTDOWN ABORT;
Primary SQL> STARTUP MOUNT;
At this point in the switchover process, the original primary database is a physical standby database
 
Step 7: Verify that the switchover target (Standby) is ready to be switched to the primary role.
 
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the standby database.
 
For example:
Standby SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
 
SWITCHOVER_STATUS
-----------------
TO_PRIMARY
1 row selected
 
If you see “NOT ALLOWED “ Issue the following SQL statement at the old primary database (can be Mount mode)
 
Primary SQL> ALTER SYSTEM FLUSH REDO TO ‘Standby_database_db_unique_name’;
 
A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is returned, verify that Redo Apply is active and that redo transport is configured and working properly.
Continue to query this column until the value returned is either TO PRIMARY or SESSIONS ACTIVE.
 
Step 8: Switch the target physical standby database role to the primary role.
 
Issue the following SQL statement on the target physical standby database:
Standby SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
 
Step 9: Open the new primary database.
 
 SQL> ALTER DATABASE OPEN;
 
Step 10: Start Redo Apply on the NEW physical standby database (i.e. OLD Primary Node).
 
For example:
 
SQL>  alter database recover managed standby database disconnect from session;
 
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
 
 
Note: The WITH SESSION SHUTDOWN clause can be omitted from the switchover statement if the query performed in the previous step returned a value of TO STANDBY.
 
Note: The WITH SESSION SHUTDOWN clause can be omitted from the switchover statement if the query performed in the previous step returned a value of TO PRIMARY.
 
 
 Task: Failover to a Physical Standby Database (When Switchover is Not Possible)
 
n  First Identify which Node is running primary and which is running Standby Database
 
set linesize 100
select DBID,NAME, substr(DB_UNIQUE_NAME,1,10) unq_name, OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, DATAGUARD_BROKER
from v$database;
 
Step 1 Flush any unsent redo from the primary database to the target standby database.
 
If the primary database can be mounted, it may be possible to flush any unsent archived and current redo from the primary database to the standby database. If this operation is successful, a zero data loss failover is possible even if the primary database is not in a zero data loss data protection mode.
 
Ensure that Redo Apply is active at the target standby database.
Mount, but do not open the primary database. If the primary database cannot be mounted, go to Step 2.
 
Issue the following SQL statement at the primary database:
 
SQL> ALTER SYSTEM FLUSH REDO TO target_db_unique_name;
 
This statement flushes any unsent redo from the primary database to the standby database, and waits for that redo to be applied to the standby database.
If this statement completes without any errors, go to Step 5. If the statement completes with any error, or if it must be stopped because you cannot wait any longer for the statement to complete, continue with Step 2.
 
Step 2 Verify that the standby database has the most recently archived redo log file for each primary database redo thread.
 
Query the V$ARCHIVED_LOG view on the target standby database to obtain the highest log sequence number for each redo thread.
For example:
 
SQL> SELECT UNIQUE THREAD# AS THREAD, MAX(SEQUENCE#) OVER (PARTITION BY thread#) AS LAST from V$ARCHIVED_LOG;
 
THREAD LAST
---------- ----------
1 100
 
If possible, copy the most recently archived redo log file for each primary database redo thread to the standby database if it does not exist there, and register it. This must be done for each redo thread.
 
For example:
SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';
 
Step 3 Identify and resolve any archived redo log gaps.
 
Query the V$ARCHIVE_GAP view on the target standby database to determine if there are any redo gaps  
 
For example:
SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
 
 
If possible, copy any missing archived redo log files to the target standby database from the primary database and register them at the target standby database. This must  be done for each redo thread.
 
For example:
SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';
 
Step 4 Repeat Step 3 until all gaps are resolved.
 
The query executed in Step 3 displays information for the highest gap only. After resolving a gap, you must repeat the query until no more rows are returned.
If, after performing Step 2 through Step 4, you are not able to resolve all gaps in the archived redo log files (for example, because you do not have access to the system that hosted the failed primary database), some data loss will occur during the failover.
 
 
Step 5 Stop Redo Apply.
 
Issue the following SQL statement on the target standby database:
 
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
 
Step 6 Finish applying all received redo data.
 
Issue the following SQL statement on the target standby database:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
 
If this statement completes without any errors, proceed to Step 7.
If an error occurs, some received redo data was not applied. Try to resolve the cause of the error and re-issue the statement before proceeding to the next step.Note that if there is a redo gap that was not resolved in Step 3 and Step 4, you will receive an error stating that there is a redo gap.
 If the error condition cannot be resolved, a failover can still be performed (with some data loss) by issuing the following SQL statement on the target standby database:
 
SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
 
Proceed to Step 9 when the ACTIVATE statement completes.
 
Step 7 Verify that the target standby database is ready to become a primary database.
 
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the target standby database.
 
For example:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
 
SWITCHOVER_STATUS
-----------------
TO PRIMARY
 
A value of either TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is  returned, verify that Redo Apply is active and continue to query this view until either
TO PRIMARY or SESSIONS ACTIVE is returned.
 
Step 8 Switch the physical standby database to the primary role.
 
Issue the following SQL statement on the target standby database:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
 
Step 9 Open the new primary database.
 
SQL> ALTER DATABASE OPEN;
 
Step 10 Back up the new primary database.
 
Oracle recommends that a full backup be taken of the new primary database.
 
Step 11 Restart Redo Apply if it has stopped at any of the other physical standby databases
 
For example:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
 
Step 12 Optionally, restore the failed primary database.
After a failover, the original primary database can be converted into a physical standby database of the new primary database using the method or it can be re-created as a physical standby database from a backup of the new primary database
 
 
Task: Troubleshooting
This section provides help troubleshooting problems with Real Application Clusters. It contains the following sections:
D.3.1 Switchover Fails in a Real Application Clusters Configuration
When your database is using Real Application Clusters, active instances prevent a switchover from being performed. When other instances are active, an attempt to switch over fails with the following error message:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY;
ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY *
ORA-01105: mount is incompatible with mounts by other instances
Action: Query the GV$INSTANCE view as follows to determine which instances are causing the problem:
SQL> SELECT INSTANCE_NAME, HOST_NAME FROM GV$INSTANCE
  2> WHERE INST_ID <> (SELECT INSTANCE_NUMBER FROM V$INSTANCE);
INSTANCE_NAME HOST_NAME
------------- ---------
INST2         standby2
In the previous example, the identified instance must be manually shut down before the switchover can proceed. You can connect to the identified instance from your instance and issue the SHUTDOWN statement remotely, for example:
SQL> CONNECT SYS/CHANGE_ON_INSTALL@standby2 AS SYSDBA
SQL> SHUTDOWN;
SQL> EXIT
 
NOT ALLOWED
 
The NOT ALLOWED means :--- Either this is a standby database and the primary database has not been switched first or this is a primary database and there are no standby databases.
 
 
------------------------------
1) Set The Environment first
------------------------------
 
export ORACLE_SID=
 
 
------------------------------
2) Connect via dgmgrl
------------------------------
 
dgmgrl /
 
DGMGRL> show configuration
 
Configuration - dvlp.oracle.cfx_dbaovm-cluster
 
  Protection Mode: MaxPerformance
  Databases:
    dvlp  - Primary database
    dvlpb - Physical standby database
    dvlpc - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS
 
------------------------------------
3) Stop One Node from Each Clusrer
------------------------------------
 
 
dbaovm04 /home/oracle> which srvctl
/apps/oracle/product/11.2.0/db/bin/srvctl
 
dbaovm04 /home/oracle> srvctl stop instance -i dvlp2 -d dvlp
 
 
-- Stop Instance2 on Use DB_unique_name
 
dbaovm06 /home/oracle> srvctl stop instance -i dvlp2 -d dvlpb
 
 
------------------------------------
4) Switchover from STANDBY side
------------------------------------
 
 
===== Be On the standby side ===========
 
DGMGRL> switchover to dvlpb;
Performing switchover NOW, please wait...
New primary database "dvlpb" is opening...
Operation requires shutdown of instance "dvlp1" on database "dvlp"
Shutting down instance "dvlp1"...
ORA-01031: insufficient privileges
 
Warning: You are no longer connected to ORACLE.
 
Please complete the following steps to finish switchover:
        shut down instance "dvlp1" of database "dvlp"
        start up and mount instance "dvlp1" of database "dvlp"
 
------------------------------------
5) Restart The New Standby
------------------------------------
 
 
-- Now shutdown the OLD primary which is going to be new standby
-- Now Startup mount and watch MRP start and other Node start
 
 
 
============= Document ==========================================
http://docs.oracle.com/cd/E11882_01/server.112/e17023/cli.htm
 
Scenarios Using the DGMGRL Command-Line Interface
 
=================================================================
 
dvlp - db_unique_name of primary
dvlpb - dbuinque_name of stby
dvlp1 - instance name
 
 
# display the configuration
 
show configuration               <- List All Databases in the dataguard Configuration
 
SHOW DATABASE 'pwebc'            <- List Info for Specific DB eg DB Mode(primary or Stby), Apply Status, No of Instances,Which inst is applying logs
 
SHOW DATABASE VERBOSE 'pwebc'    <- List All Above,Properties & Values
 
SHOW INSTANCE VERBOSE pweb1 on database pwebc;  <- Inst level info & Standby Archive location & connect identifier
 
SHOW DATABASE VERBOSE pwebc StandbyArchiveLocation
 
SHOW DATABASE 'pwebc' 'StatusReport';  -- Goode one Mostly return blank if ok
 
SHOW  INSTANCE VERBOSE <instName> 'LogXptStatus' on database <dbUniqueName>  <- get dbUniqueName & instance from "SHOW CONFIGURATION
 
SHOW DATABASE 'pwebc' 'LogXptStatus';
 
SHOW DATABASE 'pwebc' 'latestlog';
 
SHOW DATABASE 'pwebc' 'InconsistentProperties'
 
SHOW DATABASE 'pwebc' 'inconsistentlogxptProps;
 
-----------------------------------------------------------------------
Changing Dataguard Properties , More Examples are below
-----------------------------------------------------------------------
 
--EDIT CONFIGURATION SET PROTECTION MODE AS MAXPROTECTION;
 
--EDIT DATABASE 'dvlpb' SET PROPERTY 'LogXptMode'='SYNC';
 
--EDIT DATABASE 'dvlpb' SET PROPERTY 'StandbyArchiveLocation'='+DG_BACKUP';
 
-----------
Primary
-----------
--edit database prod1 set state=transport-off;
--edit database prod1 set state=transport-on;
 
-----------
Standby
-------------
--edit database prod1dr set state=apply-off;
--edit database prod1dr set state=apply-on;
change the protection mode
--------------------------
Primary
 
# Choose what level of protection you require
 sql> alter database set standby to maximize performance;
 sql> alter database set standby to maximize availability;
 sql> alter database set standby to maximize protection;
 
========================================================================
How to change the Archive of Primary going to ASMDISKS on Standby Node
=========================================================================
Standby Node:
--------------
SQL> show parameter db_recovery_file_dest
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      +DG_BACKUP
db_recovery_file_dest_size           big integer 100000G
 
--alter system set log_archive_dest_1='LOCATION=use_db_recovery_file_dest, valid_for=(ALL_ROLES,ONLINE_LOGFILE)';
 
DGMGRL> show database verbose <dbname> StandbyArchiveLocation
 
DGMGRL>  edit database dvlpb set property StandbyArchiveLocation='USE_DB_RECOVERY_FILE_DEST'