转 RMAN: RAC Backup, Restore and Recovery using RMAN

PURPOSE

The purpose of this document is to give a quick guide for using RMAN on RAC databases. We will follow this points:

1. Verify the database mode and archive destination.
2. Verify connectivity using sqlnet for target and catalog.
3. Determine the backup device.
4. Understand how to create an RMAN persistent configuration for a RAC env.
5. Create backups to disk using the new persistent configuration parameters.
6. Backupset maintenance using the configured retention policy.
7. Restore and Recover
a. Complete
b. Incomplete
8. Review and understand the impact of resetlogs on the catalog.
9. RMAN Sample Commands.

SCOPE

This discussion is for a 2-node Oracle RAC Cluster.
The logs are being archived to their respective node.
We are allocating channels to each node to enable the autolocate feature of RMAN in a RAC env.

DETAILS

1. Verify the databases are in archivelog mode and archive destination.

a. NODE 1: thread 1

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u02/app/oracle/product/11.2.0/dbs/arch
Oldest online log sequence     20
Next log sequence to archive   21
Current log sequence           21

b. NODE 2: thread 2

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u02/app/oracle/product/11.2.0/dbs/arch
Oldest online log sequence     8
Next log sequence to archive   9
Current log sequence           9

2. Verify connectivity to the target nodes and catalog if used.

$ setenv TNS_ADMIN $ORACLE_HOME/network/admin
$ sqlplus /nolog
    SQL> connect sys/pwd@node1 as sysdba
   SQL> connect sys/pwd@node2 as sysdba
    SQL> connect rman/rman@rcat

3. Set your testing areas.

Testing HOME for logs: /u02/home/usupport/rman

Backups HOME Location: /rman/V112

4. Connect using RMAN to verify and set the controlfile persistent configuration.

The controlfiles are shared between the instances so configuring the controlfile on node 1 also sets it for all nodes in the RAC cluster.

* Always note the target DBID
connected to target database: V112 (DBID=228033884)

* Default Configuration

RMAN> SHOW ALL;
    CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
    CONFIGURE BACKUP OPTIMIZATION OFF; # default
    CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
    CONFIGURE CONTROLFILE AUTOBACKUP ON;
    CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
    CONFIGURE DEVICE TYPE DISK PARALLELISM 1; # default
    CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
    CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
    CONFIGURE MAXSETSIZE TO UNLIMITED; # default
    CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u02/app/oracle/product/11.2.0/dbs/snapcf_V11201.f'; # default

* Configuring Channels to Use a Specific Node

To configure one RMAN channel for each policy-managed Oracle RAC database instance, use the following syntax:
CONFIGURE CHANNEL DEVICE TYPE disk CONNECT 'SYS/RAC@NODE1';'
CONFIGURE CHANNEL DEVICE TYPE disk CONNECT ''SYS/RAC@NODE2';

* Configure parallelism and use dynamic channel allocation. Review the following for details:
How To Use RMAN Dynamic Channel Allocation For RAC Environments (Note 1100443.1)

* Make changes to other configurations to fit your business requirements. Review the following for details:
Getting Started with Recovery Manager (RMAN) Note 360416.1

In 11.2, there is a new requirement for snapshot controlfile to reside on a shared location, see:
In RAC environment from 11.2 onwards Backup Or Snapshot controlfile needs to be in shared location Note 1472171.1

5. Make a backup using the new persistent configuration parameters.

* Backup database with differential incremental 0 and then archived logs using the delete input option.

BACKUP INCREMENTAL LEVEL 0 FORMAT '/rman/V112/%d_LVL0_%T_%u_s%s_p%p' DATABASE;
BACKUP ARCHIVELOG ALL FORMAT '/rman/V112/%d_AL_%T_%u_s%s_p%p'DELETE INPUT;

* Backup again using differential incremental level 1

BACKUP INCREMENTAL LEVEL 1 FORMAT '/rman/V112/%d_LVL1_%T_%u_s%s_p%p' DATABASE;
BACKUP ARCHIVELOG ALL FORMAT '/rman/V112/%d_AL_%T_%u_s%s_p%p' DELETE INPUT;

* To simplify this you can also use PLUS ARCHIVELOG

BACKUP INCREMENTAL LEVEL 0 FORMAT '/rman/V112/%d_LVL0_%T_%u_s%s_p%p'
DATABASE PLUS ARCHIVELOG FORMAT '/rman/V112/%d_AL_%T_%u_s%s_p%p' DELETE INPUT;

This uses a different algorithm than backup database and backup archivelog in separate commands, the algorithm for PLUS ARCHIVELOG is:

1. Archive log current
     2. Backup archived logs
     3. Backup database level 0
     4. Archive log current
     5. Backup any remaining archived log created during backup

6. Backupset maintenance using the configured retention policy

RMAN> LIST BACKUP SUMMARY;
   RMAN> LIST BACKUP BY DATAFILE;
    RMAN> LIST BACKUP OF DATABASE;
    RMAN> LIST BACKUP OF ARCHIVELOG ALL;
    RMAN> LIST BACKUP OF CONTROLFILE;

These above can be enhanced with the "until time" clause as well as the archivelog backups using "not backed up x times" to cut down on many copies of a log in several backup sets.
Then continuing with SMR Server Managed Recovery use the change archivelog from...until...delete to remove old logs no longer needed on disk.

To check/delete obsolete backups or archivelogs we use:

RMAN> REPORT OBSOLETE;

RMAN> DELETE OBSOLETE;
or
RMAN> DELETE NOPROMPT OBSOLETE;

To check the database files:

RMAN> REPORT SCHEMA;

7. Restore and Recover

Complete Recovery

With the database mounted on the node1 and no-mount on node2 connect to the target and catalog using RMAN.

rman target / catalog rman/rman@rcat

This script will restore and recover the database completely and open the database in read/write mode.

run {
             RESTORE DATABASE;
             RECOVER DATABASE;
             ALTER DATABASE OPEN;
          }

Incomplete Recovery

If you are using instance registration the database must be mounted to register with the listener. This means you must use the current controlfile for restore and recovery or setup a dedicated listener if not already done. RMAN requires a dedicated server connection and does not work with using instance registration before mounting the controlfile. Using the autobackup controlfile feature requires the DBID of the TARGET database. It must be set when the database is not mounted and only the controlfile and spfile (from 9.2) can be restored this way.

1. Shutdown node1 and node2

2. Startup no-mount node2 and node1

3. Start rman and restore the controlfile from autobackup:

rman trace reco1.log

RMAN> CONNECT CATALOG rman/rman@rcat

RMAN> SET DBID=228033884;

RMAN> CONNECT TARGET

RMAN> restore controlfile;

4. If no catalog is used, you can restore the controlfile from autobackup

% rman trace recocf.log

RMAN> SET DBID=228033884;

RMAN> CONNECT TARGET /

RMAN> RUN
           {
             SET CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE disk TO '/rman/V112/%F';
             ALLOCATE CHANNEL d1 DEVICE TYPE disk;
             RESTORE CONTROLFILE FROM AUTOBACKUP
                 MAXSEQ 5           # start at sequence 5 and count down (optional)
                 MAXDAYS 5;         # start at UNTIL TIME and search back 5 days (optional)
             MOUNT DATABASE;
            }

See Note 372996.1 and/or Note 403883.1 for more options for restoring a controlfile.

5. Verify what is available for incomplete recovery.

We will recover with the highest available redo information. In a RAC database, both thread must be considered to determine highest available redo. The options are "until time", "until scn", or "until sequence". We will use the log sequence in this case.

a. First we need to find the highest sequence of each thread:

SQL> select max(sequence#) from v$archived_log L, v$database D

where L.resetlogs_change# = D.resetlogs_change# and

thread#=1;

MAX(SEQUENCE#)
--------------
25

SQL> select max(sequence#) from v$archived_log L, v$database D

where L.resetlogs_change# = D.resetlogs_change# and

thread#=2;

MAX(SEQUENCE#)
--------------
13

b. Next is to find the thread with lowest NEXT_CHANGE# scn.

SQL> select sequence#, thread#, first_change#, next_change#
             from v$archived_log L, v$database D
   where L.resetlogs_change# = D.resetlogs_change# and
             sequence# in (13,25);

SEQUENCE#    THREAD# FIRST_CHANGE# NEXT_CHANGE#
     -------------------- -------------- ------------------------- -------------------------
            25            1        1744432    1744802
            13          2       1744429        1744805

SQL> select sequence#, thread#, first_change#, next_change#
from v$backup_redolog
where sequence# in (13,25);

SEQUENCE#    THREAD# FIRST_CHANGE# NEXT_CHANGE#
      -------------------- -------------- ------------------------- -------------------------
            25            1        1744432    1744802
            13          2       1744429        1744805

In this case the next_change# SCN in thread 1 sequence 25 is lower than sequence 13 thread 2. In a RAC environment, we use the lower to ensure we have the redo required from BOTH threads. In other words, we use the lower (thread# 1) to ensure that ALL scn (s) in thread #1 exist in the available sequence for thread #2.

So we will set sequence 26 for thread 1 for RMAN 'until sequence' recovery, because RMAN stops the recovery before applying the indicated sequence. Log sequence for recovery needs always be sequence+1 to end at +1 after applying the prior sequence. I.e.:

SET UNTIL SEQUENCE 26 THREAD 1;

6. Get the command to add TEMPFILES after opening DB.

Locally Managed Temporary Tablespaces are not restored by RESTORE command, we need to create them manually after recovery is complete.

If using LMT Temporary tablespace the controlfile will have the syntax to add the tempfile after recovery is complete. The following command will give us the create controlfile statement:

SQL> alter database backup controlfile to trace;

Example:

# Commands to add tempfiles to temporary tablespaces.
      # Online tempfiles have complete space information.
      # Other tempfiles may require adjustment.
        ALTER TABLESPACE TEMP ADD TEMPFILE '/dev/db/temp_01.dbf' SIZE 41943040 REUSE AUTOEXTEND OFF;
      # End of tempfile additions.
      #

NOTE: In newer versions, the tempfiles are added automatically.

7. Run the rman script

Since log sequence 13 thread 2 next_change# is 3 changes ahead of thread 1 sequence 25 we are using sequence 26 (25+1) to stop recovery. This will restore the data files and recover them applying all of sequence #25 of thread 1 and stopping at sequence #26.

run {
          SET UNTIL SEQUENCE 26 THREAD 1;
          RESTORE DATABASE;
          RECOVER DATABASE;
          ALTER DATABASE OPEN RESETLOGS;
          }

8. Review and understand the impact of resetlogs on the catalog.

After resetlogs a new incarnation for the database is recorded in the RMAN catalog and database controlfile. Only one incarnation can be current and any need to restore from a previous incarnation requires you to "reset database to incarnation...".

For example:

RMAN> LIST INCARNATION OF DATABASE V112;

List of Database Incarnations
DB Key Inc Key DB Name DB ID            CUR Reset SCN Reset Time
----------- ----------- ------------ ---------------- ------- -------------- ---------------
2656          2657       V112         228033884      NO   1                   29-MAY-13
2656          3132       V112         228033884      YES 1744806       13-JUN-13

We see that an "open resetlogs" was executed against this database on 13-JUN-2013.

9. RMAN Sample Commands

* With a dedicated listener (not using instance registration) restoring the controlfile.

run {
ALLOCATE CHANNEL D1 TYPE DISK CONNECT 'SYS/RAC@NODE1';
ALLOCATE CHANNEL D2 TYPE DISK CONNECT 'SYS/RAC@NODE2';
SET UNTIL SEQUENCE 14 THREAD 2;
RESTORE CONTROLFILE;
ALTER DATABASE MOUNT;
RELEASE CHANNEL D1;
RELEASE CHANNEL D2;
}

* Backup Archivelog

BACKUP ARCHIVELOG ALL NOT BACKED UP 3 TIMES;
BACKUP ARCHIVELOG UNTIL TIME 'SYSDATE-2' NOT BACKED UP 2 TIMES;

#####

compress resotre优化
在之前的blog中我提到可以使用rman的多通道进行compress 备份，相应的在restore期间也可以使用多个通道，这样每个通道进程会被os放到不同的cpu上运行
从而加快restore过程，特别是在compress 的restore时候，这时候uncompress占用了很大一部分时间，使用多通道加速恢复的过程

run
{
allocate channel c1 device type disk;
allocate channel c2 device type disk;
allocate channel c3 device type disk;
allocate channel c4 device type disk;
allocate channel c5 device type disk;
set until sequence 26747 thread 1;
set until sequence 13743 thread 2;
restore database;
recover database;
release channel c1;
release channel c2;
release channel c3;
release channel c4;
release channel c5;
}

试图分配五个通道进行restore，看结果

channel c1: starting datafile backup set restore
channel c1: specifying datafile(s) to restore from backup
XXXXXXXXXXXX
channel c1: reading from backup piece XXXXXXXXXXX
channel c2: starting datafile backup set restore
XXXXXXXXXXXXX
channel c2: reading from backup piece XXXXXXXXXX
channel c3: starting datafile backup set restore
channel c3: specifying datafile(s) to restore from backup set
XXXXXXXXXXXXXX
channel c3: reading from backup piece XXXXXXXXXXX

从输出中可以看出，rman 只使用了三个通道。这是为什么呢？原因是RMAN在进行restore的时候，只能是每个通道读一个backuppiece,虽然分配了五个道通，但实际上是三个通道在进行
restore操作，再然后就是在备份的时候，根据io能力，可以适当设置每个piece大小，形成多个piece,这样在resotre的时候可以使用多通道来加速,,再然后.......就没有了...............