Dontcheff

Archive for January, 2012|Monthly archive page

Restoring the Oracle database

In Bugs, Database tuning, DBA, Oracle database, RMAN on January 11, 2012 at 06:08

If you are reading this, there is high chance that someone has been breathing behind your back while you were restoring a database or mildly harassing you over the phone with the only question: when do you think that the database will be open for WebLogic connections?

Indeed when? There is no way to say precisely. You can guess based on the time the backup had taken in average but that is just an estimate. After the restore, there might be a nasty long sequence of archivelog files that has to be applied (with the hope that none is missing) and then you can open with resetlogs. Unless you hit a “feature” or two.

There are 5 major reasons why RMAN restore performance may be poor:

1. Missing statistics: Before a restore even begins the the control file is queried, datafile headers are read, media managers are initialised, the catalog is resynced, PL/SQL is generated and compiled, RMAN metadata either in the control file or the catalog is queried. All this incurs I/O against the control file, the datafile headers and the recovery catalog.

Make sure that the catalog database is analyzed and reorganized on regular basis. It also requires regular maintenance: crosscheck backup, etc.

Generate statistics on the fixed objects, with the GATHER_FIXED_OBJECTS_STATS procedure. This should be done when the database has been running for awhile with generic workload, so the information in the fixed objects reflects a reasonable state of the database load.

Gather the statistics with the following command:

exec dbms_stats.gather_fixed_objects_stats;

Believe it or now, often adding this to the RMAN script might help for both faster backup and restore:

alter session set optimizer_mode=RULE;

2. The backup is hardware multiplexed: all channels write to a single tape device. If the media manager does not support multiplexing only one backuppiece can be returned at a time. Considering the fact that each backuppiece is multiplexed with the rest of the others, this restore would require as many scans as the number of channels allocated of at most the whole backup size each (with a tape rewind after each scan).

3. Individual files or tablespaces are being restored: A restore of a single file needs a scan of potentially the whole database backup depending on where the header of the file being restored is positioned on the tape. The worst situation is when single file is being restored so that the file’s block header is the last block written to the backuppiece.

4. A different number of channels is used for restore compared to backup: The number of channels to be allocated should be equal to the number of physical tape drives. Try to do a couple of things:

Set enough big large pool: set LARGE_POOL_SIZE to at least 256M
Enable the backup_tape_io_slaves by setting BACKUP_TAPE_IO_SLAVES = true

However, some media managers will allow hardware multiplexing to a single tape and are able to parallelize the restore (in the media manager layer) such that the backuppieces are returned from a single tape to multiple channels in parallel. In such a situation it is possible to use an unpublished SET PARALLELMEDIARESTORE OFF command to make RMAN ignore this check.


SET PARALLELMEDIARESTORE OFF;
      RUN { ...
        RESTORE DATABASE;
        RECOVER DATABASE; ...
      }

5. The RMAN relation with bugs is not a new one. The “mend-it-or-end-it” principle does not play any role here. 5 years ago, Jaffar posted on his blog Known RMAN Performance Problems. It is worth checking again the MOS note 247611.1

Yet, here are some Oracle or MML bugs not fixed by January 1st, 2012:

Bug 12363733: restoring backups are very slow (reproduced both with asm and filesystem)
Bug 7573468: RMAN duplicate until time very slow
Bug 11835641: 11gr2 – RMAN restore took more time – duplicate or restore is slow
Bug 10210264: RMAN restore is slower following upgrade to 11.2
Bug 11827990: select name from v$datafile wait on “rdbms ipc reply” on rman restore
Bug 12543119: EM not using correct ASM diskgroup to restore datafiles when creating clone
Bug 11892765: please allow RMAN restore to pre-allocate file extents
Bug 9556740: RMAN job failed when using sudo
Bug 9724316: dbclone not choosing the correct controlfile for the clone
Bug 9845029: OEM clone 11gR2 ASM database fails during recovery
Bug 5381095: RMAN: poor restore performance in nocatalog mode
Bug 10093810: RMAN backup need to maintain an index for file header block for each file
Bug 9918138: RMAN restoring a compressed backup running very slow
Bug 6964596: RMAN single channel restore performance is faster than 6 channels
Bug 5462916: RMAN: recover database skip forever tbs with ORA-01157
Bug 2608118: restore of autobackup controlfile take a long time to complete
Bug 1533686: RMAN hangs when restoring via adsm with an ops target db
Bug 2911857: bad performance of RMAN duplication with large number of files (> 1500)
Bug 13349761: RMAN RESYNC IS SLOW
Bug 6770620: RMAN recover of copy database command consumes CPU

Bug 13109129: after upgrade to 11.2 compressed RMAN backups are slow
Bug 13066081: RMAN unregister database is slow/hangs
Bug 12896388: RMAN backup slower with – check logical – option on compressed tablespace
Bug 13500084: RMAN switch command is taking too long
Bug 13064833: RMAN process hangs after finished backup
Bug 13520050: RMAN session does not terminate- loops -spinning on cpu when destination is full

Some of the bugs will never be probably fixed as you can see that they may get, and are closed (or suspended) because of vendor OS problems, problem cannot be replicated, closed as duplicate, information not avaiable, etc.

Or how about those bugs with status 92: closed, not a bug. For example bug 11835641 is actually not a bug? Right.

Additional information: 11.2 Oracle Database Backup and Recovery User’s Guide. Check Part VI, Tuning and Troubleshooting.

2011 in review

In DBA on January 1, 2012 at 11:55

The WordPress.com stats helper monkeys prepared a 2011 annual report for this blog.

Here’s an excerpt:

The concert hall at the Syndey Opera House holds 2,700 people. This blog was viewed about 30,000 times in 2011. If it were a concert at Sydney Opera House, it would take about 11 sold-out performances for that many people to see it.

Click here to see the complete report.