Jennifer Lin’s Weblog

March 13, 2009

Oracle RMAN – Restore from any Disk Location

Filed under: RMAN — jennyca @ 9:26 pm

RMAN looks for the backupfiles to restore within its catalog and therefore there is no RESTORE DATABASE FROM command. You can circumvent this constraint using the RMAN command CATALOG START WITH .

Oracle DB => RMAN backup Backup Set to default location (configure channel device type disk format ;)

RMAN> catalog start with ;
RMAN restore Backup Set from

Example:

1. Copy the Backup Set Files to any Disk Location

cp /tmp/backup

2. Mount the Database

sqlplus / as sysdba
startup mount;

3. Cleanup RMAN Catalog

rman target /

crosscheck backup;
delete noprompt expired backup of database;
delete noprompt expired backup of controlfile;
delete noprompt expired backup of archivelog all;
list backup;

At this point no backup should be available !

4. Make new Backup Location visible to RMAN Catalog

catalog start with ‘/tmp/backup’;

searching for all files that match the pattern
/tmp/backup

List of Files Unknown to the Database
=====================================
File Name: /tmp/backup/PROD_datafile_14_1.bak
File Name: /tmp/backup/PROD_controlfile_17.bak
File Name: /tmp/backup/PROD_archivelog_16_1.bak
File Name: /tmp/backup/PROD_datafile_15_1.bak

Do you really want to catalog the above files
(enter YES or NO)? yes

List of Cataloged Files
=======================
File Name: /tmp/backup/PROD_datafile_14_1.bak
File Name: /tmp/backup/PROD_controlfile_17.bak
File Name: /tmp/backup/PROD_archivelog_16_1.bak
File Name: /tmp/backup/PROD_datafile_15_1.bak

list backup;

At this point the backup must be available !

5. Restore and Recover the Database

restore database;
recover database;
alter database open;

http://www.akadia.com/

July 6, 2008

RMAN-20202

Filed under: RMAN error — jennyca @ 1:37 pm

Symptoms

RMAN convert returns error:
Starting backup at 07-MAR-06
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=521 devtype=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 03/07/2006 11:00:38
RMAN-20202: tablespace not found in the recovery catalog
RMAN-06019: could not translate tablespace name “test”

Cause

Tablespace name is not being matched within the database. Name must be entered in uppercase
within single quotes.

Error clearly shows the tablespace is not found.

Solution

To implement the solution, please execute the following steps:

Rather than using:

convert tablespace ‘test’ to platform=’……………….

Use:

convert tablespace ‘TEST’ to platform=’……………….

February 13, 2008

About System Change Number (SCN)

Filed under: Backup and Recovery — jennyca @ 6:38 pm

Where does the System Change Number (SCN) resides ? 

It resides in control files, datafile headers, redo records and alert log file.

The system change number (SCN) is an ever-increasing internal timestamp that uniquely identifies a committed version of the database. Every time a user commits a transaction, Oracle records a new SCN. You can obtain SCNs in a number of ways, for example, from the alert log.

select dbms_flashback.get_system_change_number from dual;

Oracle uses SCNs in control files, datafile headers, and redo records.

The SCN plays an important role to determine if the database is in a consistent state when the database is brought online. SMON checks the SCN in all datafile headers when the database is started. Everything is OK if all of these SCNs matches the SCN found in the controlfile. If the SCNs don’t match, the database is in an inconsistent state.

The SCN is incremented whenever a transaction commits. However, this is not the only source of increments. In a seemingly idle database, the SCN gets incremented also through AQ, SMON, job queues…

The SCN plays a vital role for providing consistent reads.
Basically, it works as follows: The query reads a db block. This block has as an attribute the SCN when it was last changed. If this SCN is greater than the SCN that was in place when (our) query began, it means that the block was changed after we have started our query. So we have to find an older version of the block. If this block is found in the rollback segments, we use it for our query.

[ When a (reading) query starts, it is important that the values of the rows selected are the same as when the query started, even if another session has changed those rows. This is refered to as read consistency. Read consistency is achieved through the SCN (system change number).]

Oracle System Change Number (SCN) 完全筆記

Filed under: Backup and Recovery — jennyca @ 6:13 am

SCN是當Oracle資料更新後,由DBMS自動維護去累積遞增的一個數字。 當一個交易commit時,LGWR會將log buffer寫入redo log file,同時也會將該筆交易的 SCN同步寫入到redo log file內(wait-until-completed)。因此當你commit transaction時, 在交易成功的訊息返回之前,LGWR必須先完整的完成上述行為之後,否則你是看不到提交成功的回應訊息。

我們可以查詢目前系統最新的SCN

select dbms_flashback.get_system_change_number from dual;

可以理解的,這裡返回的SCN,也是目前redo log file最新的SCN紀錄。 因為commit後的交易才會有SCN,而一旦commit就會立刻寫入redo log file中。

CHECKPOINT 和 SCN 的關連

checkpoint發生的目的就是要把儲存在buffer內的已提交交易寫回disk,否則一旦發生crash,需要 進行recovery時,你就必須花很多的時間從redo log file內最後的SCN交易開始進行recovery,這樣 在商業應用上是很浪費時間和沒有效率的。

重點在於當commit一個交易時,只會立刻將redo buffer寫入redo log file內,但是並不會馬上將 該update後的block(dirty block)同步寫回disk datafile中,這是為了減少過多disk IO的考量,所以採取batch的方式寫入。

When a checkpoint occurs, Oracle must update the headers of all datafiles to record the details of the checkpoint. This is done by the CKPT process. The CKPT process does not write blocks to disk; DBWn always performs that work.

在shutdown normal or shutdown immediate下,也就是所謂的clean shutdown,checkpoint也會自動觸發,並且把SCN紀錄寫回。 當發生checkpoint時,會把SCN寫到四個地方去。三個地方於control file內,一個在datafile header。

Control file三個地方為

1.System checkpoint SCN ===========> (SYSTEM CHECKPOINT SCN in control file)

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
——————–
292767

2.Datafile checkpoint SCN ===============> (DATAFILE CHECKPOINT SCN in control file)

SQL> select name,checkpoint_change#
from v$datafile where name like ‘%users01%’;

NAME CHECKPOINT_CHANGE#
———————————– ——————–
/u02/oradata/OMFD1/users01.dbf 292767

3.Stop SCN ======================> (STOP SCN in control file)

SQL> select name,last_change#
from v$datafile where name like ‘%users01%’;

NAME LAST_CHANGE#
———————————– ————
/u02/oradata/OMFD1/users01.dbf

正常datafile在read-write mode運作下,Last_change#一定是NULL

另外一個地方在datafile header內

4.Start SCN ================================> (DATAFILE HEADER)

SQL> select name,checkpoint_change#
from v$datafile_header where name like ‘%users01%’;

NAME CHECKPOINT_CHANGE#
———————————– ——————–
/u02/oradata/OMFD1/users01.dbf 292767

為什麼儲存在CONTROL FILE中要分為兩個地方(SYSTEM CHECKPOINT SCN,DATAFILE CHECKPOINT SCN ?) 當你把一個tbs設為read-only時,他的SCN會凍結停止,此時DATAFILE CHECKPOINT SCN是不會再遞增改變的, 但是整體的SYSTEM CHECKPOINT SCN卻仍然會不斷遞增前進。
所以,這就是為什麼需要分別在兩個地方儲存SCN。

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

正常shutdown database後,SCN會發生什麼變化?

我們可以把資料庫開在mount mode

select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
——————–
293184

select name,checkpoint_change#,last_change# from v$datafile where name like ‘%user%’;

NAME CHECKPOINT_CHANGE# LAST_CHANGE#
———————————– ——————– ————–
/u02/oradata/OMFD1/users01.dbf 293184 293184

可以看到儲存在control file中的三個SCN位置都是相同,注意此時的stop scn不會是NULL,而是等於start scn

我們也來查詢datafile header SCN:

select name,checkpoint_change# from v$datafile_header where name like ‘%users01%’;

NAME CHECKPOINT_CHANGE#
———————————– ——————–
/u02/oradata/OMFD1/users01.dbf 293184

當clean shutdown 時,checkpoint會進行,並且此時datafile的stop scn和start scn會相同。 等到我門開啟資料庫時,Oracle檢查datafile header中的start scn和存於control file中的datafile的scn是否相同, 如果相同,接著檢查start scn和stop scn是否相同,如果仍然相同,資料庫就會正常開啟,否則就需要recovery… 等到資料庫開啟後,儲存在control file中的stop scn就會恢復為NULL值,此時表示datafile是open在正常模式下了。

如果不正常SHUTDOWN (shutdown abort),則mount資料庫後,你會發現stop scn並不是等於其他位置的scn, 而是等於NULL,這表示Oracle在shutdown時沒有進行checkpoint,下次開機必須進行crash recovery。

crash recovery

必須先進行roll forward(從redo log file中從目前的start SCN開始,重做後面的已提交之交易)
再從roll back segment 做rollback未完成(dead transaction)交易

檢驗controlfile中的SCN會等於datafile header的SCN

select ‘controlfile’ “SCN location”,name,checkpoint_change#
from v$datafile where name like ‘%users01%’
union
select ‘file header’,name,checkpoint_change#
from v$datafile_header where name like ‘%users01%’;

SCN location NAME CHECKPOINT_CHANGE#
————– ———————————– ——————–
controlfile /u02/oradata/OMFD1/users01.dbf 293188
file header /u02/oradata/OMFD1/users01.dbf 293188

 

crash recovery vs media recovery

啟 動資料庫時,如果發現STOP SCN = NULL,表示需要進行crash recovery;啟動資料庫時,如果發現有datafile header的START SCN 不等於儲存於CONTROLFILE的DATAFILE SCN,表示需要進行Media recovery

STOP SCN equal NULL ==> NEED CRASH RECOVERY
DATAFILE HEADER START SCN not equal CONTROLFILE SCN ==> NEED MEDIA RECOVERY

RECOVERY DATABASE 兩種常見問題

1) RECOVER DATABASE UNTIL CANCEL ==> OPEN DATABASE RESETLOG

==> DATAFILE HEADER SCN一定會小於CONTROLFILE的DATAFILE SCN

如果你有進行RESTORE DATAFILE,則該RESTORE的DATAFILE HEADER SCN一定會小於目前CONTROLFILE的DATAFILE SCN,此時會無法
開啟資料庫,必須進行media recovery~~重做archive log直到該datafile header的SCN=current scn

restore datafile後,可以mount database然後去檢查controlfile and datafile header的SCN

select ‘controlfile’ “SCN location”,name,checkpoint_change#
from v$datafile where name like ‘%users01%’
union
select ‘file header’,name,checkpoint_change#
from v$datafile_header where name like ‘%users01%’;

SCN location NAME CHECKPOINT_CHANGE#
————– ———————————– ——————–
controlfile /u02/oradata/OMFD1/users01.dbf 313551
file header /u02/oradata/OMFD1/users01.dbf 313401

2) RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE; ===> OPEN DATABASE RESETLOG

==> DATAFILE HEADER SCN一定會大於CONTROLFILE的DATAFILE SCN

如果只是某TABLE被DROP掉,沒有破壞資料庫整體資料結構,還可以用INCOMPLETE RECOVERY解決 如果是某個TABLESPACE OR DATAFILE被DROP掉,因為檔案結構已經破壞,目前的CONTROL FILE內已經沒有 該DATAFILE的資訊,就算你只RESTORE DATAFILE然後進行INCOMPLETE RECOVERY也無法救回被DROP的DATA FILE。

只好RESOTRE 之前備份的CONTROL FILE(裡頭被DROP DATAFILE Metadata此時還存在),不過RESTOREC CONTROL FILE後 此時Oracle會發現CONTROL FILE內的SYSTEM SCN會小於目前的DATAFILE HEADER SCN,也不等於目前儲存於LOG FILE內的SCN, 此時就必須使用RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE到DROP DATAFILE OR DROP TABLESPACE之前的SCN。

另一種特殊狀況就是,萬一不幸地所有CONTROL FILE都遺失了,也必須用這種方式救回,所以請做MULTIPLEXING。

From http://www.odba.idv.tw/scn.htm

February 12, 2008

CREATE CONTROLFILE

Filed under: Backup and Recovery — jennyca @ 11:29 pm

Use the CREATE CONTROLFILE statement to re-create a control file in one of the following cases:

  • All copies of your existing control files have been lost through media failure.
  • You want to change the name of the database.
  • You want to change the maximum number of redo log file groups, redo log file members, archived redo log files, datafiles, or instances that can concurrently have the database mounted and open.

Details see the following link

http://www.itk.ilstu.edu/docs/oracle/server.101/b10759/statements_5003.htm

About disaster recovery plans

Filed under: Backup and Recovery — jennyca @ 10:49 pm

What is your opinion on what steps you would take to handle the Oracle disaster recovery plan (DRP) project?

Disaster recovery plans start with risk assessment. You need to identify all the risks that your data center faces and then determine the business impact should that risk become an event. For instance, typical risks that are considered are: loss of a disk drive, loss of a server, complete loss of your data center, etc. Once you have identified the risks you face and quantified the impact to your business, then you can begin to start planning for those disasters.

What happens if you lose a disk drive? Your disaster plan should give details on how to recover from that event. For instance, you may pull a spare drive off the shelf and restore the database from a backup. This means you have to have that spare drive on the shelf. The more detailed your risk assessment, the more detailed your disaster recovery plan will be.

In how many ways can we design and implement a disaster recovery plan for Oracle?

There are many ways to handle your disasters. You’ll have to look at the risks you face. Each risk should have a solution. In some cases, multiple risks have the same solution.

From http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1146036,00.html

Does the SCN number get written to the online redo log files? What are all the places where the SCN number does get written?

Filed under: Backup and Recovery — jennyca @ 8:25 pm

The System Change Number (SCN) is how Oracle internally keeps track of the timing of events (i.e., transactions) in the database. The SCN does not have a time associated with it, but a higher SCN occurred after a lower SCN. There is no correlation between the SCN values and the time between when the SCNs were generated. Two SCNs that differ by a value of 10 could be generated one second apart or 10 minutes apart.

Since the online redo log files contain transaction information, it is useful to know when that transaction took place, relative to other transactions. So SCNs are written to the online redo log files. Any record written to the online redo log file has an associated SCN so the database will know when that change took place.

The SCN is stored in other places as well. One of the most important is the control files. The control file keeps track of changes to the database. As such, it needs to know the SCN.

The datafile headers also contain the SCN of the most recent transaction that changed the datafile’s contents. When you start Oracle, Oracle checks the SCN in the datafile with the SCN in the control file. If the SCN in the datafile is “older” than the SCN in the control file, Oracle knows that the file needs recovery. When you make a tablespace READ ONLY, all of its datafiles have the SCN frozen, but Oracle does not worry about this because it knows the files have not had changes made to it.

Another place the SCN is stored is in the data block. The SCN in the block will show when the block’s contents were last updated.

From http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1188377,00.html

February 8, 2008

When to use “alter database open resetlogs;”?

Filed under: Backup and Recovery — jennyca @ 7:46 pm

1. Incomplete media recovery
2. recovery using a backup control file
3. recovery with a control file recreated with the resetlogs option

When to run “recover database using backup controlfile until cancel;”? What does this command do and when do we use this command?

Filed under: Backup and Recovery — jennyca @ 7:36 pm

If you were to recover the database with a current control file, Oralce will know the last SCN of the last checkpoint. So Oracle can use the information in the archived and online redo logs to apply recovery up to that SCN. This is called a “complete” recovery.

If you do not have the current control file, your option is to use a backup controlfile. Which means Oracle does not know the SCN to stop applying recovery. So you tell Oracle that you are using a “backup controlfile” and that you will tell it when to stop applying redo by replying “cancel.” When Oracle starts recovery, it looks at the datafiles to know the last time a checkpoint was performed on the datafile. Oracle now knows to start applying recovery to the datafile for all SCNs after the SCN in the datafile header. Oracle rolls forward transactions. But Oracle does not know when to stop, and eventually, Oracle applies recovery in all of your archived redo logs. You can then tell Oracle to use the redo in the online redo logs. Oracle will ask you where to find more redo. At this point, you tell it to quit applying redo by replying CANCEL.

Hopefully, you never have to use this command. This is because you multiplex your control files and have three copies of the control file, all on different disk units. So if you lose a disk unit, you use the control files on the other disk units. The only time you need this command is when you lose all of your control files. In this case, either use a binary backup of the control file or you re-create the control file with the CREATE CONTROLFILE command and then recover with the backup control file.

from http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1188377,00.html

Tablespace dropped accidentally

Since the current control file does not contain the dropped tablespace info, we must use backup control file to recover dropped tablespace.
Checked alert log to find out dropped time.

For example,

SQL> recover database until time ‘2008-01-31 13:25:06′
using backup controlfile;

SQL> alter database open resetlogs;

To execute a SQL statement or a PL/SQL stored procedure from within Recovery Manager

Filed under: RMAN — jennyca @ 7:11 pm

If the string that RMAN passes to PL/SQL contains a filename, then the filename must be enclosed in duplicate single quotes and the entire string following the SQL keyword must be enclosed in double quotes. For example, use the following syntax:

 SQL "CREATE TABLESPACE temp1 DATAFILE ''?/oradata/trgt/temp1.dbf'' SIZE 10M
 TEMPORARY";

If you attempt to use single quotes for the string following the SQL keyword or use only one set of single quotes for the filename, then the command fails.

You cannot execute SELECT statements.

This example issues a PL/SQL stored procedure called scott.update_log:

RUN { SQL ' BEGIN scott.update_log; END; '; }

http://www.acs.ilstu.edu/docs/oracle/server.101/b10770/rcmsynta59.htm

Blog at WordPress.com.