一次坏块的处理过程
最近在一個數據庫遇到了壞塊,以下是處理過程。
一、壞塊的發現及處理
首先是在做RMAN備份的時候,看到alert文件有壞塊報錯:
.......
Corrupt block relative dba: 0x00925026 (file 2, block 1200166)
Fractured block found during backing up datafile
Data in bad block:
type: 6 format: 2 rdba: 0x00925026
last change scn: 0x0578.15b7ebf7 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x33da0602
check value in block header: 0xf8d0
computed block checksum: 0xd82e
Reread of blocknum=1200166, file=+DATA/dwrac/datafile/dwdata_1m01.dbf. found same corrupt data
Reread of blocknum=1200166, file=+DATA/dwrac/datafile/dwdata_1m01.dbf. found same corrupt data
Reread of blocknum=1200166, file=+DATA/dwrac/datafile/dwdata_1m01.dbf. found same corrupt data
Reread of blocknum=1200166, file=+DATA/dwrac/datafile/dwdata_1m01.dbf. found same corrupt data
Reread of blocknum=1200166, file=+DATA/dwrac/datafile/dwdata_1m01.dbf. found same corrupt data
......
用dbv檢查也證實了這一點。
[oracle@dwdb02 admin]$ dbv file="+DATA/dwrac/datafile/dwdata_1m01.dbf" start=502554 end=1217134 userid=admin/sdoadmin123 blocksize=16384
DBVERIFY: Release 10.2.0.5.0 - Production on Thu Feb 17 14:50:13 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
DBVERIFY - Verification starting : FILE = +DATA/dwrac/datafile/dwdata_1m01.dbf
Page 502554 is marked corrupt
Corrupt block relative dba: 0x0087ab1a (file 2, block 502554)
Bad header found during dbv:?
Data in bad block:
type: 11 format: 2 rdba: 0x00800001
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000b01
check value in block header: 0x85d8
computed block checksum: 0x0
......
DBVERIFY - Verification complete
Total Pages Examined : 714581
Total Pages Processed (Data) : 567369
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 143342
Total Pages Failing (Index): 0
Total Pages Processed (Other): 3866
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 0
Total Pages Marked Corrupt : 4
Total Pages Influx : 0
Highest block SCN : 0 (0.0)
10g以后rman備份發現的壞塊信息可以在v$database_block_corruption查到。
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE
---------- ---------- ---------- ------------------ ---------------
2 1217134 1 0 FRACTURED
508 2386314 1 0 FRACTURED
517 1115970 1 0 FRACTURED
2 1200166 1 0 FRACTURED
51 401698 1 0 FRACTURED
61 306642 1 0 FRACTURED
2 502554 1 0 FRACTURED
2 1213658 1 0 FRACTURED
34 701050 1 0 FRACTURED
60 347166 1 0 FRACTURED
FRACTURED意味著數據塊是物理損壞,遇到這種情況,如果有備份的話最好是從備份恢復,甚至可以用rman做塊級恢復,如
blockrecover datafile 2 block 1217134;
或者:
blockrecover corruption list;
不幸的是,這個數據庫是一個新庫,沒有正式上線,也還沒有上備份。沒辦法,只能看看損壞的對象是什么了。
--由于庫太大,直接查dba_extents太慢,因此備份dba_extents到表dbextnet再查詢。
SQL> select * from dbextent where file_id=2 and 1213658 between block_id and block_id+blocks-1;
OWNER SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ -------------------------------------------------------------------------------- ------------------------------ ------------------ ------------------------------ ---------- ---------- ---------- ---------- ---------- ------------
CREATER_USER WIDGET_NEWUSE_IX2 INDEX DWDATA_1M 995 2 1213637 1048576 64 2
SQL> select * from dbextent where file_id=2 and 502554 between block_id and block_id+blocks-1;
OWNER SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ -------------------------------------------------------------------------------- ------------------------------ ------------------ ------------------------------ ---------- ---------- ---------- ---------- ---------- ------------
CREATER_USER WIDGET_NEWUSE_IX2 INDEX DWDATA_1M 1787 2 502533 1048576 64 2
萬幸的是,這里損壞的只是索引,把索引rebuild就可以。如果損壞的是表,則可能需要用到其他手段,如dbms_repare包來處理了。
SQL> alter index creater_user.WIDGET_NEWUSE_IX2 rebuild online parallel 16 tablespace dwdata_10m_1 nologging;
Index altered
......
注意:這里要用rebuild online,否則rebuild可能會讀取原索引作為源來重建新索引,這樣的話新建的索引也是損壞的。
全部處理完畢后,確認是否還有對象損壞。
SQL> select * from dba_extents E,v$database_block_corruption c where e.file_id=c.file# and c.block# between e.block_id and e.block_id+e.blocks-1;
OWNER SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE
------------------------------ -------------------------------------------------------------------------------- ------------------------------ ------------------ ------------------------------ ---------- ---------- ---------- ---------- ---------- ------------ ---------- ---------- ---------- ------------------ ---------------
以上查詢返回結果為空,表示沒有數據庫對象損壞,第一階段處理完畢。
二、后續處理
上面的步驟解決了壞塊對數據庫的潛在影響,但是此時v$database_block_corruption的信息是還沒有清除的,需要清除,需要再運行RMAN,如果沒有壞塊已經修復,才會從v$database_block_corruption刪除對應記錄。
但是在大部分情況下,即使刪除了原來的對象,數據塊的狀態讓人是標識損壞的,但是不影響一般的數據庫操作,可以忽略。如果你看不順眼或者擔心它影響RMAN的備份,可以用以下方法格式化這個數據塊,只有被正常格式化后,數據塊狀態才會恢復正常。
1. 首先創建一個新表
create table s (n number,c varchar2(4000)) nologging tablespace dwdata_1m;
注意表空間要指定在壞塊所在的表空間上(v$database_block_corruption可查詢到)
2. 確保目標數據塊在free list上,并查詢其大小
SQL> Select BYTES from dba_free_space where file_id=2 and 502554 between block_id and block_id + blocks -1;
BYTES
----------
1048576
3. 手工擴展
BEGIN
FOR I IN 1 .. 100000000
LOOP
EXECUTE IMMEDIATE 'alter table s allocate extent(size 1048576 datafile ''+DATA/dwrac/datafile/dwdata_1m01.dbf'')';
END LOOP;
END;
在這一步執行的同時,不斷地查新dba_free_space,直至確認該數據塊已經不在dba_free_space中了,之后就可以把上一步操作的循環中斷。
SQL> Select * from dba_free_space where file_id= 2 and 502554 between block_id and block_id + blocks -1;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
-------------------------------- ---------- ---------- ---------- ---------- ------------
4. 下一步就是要寫里寫入數據了
BEGIN
FOR i IN 1..1000000000 LOOP
INSERT /*+ APPEND */ INTO s select i, lpad('REFORMAT',3092, 'R') from dual connect by rownum<=10000;
commit ;
END LOOP;
END;
/
注意:必須要寫入數據,數據塊才能被重新格式化,才能修復塊的狀態。
5. 在寫入數據的同時,不斷用dbv檢測壞塊情況
--指定start和end參數,提高效率
[oracle@dwdb02 admin]$ dbv file="+DATA/dwrac/datafile/dwdata_1m01.dbf" start=502554 end=502554 userid=admin/sdoadmin123 blocksize=16384
DBVERIFY: Release 10.2.0.5.0 - Production on Thu Feb 17 15:56:53 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
DBVERIFY - Verification starting : FILE = +DATA/dwrac/datafile/dwdata_1m01.dbf
DBVERIFY - Verification complete
Total Pages Examined : 714581
Total Pages Processed (Data) : 610675
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 100202
Total Pages Failing (Index): 0
Total Pages Processed (Other): 3704
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 0
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 0 (0.0)
如果Total Pages Marked Corrupt=0,則表示數據塊已經修復了。此時就可以取消上一步的數據insert操作。
6. 用rman重新validate一次
RMAN> backup check logical validate datafile 2;
Starting backup at 17-FEB-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00508 name=+DATA/dwrac/datafile/dwdata_1m01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:20:45
Finished backup at 17-FEB-11
正常的話,這一步完了以后,v$database_block_corruption里對應的壞塊記錄就應該被刪除了。
SQL> select * from v$database_block_corruption t where file#=2 and block#=502554;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE
---------- ---------- ---------- ------------------ ---------------
重復上面的步驟直至v$database_block_corruption不包括任何數據。
以上只是壞塊的處理過程,但是壞塊產生的原因仍然不明朗,需進一步挖掘,但是從這些例子也可以看出備份的重要性!
總結
- 上一篇: 如何通过DBLINK取REMOTE DB
- 下一篇: 使用 Authid Current_Us