[20190212]删除tab$记录的恢复3.txt

--//春节前几天做了删除tan$记录的测试,链接:
http://blog.itpub.net/267265/viewspace-2565245/=> [20190130]删除tab$记录的恢复.txt
http://blog.itpub.net/267265/viewspace-2565250/=> [20190130]删除tab$记录的恢复2.txt

--//实际上我一直认为能拷贝出来数据就ok了.这样恢复的数据库不能在使用,问题多多.
--//我继续测试实际上建表等操作都会报类似的:
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [94232], [6110], [], [], [], [], [], [], [], []
--//这样的错误.主要问题在于system块的检查更加严格.

--//今天尝试恢复6110,6111,6112之类的错误.

1.环境:
SCOTT@book> @ ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

--//前面已经恢复只读状态exp操作是正常的.具体看前面的文章.
--//首先更正前面脚本的几个错误:http://blog.itpub.net/267265/viewspace-2565250/=> [20190130]删除tab$记录的恢复2.txt

 57                 echo "x /rx dba $dba *kdbr[$begin]" | rlbbed | egrep "^^flag@.*: *0x7c" > /dev/null
--//原来写成echo "x /rx dba $dba *kdbr[$begin]" | rlbbed | egrep "^^flag@.*: 0x7c" > /dev/null ,可能出现多个空格的情况.
 58                 if [ $? -eq 0 ]
 59                 then
 60                     echo "dba=$dba;ckix_value=0" >> scan4a.txt
 61                 fi

$ cat scana.sh
#! /bin/bash
cat clearout.txt | while read dba
do
    echo set dba $dba
    echo -n "assign "
    echo -e  "set dba $dba \np ktbbh" |rlbbed | /bin/grep -B 4 -A 9 "ub4 kxidsqn \+@.*0x00000751$"| egrep "ktbbhitl|ktbitflg" | cut -c11-21,55-60 | paste -d. - - | sed -e 's/   /=/'
--//这里写错,原来写成4204236,实际上应该是$dba,
    echo sum apply dba $dba
done

--//注:http://blog.itpub.net/267265/viewspace-2565250/=> [20190130]删除tab$记录的恢复2.txt 里面的脚本已经更正.

2.我前面做了冷备份.首先恢复冷备份:
$ /bin/cp -r /home/oracle/backup/book_20190122_bad/* /mnt/ramdisk/book/
*/

$ . scan.sh
process 1 start : 2019/02/12 08:45:44 scan dba 1,144 ,  create scan1.txt about ktetbdba,ktetbnbk
process 1 finish: 2019/02/12 08:45:44,enter continue...
process 2 start : 2019/02/12 08:45:45 scan block , get kdbtnrow,kdbtoffs ang grep kdbtnrow=0
process 2 finish: 2019/02/12 08:46:16,enter continue...
process 3 start : 2019/02/12 08:46:50 scan block , create bbed'script scan3_bbed.txt for modify delete of flag  and create scan4a.txt about block of ckix
process 3 finish: 2019/02/12 08:52:01,enter continue...
process 4 start : 2019/02/12 08:52:06 create bbed's scan4_bbed.txt for modify cluster of mref of value
process 4 finish: 2019/02/12 08:53:12,enter continue...
process 5 start : create bbed's scan5_bbed.txt for sum apply
process 5 finish: 2019/02/12 08:54:11,enter continue...

--//查看生成的bbed脚本:
$ cat scan4m_bbed.txt
assign dba 4288539 offset 8169 = 1
assign dba 4288546 offset 8145 = 1

--//前面我提到过这2块dba的不需要恢复(里面记录的scn不是这个事务产生的),或者讲scan4m_bbed.txt脚本记录的dba可能存在多恢复记录的情况.最好仔细检查.

$ grep -v 0x6c scan3_bbed.txt
assign /x dba 4194451 offset 7349 = 0x20
assign /x dba 4197642 offset 7888 = 0x20
assign /x dba 4207636 offset 7087 = 0x20
assign /x dba 4225801 offset 3621 = 0x4c
assign /x dba 4225801 offset 4436 = 0x4c
assign /x dba 4288537 offset 7717 = 0x4c
--//这3条记录存在行链接或者迁移的情况,前面已经解析不再说明.
--//修改scan3_bbed.txt文件,注解如下2行不需要恢复.
$ grep ^# scan3_bbed.txt
#assign /x dba 4288539 offset 7920 = 0x6c
#assign /x dba 4288546 offset 7851 = 0x6c

$ grep  0x6c scan3_bbed.txt | grep -v "^#" |wc
   2963   23704  124395

--//2963+3=2966,这样恢复的记录数量与实际情况相符.

3.执行生成的bbed脚本并修复块 6110,6111,6112错误.

bbed parfile=/home/oracle/bbed/bbed.par cmdfile=/home/oracle/hrp430/bbed/scan3_bbed.txt
bbed parfile=/home/oracle/bbed/bbed.par cmdfile=/home/oracle/hrp430/bbed/scan4k_bbed.txt
bbed parfile=/home/oracle/bbed/bbed.par cmdfile=/home/oracle/hrp430/bbed/scan5_bbed.txt

--//前面提到启动遇到如下错误,主要是因为延迟块提交的问题:
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [94232], [6110], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [9951], [6110], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [31548], [6110], [], [], [], [], [], [], [], []

--//编写脚本如下,主要获得延迟块提交的数据块,设置提交标识.并且将对应的ktbbhitl[N]._ktbitun._ktbitfsc=0.
--//这样会导致verify时出现如下错误,例子:
BBED> set dba 4204236
        DBA             0x004026cc (4204236 1,9932)

BBED> verify
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 9932

Block Checking: DBA = 4204236, Block Type = KTB-managed data block
data header at 0x7fb110a7b25c
kdbchk: space available on commit is incorrect
        tosp=5052 fsc=0 stb=0 avsp=4937
Block 9932 failed with check code 6111
--//解决方法就是assign kdbhtosp=kdbhavsp;sum apply就ok了.
--//我以前处理这个错误有点繁琐.实际上设置fsc=0,设置提交标识为快速提交标识0x2,在执行assign kdbhtosp=kdbhavsp,
--//这样记录里面lock标识不需要设置为0x0,这样简单许多.

$ cat scanb.sh
#! /bin/bash
# get dba of tailchk ,grep begin # line, save scan6.txt
# and then grep -v 0x5f5f06,save clearout.txt,other save notclearout.txt

grep dba scan3_bbed.txt | grep -v "^#" | cut -d" " -f4 | uniq | while read dba
do
        echo -n $dba :
        echo "p dba $dba offset 0 tailchk 8188"| rlbbed | grep "ub4 tailchk"
done >| scan6.txt

grep -v 0x5f5f06 scan6.txt >|  clearout.txt
grep    0x5f5f06 scan6.txt >|  notclearout.txt

# create modify ktbbhitl[N].ktbitflg , ktbbhitl[N]._ktbitun._ktbitfsc=0 and kdbhtosp=kdbhavsp of script.
cat clearout.txt | cut -d" " -f1 |while read dba
do
        echo set dba $dba
    echo -n "assign "
    echo -e  "set dba $dba \np ktbbh" |rlbbed | /bin/grep -B 4 -A 9 "ub4 kxidsqn \+@.*0x00000751$"| egrep "ktbbhitl|ktbitflg" | cut -c11-21,55-60 | paste -d. - - | sed -e 's/   /=/'
    echo -n "assign "
    echo -e  "set dba $dba \np ktbbh" |rlbbed | /bin/grep -B 4 -A 9 "ub4 kxidsqn \+@.*0x00000751$" | egrep "ktbbhitl"| cut -c11-21 | sed -e 's/$/._ktbitun._ktbitfsc=0/'
        echo "assign kdbhtosp=kdbhavsp"
    echo sum apply dba $dba
done >| clearout_bbed.txt

# create modify ktbbhitl[N]._ktbitun._ktbitfsc=0 and kdbhtosp=kdbhavsp of script.
cat notclearout.txt | cut -d" " -f1 |while read dba
do
        echo set dba $dba
#   echo -n "assign "
#   echo -e  "set dba $dba \np ktbbh" |rlbbed | /bin/grep -B 4 -A 9 "ub4 kxidsqn \+@.*0x00000751$"| egrep "ktbbhitl|ktbitflg" | cut -c11-21,55-60 | paste -d. - - | sed -e 's/   /=/'
    echo -n "assign "
    echo -e  "set dba $dba \np ktbbh" |rlbbed | /bin/grep -B 4 -A 9 "ub4 kxidsqn \+@.*0x00000751$" | egrep "ktbbhitl"| cut -c11-21 | sed -e 's/$/._ktbitun._ktbitfsc=0/'
        echo "assign kdbhtosp=kdbhavsp"
    echo sum apply dba $dba
done >| notclearout_bbed.txt

--//执行以上脚本:
$ . scanb.sh
$ grep  "ktbbhitl[1].ktbitflg=" clearout_bbed.txt | grep -v 0x0
$ echo $?
1
--//说明这些块都没有打上提交标识.

$ head -5 clearout_bbed.txt
set dba 4204236
assign ktbbhitl[1].ktbitflg=0x0002
assign ktbbhitl[1]._ktbitun._ktbitfsc=0
assign kdbhtosp=kdbhavsp
sum apply dba 4204236

--//使用vim执行:%s/ktbitflg=0x0/ktbitflg=0x2/.也就是设置提交标识.注意检查替换是否74行.
--//补充说明一下,我前面测试执行的是%s/=0x00/=0x20/,实际上提交标识仅仅占半个字节(4位),这样写有点问题,不过一般不会出现问题.
--//1块修改255条以上的情况在这里不会出现(至少对于这个表是如此).

$ head -5 clearout_bbed.txt
set dba 4204236
assign ktbbhitl[1].ktbitflg=0x2002
assign ktbbhitl[1]._ktbitun._ktbitfsc=0
assign kdbhtosp=kdbhavsp
sum apply dba 4204236

--//执行如下:
bbed parfile=/home/oracle/bbed/bbed.par cmdfile=/home/oracle/hrp430/bbed/clearout_bbed.txt
bbed parfile=/home/oracle/bbed/bbed.par cmdfile=/home/oracle/hrp430/bbed/notclearout_bbed.txt

$ dbv  file=/mnt/ramdisk/book/system01.dbf
DBVERIFY: Release 11.2.0.4.0 - Production on Tue Feb 12 10:37:09 2019
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /mnt/ramdisk/book/system01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 97280
Total Pages Processed (Data) : 64316
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 13442
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 4185
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 15337
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 393502590 (3.393502590)

--//现在OK了.
4.禁用sys.tab$的索引I_TAB1.

--//这样恢复,索引与表存在不一致情况,要禁用sys.tab$的索引I_TAB1.

BBED> x /rnnc dba 1,523 *kdbr[9]
rowdata[1269]                               @4910
-------------
flag@4910: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@4911: 0x01
cols@4912:    3
col    0[2] @4913: 33
col    1[2] @4916: 33
col  2[189] @4919: CREATE INDEX I_TAB1 ON TAB$(BOBJ#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483
645 PCTINCREASE 0 OBJNO 33 EXTENTS (FILE 1 BLOCK 312))

--//设置flag=3c,表示删除.
BBED> assign /x  dba 1,523 offset 4910= 0x3c
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
ub1 rowdata[0]                              @4910     0x3c

BBED> sum apply dba 1,523
Check value for File 1, Block 523:
current = 0x7e6b, required = 0x7e6b

BBED> verify dba 1,523
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 523

Block Checking: DBA = 4194827, Block Type = KTB-managed data block
data header at 0x6eee44
kdbchk: the amount of space used is not equal to block size
        used=4398 fsc=0 avsp=3525 dtl=8120
Block 523 failed with check code 6110
--//先不理会这个错误.

5.启动数据库看看:

SYS@book> startup pfile='/tmp/@.ora';
ORACLE instance started.
Total System Global Area  643084288 bytes
Fixed Size                  2255872 bytes
Variable Size             205521920 bytes
Database Buffers          427819008 bytes
Redo Buffers                7487488 bytes
Database mounted.
Database opened.

SYS@book> shutdown  immediate ;
Database closed.
Database dismounted.
ORACLE instance shut down.

SYS@book> startup open read only pfile='/tmp/book.ora'
ORACLE instance started.
Total System Global Area  643084288 bytes
Fixed Size                  2255872 bytes
Variable Size             205521920 bytes
Database Buffers          427819008 bytes
Redo Buffers                7487488 bytes
Database mounted.
Database opened.

--//注/tmp/book.ora最好加入修改如下:
*._system_trig_enabled=false
*.job_queue_processes=0

SYS@book> select * from sys.tab$ minus select * from orachk001;
no rows selected

SYS@book> select * from orachk001 minus select * from sys.tab$;
no rows selected

--//OK,几乎完美恢复.我执行如下select * from sh.sales;,顺利读出,没有任何问题.
--//现在建立删除表没有任何问题.

SCOTT@book> create table t as select * from all_objects;
Table created.

SCOTT@book> drop table t purge ;
Table dropped.

6.剩下恢复tab$.索引I_TAB1,写得有点长.另写一篇修复索引的帖子,感觉这步很难,也许要先放一放.
--//最后说明一点,我的是测试环境,也许真实的环境更加复杂.我可能还遗漏一些细节...^_^.

扫码关注我们
微信号:SRE实战
拒绝背锅 运筹帷幄

SRE实战 互联网时代守护先锋,助力企业售后服务体系运筹帷幄!一键直达领取阿里云限量特价优惠。