本帖最后由 郑全 于 2022-8-3 14:31 编辑
1.问题描述:
用户系统管理员电话报OA系统从7月31后,备份没有成功,后台日志报块错误,业务没有报什么影响。
2.检查问题:
通过报警日志检查:
。。。
Mon Aug 01 20:05:03 2022
Hex dump of (file 11, block 612108) in trace file /home/oracle/app/diag/rdbms/xyoa/xyoa/trace/xyoa_ora_21215.trc
Corrupt block relative dba: 0x02c9570c (file 11, block 612108)
Completely zero block found during backing up datafile
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Mon Aug 01 20:05:45 2022
Checker run found 1 new persistent data failures
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x7F0AC57328A8] [PC:0x7F0ACE564B23, _Z12ndmpdPropSetPvS_()+995] [flags: 0x0, count: 1]
Errors in file /home/oracle/app/diag/rdbms/xyoa/xyoa/trace/xyoa_ora_21215.trc (incident=76366):
ORA-07445: exception encountered: core dump [_Z12ndmpdPropSetPvS_()+995] [SIGSEGV] [ADDR:0x7F0AC57328A8] [PC:0x7F0ACE564B23] [Address not mapped to object] []
Incident details in: /home/oracle/app/diag/rdbms/xyoa/xyoa/incident/incdir_76366/xyoa_ora_21215_i76366.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Aug 01 20:06:03 2022
Dumping diagnostic data in directory=[cdmp_20220801200603], requested by (instance=1, osid=21215), summary=[incident=76366].
。。。
Mon Aug 01 22:13:25 2022
Sweep [inc2][76230]: completed
Hex dump of (file 11, block 612108) in trace file /home/oracle/app/diag/rdbms/xyoa/xyoa/incident/incdir_76230/xyoa_m000_480_i76230_a.trc
Corrupt block relative dba: 0x02c9570c (file 11, block 612108)
Completely zero block found during validation
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612108, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Mon Aug 01 22:13:54 2022
Hex dump of (file 11, block 612109) in trace file /home/oracle/app/diag/rdbms/xyoa/xyoa/incident/incdir_76230/xyoa_m000_480_i76230_a.trc
Corrupt block relative dba: 0x02c9570d (file 11, block 612109)
Fractured block found during validation
Data in bad block:
type: 0 format: 0 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x0e6f0601
check value in block header: 0x0
block checksum disabled
Reread of blocknum=612109, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612109, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612109, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612109, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Reread of blocknum=612109, file=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora. found same corrupt data
Checker run found 1 new persistent data failures
。。。
可以看到,从8月1日晚上20点备份开始,就报块错误,具体错误号为:ora-01578
3.确认损坏范围
为了更进一步明确损害得范围,我使用validate database 命令检查,发现一共有三个块错误:612107,612108,612109
RMAN> validate database ;
Starting validate at 03-AUG-22
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1809 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00005 name=/home/oracle/app/oradata/xyoa/V3XSPACE1.ora
input datafile file number=00006 name=/home/oracle/app/oradata/xyoa/V3XSPACE2.ora
input datafile file number=00007 name=/home/oracle/app/oradata/xyoa/V3XSPACE3.ora
input datafile file number=00010 name=/home/oracle/app/oradata/xyoa/V3XSPACE4.ora
input datafile file number=00011 name=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora
input datafile file number=00012 name=/home/oracle/app/oradata/xyoa/V3XSPACE6.ora
input datafile file number=00002 name=/home/oracle/app/oradata/xyoa/sysaux01.dbf
input datafile file number=00009 name=/home/oracle/app/oradata/xyoa/undotbs02.dbf
input datafile file number=00013 name=/home/oracle/app/oradata/xyoa/V3XSPACE7.ora
input datafile file number=00003 name=/home/oracle/app/oradata/xyoa/undotbs01.dbf
input datafile file number=00001 name=/home/oracle/app/oradata/xyoa/system01.dbf
input datafile file number=00008 name=/home/oracle/app/oradata/xyoa/ema.ora
input datafile file number=00004 name=/home/oracle/app/oradata/xyoa/users01.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:56:06
...
SQL> select * from V$DATABASE_BLOCK_CORRUPTION;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
11 612108 1 0 ALL ZERO
11 612109 1 0 FRACTURED
11 612107 1 0 FRACTURED
4.问题分析确定是数据损坏,还是索引损坏
SQL> select OWNER,SEGMENT_NAME,SEGMENT_TYPE from dba_extents where file_id=11 and 612107 between block_id and block_id+blocks-1;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------------------ ------------------------------------------ -----------------
V3X IDX_AFFAIR_M_S_A_T INDEX
SQL> select OWNER,SEGMENT_NAME,SEGMENT_TYPE from dba_extents where file_id=11 and 612108 between block_id and block_id+blocks-1;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------------------ ------------------------------------------ -----------------
V3X IDX_AFFAIR_M_S_A_T INDEX
SQL> select OWNER,SEGMENT_NAME,SEGMENT_TYPE from dba_extents where file_id=11 and 612109 between block_id and block_id+blocks-1;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------------------ ------------------------------------------ -----------------
V3X IDX_AFFAIR_M_S_A_T INDEX
看来是同一个索引。
由于损坏的块都是索引,可以通过删除索引,然后再重建索引来解决,但考虑到索引太大,重建索引过程中,无法使用这个索引,对业务有影响,只有在没有办法的情况下,才会使用这一招。
SQL> r
1* select index_owner,INDEX_NAME,TABLE_NAME,COLUMN_NAME,COLUMN_POSITION from dba_ind_columns where index_name='IDX_AFFAIR_M_S_A_T'
INDEX_OWNER INDEX_NAME TABLE_NAME COLUMN_NAME COLUMN_POSITION
------------------------------ ------------------------------ ------------------------------ ------------------------------ ---------------
V3X IDX_AFFAIR_M_S_A_T CTP_AFFAIR MEMBER_ID 1
V3X IDX_AFFAIR_M_S_A_T CTP_AFFAIR STATE 2
V3X IDX_AFFAIR_M_S_A_T CTP_AFFAIR APP 3
V3X IDX_AFFAIR_M_S_A_T CTP_AFFAIR TRACK 4
V3X IDX_AFFAIR_M_S_A_T CTP_AFFAIR IS_FINISH 5
SQL> select index_name,owner,uniqueness,tablespace_name from dba_indexes where index_name='IDX_AFFAIR_M_S_A_T';
INDEX_NAME OWNER UNIQUENES TABLESPACE_NAME
------------------------------ ------------------------------ --------- ------------------------------
IDX_AFFAIR_M_S_A_T V3X NONUNIQUE V3XSPACE
SQL> select segment_name,bytes/1024/1024 from dba_segments where segment_name='IDX_AFFAIR_M_S_A_T';
SEGMENT_NAME BYTES/1024/1024
--------------------------------------------------------------------------------- ---------------
IDX_AFFAIR_M_S_A_T 768
SQL> select count(*) from V3X.CTP_AFFAIR;
COUNT(*)
----------
15954867
SQL> select segment_name,bytes/1024/1024 from dba_segments where segment_name='CTP_AFFAIR';
SEGMENT_NAME BYTES/1024/1024
--------------------------------------------------------------------------------- ---------------
CTP_AFFAIR 7242
SQL> select segment_name,bytes/1024/1024 from dba_segments where segment_name like '%AFFAIR%' order by 2;
SEGMENT_NAME BYTES/1024/1024
--------------------------------------------------------------------------------- ---------------
IDX_AFFAIR_TT5 480
IDX_CTP_AFFAIR_PLIST 507
PK_CTP_AFFAIR 535.125
IDX_AFFAIR_F_S_A_M 763
IDX_AFFAIR_M_S_A_T 768
IDX_AFFAIR_O_S_A_T 925
IDX_AFFAIR_M_S_A_S 1197
CTP_AFFAIR 7242
19 rows selected.
5.解决措施
考虑到有备份,又不想影响目前业务的使用。决定使用备份来在线解决坏块。
5.1 使用数据恢复助手
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
215941 HIGH OPEN 01-AUG-22 Datafile 11: '/home/oracle/app/oradata/xyoa/V3XSPACE5.ora' contains one or more corrupt blocks
--advise failure;
RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
215941 HIGH OPEN 01-AUG-22 Datafile 11: '/home/oracle/app/oradata/xyoa/V3XSPACE5.ora' contains one or more corrupt blocks
analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1300 device type=DISK
analyzing automatic repair options complete
Mandatory Manual Actions
========================
1. No backup of block 612108 in file 11 was found. Drop and re-create the associated object (if possible), or use the DBMS_REPAIR package to repair the block corruption
2. No backup of block 612109 in file 11 was found. Drop and re-create the associated object (if possible), or use the DBMS_REPAIR package to repair the block corruption
3. No backup of block 612107 in file 11 was found. Drop and re-create the associated object (if possible), or use the DBMS_REPAIR package to repair the block corruption
4. Contact Oracle Support Services if the preceding recommendations cannot be used, or if they do not fix the failures selected for repair
Optional Manual Actions
=======================
no manual actions available
Automated Repair Options
========================
no automatic repair options available
备份在磁带中,需要设置通道到磁带。
5.2 直接使用RMAN恢复
run {
allocate channel ch0 type 'sbt_tape';
send 'NBBSA_SOURCE_MACHINE_NAME=oadata.shineray';
recover datafile 11 block 612107,612107,612109;
release channel ch0;
}
using target database control file instead of recovery catalog
allocated channel: ch0
channel ch0: SID=1809 device type=SBT_TAPE
channel ch0: Symantec/BackupExec/1.1.0
sent command to channel: ch0
Starting recover at 03-AUG-22
channel ch0: restoring block(s)
channel ch0: specifying block(s) to restore from backup set
restoring blocks of datafile 00011
channel ch0: reading from backup piece BE_gp13ua0b_1_1
channel ch0: piece handle=BE_gp13ua0b_1_1 tag=TAG20220730T200114
channel ch0: restored block(s) from backup piece 1
channel ch0: block restore complete, elapsed time: 00:21:15
starting media recovery
media recovery complete, elapsed time: 00:00:15
Finished recover at 03-AUG-22
released channel: ch0
花了1小时。这个过程不影响应用的使用。
6.验证块恢复
RMAN> validate database ;
Starting validate at 03-AUG-22
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1809 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00005 name=/home/oracle/app/oradata/xyoa/V3XSPACE1.ora
input datafile file number=00006 name=/home/oracle/app/oradata/xyoa/V3XSPACE2.ora
input datafile file number=00007 name=/home/oracle/app/oradata/xyoa/V3XSPACE3.ora
input datafile file number=00010 name=/home/oracle/app/oradata/xyoa/V3XSPACE4.ora
input datafile file number=00011 name=/home/oracle/app/oradata/xyoa/V3XSPACE5.ora
input datafile file number=00012 name=/home/oracle/app/oradata/xyoa/V3XSPACE6.ora
input datafile file number=00002 name=/home/oracle/app/oradata/xyoa/sysaux01.dbf
input datafile file number=00009 name=/home/oracle/app/oradata/xyoa/undotbs02.dbf
input datafile file number=00013 name=/home/oracle/app/oradata/xyoa/V3XSPACE7.ora
input datafile file number=00003 name=/home/oracle/app/oradata/xyoa/undotbs01.dbf
input datafile file number=00001 name=/home/oracle/app/oradata/xyoa/system01.dbf
input datafile file number=00008 name=/home/oracle/app/oradata/xyoa/ema.ora
input datafile file number=00004 name=/home/oracle/app/oradata/xyoa/users01.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:56:06
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1 OK 0 21225 247042 883939825
File Name: /home/oracle/app/oradata/xyoa/system01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 204866
Index 0 17517
Other 0 3432
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2 OK 0 133881 2149137 883930663
File Name: /home/oracle/app/oradata/xyoa/sysaux01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 1181879
Index 0 789620
Other 0 43740
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
3 OK 0 1 336000 883936909
File Name: /home/oracle/app/oradata/xyoa/undotbs01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 0
Index 0 0
Other 0 335999
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
4 OK 0 33 641 921351
File Name: /home/oracle/app/oradata/xyoa/users01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 15
Index 0 2
Other 0 590
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
5 OK 0 7 4194302 883942123
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE1.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 1471004
Index 0 565042
Other 0 2158249
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
6 OK 0 7 4194302 883942069
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE2.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 1502611
Index 0 573335
Other 0 2118349
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7 OK 0 7 4194302 883942063
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE3.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 997466
Index 0 937330
Other 0 2259499
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
8 OK 0 21047 131072 883939623
File Name: /home/oracle/app/oradata/xyoa/ema.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 107506
Index 0 1532
Other 0 987
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
9 OK 0 1 1592320 883925382
File Name: /home/oracle/app/oradata/xyoa/undotbs02.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 0
Index 0 0
Other 0 1592319
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
10 OK 0 31 4194302 883943619
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE4.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 1275372
Index 0 1107654
Other 0 1811245
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
11 OK 0 52 4194302 883941991
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE5.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 910028
Index 0 643602
Other 0 2640620
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
12 OK 0 250726 3970560 883941865
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE6.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 874922
Index 0 429536
Other 0 2415376
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
13 OK 0 1164442 1310720 883934930
File Name: /home/oracle/app/oradata/xyoa/V3XSPACE7.ora
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 20365
Index 0 11326
Other 0 114587
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
including current control file for validation
including current SPFILE in backup set
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Control File and SPFILE
===============================
File Type Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE OK 0 2
Control File OK 0 1308
Finished validate at 03-AUG-22
RMAN>
这一步检查,花了一个小时。
可以看到,已经没有坏块了。
到此,问题解决。
附: 块介质恢复:
|