📢📢📢📣📣📣
哈喽!大家好,我是【IT邦德】,江湖人称jeames007,10余年DBA及大数据工作经验
一位上进心十足的【大数据领域博主】!😜😜😜
中国DBA联盟(ACDU)成员,目前服务于工业互联网
擅长主流Oracle、MySQL、PG、高斯及Greenplum运维开发,备份恢复,安装迁移,性能优化、故障应急处理等。
✨ 如果有对【数据库】感兴趣的【小可爱】,欢迎关注【IT邦德】💞💞💞
❤️❤️❤️感谢各位大可爱小可爱!❤️❤️❤️
文章目录
- 📣 1.故障现象
- 📣 2.故障处理
- ✨ 2.1 清理在线日志
- ✨ 2.2 修改隐藏参数
- ✨ 2.3 控制文件备份
- ✨ 2.4 resetlogs开库
- ✨ 2.5 重建UNDO
- 📣 3.恢复后确认
- 📣 4.总结
有些时侯可能你的库处于非归档的模式下,而你的联机重做日志又currupted,你的数据文件不能完成完全的恢复,这里小编为大家介绍一个oracle的一个隐藏参数_allow_resetlogs_corruption,让数据库重生.
📣 1.故障现象
磁盘爆满,业务人员误删除在线日志,导致无法开库,并且没有归档
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: ‘/u01/app/oracle/oradata/ORCL/redo01.log’
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
📣 2.故障处理
1、allow_resetlogs_corruptions参数设置启动到mount状态
2、alter database backup controlfile to trace生成控制文件备份(后续可能要使用)
3、recover database until cancel(选择cancel不需要恢复)
4、alter database open resetlogs
5、如果上述报出ora-00600 2662、2663错误,需要推进scn
6、启动数据库mount状态,推进scn
7、alter database open,如果提示依然需要恢复,
8、重复3操作
9、再进行scn推进
10、推进成功后最后alter database open resetlogs启动数据库
✨ 2.1 清理在线日志
SQL> set linesize 300
SQL> select group#,sequence#,archived,status from v$log;
GROUP# SEQUENCE# ARC STATUS
---------- ---------- --- ----------------
1 7 NO CURRENT
3 6 NO INACTIVE
2 5 NO INACTIVE
-非归档模式,清理非当前日志
alter database clear unarchived logfile group 2;
alter database clear unarchived logfile group 3;
SQL> select group#,sequence#,archived,status from v$log;
GROUP# SEQUENCE# ARC STATUS
---------- ---------- --- ----------------
1 7 NO CURRENT
3 0 NO UNUSED
2 0 NO UNUSED
✨ 2.2 修改隐藏参数
alter system set “_allow_resetlogs_corruption”=true scope=spfile;
之后重启数据库到mount状态
SQL> show parameter _allow_resetlogs_corruption
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
_allow_resetlogs_corruption boolean TRUE
✨ 2.3 控制文件备份
指定trace文件的生成路径
SQL> alter database backup controlfile
to trace as ‘/tmp/controlfile.trc’;
✨ 2.4 resetlogs开库
RMAN> recover database until cancel;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00558: error encountered while parsing input commands
RMAN-01009: syntax error: found "cancel": expecting one of: "available, scn, sequence, time"
RMAN-01007: at line 1 column 24 file: standard input
发现以上不允许使用cancel,那么基于时间点恢复
recover database until time "to_date('2024-03-07 11:00:00','YYYY-MM-DD HH24:MI:SS')";
Starting recover at 08-MAR-24
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=38 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/08/2024 00:03:59
RMAN-06555: datafile 1 must be restored from backup created before 07-MAR-24
继续恢复
RMAN> recover database;
Starting recover at 08-MAR-24
using channel ORA_DISK_1
starting media recovery
RMAN-08187: warning: media recovery until SCN 2187454 complete
Finished recover at 08-MAR-24
RMAN> alter database open resetlogs;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03004: fatal error during execution of command
ORA-01092: ORACLE instance terminated. Disconnection forced
RMAN-03002: failure of sql statement command at 03/08/2024 00:11:38
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], []
Process ID: 8041
Session ID: 35 Serial number: 38091
如果上述报出ora-00600
1.设置隐含参数_allow_error_simulation = TRUE
alter system set "_allow_error_simulation"=true scope=spfile;
SQL> startup force
ORACLE instance started.
Total System Global Area 1191181696 bytes
Fixed Size 8895872 bytes
Variable Size 771751936 bytes
Database Buffers 402653184 bytes
Redo Buffers 7880704 bytes
Database mounted.
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4193], [176], [183], [], [], [],
[], [], [], [], [], []
Process ID: 9078
Session ID: 33 Serial number: 29676
2.undo 4193、4194回滚段错误,
可以先undo手动管理undo_management,启动数据库open;
此时需要推进scn
alter system set undo_management = 'MANUAL' scope=spfile;
SQL> alter session set events '10015 trace name adjust_scn level 10';
3.重新启库
SQL> startup force;
ORACLE instance started.
Total System Global Area 1191181696 bytes
Fixed Size 8895872 bytes
Variable Size 771751936 bytes
Database Buffers 402653184 bytes
Redo Buffers 7880704 bytes
Database mounted.
Database opened.
✨ 2.5 重建UNDO
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
temp_undo_enabled boolean FALSE
undo_management string MANUAL
undo_retention integer 900
undo_tablespace string UNDOTBS1
–删除原有UNDO
drop tablespace UNDOTBS1 including contents and datafiles;
–重新创建UNDO
create undo tablespace UNDOTBS1 datafile ‘/u01/app/oracle/oradata/ORCL/undotbs01.dbf’ size 1G;
–恢复参数文件重启库
alter system set undo_management=‘auto’ scope=spfile;
alter system reset “_allow_resetlogs_corruption” scope=spfile;
alter system reset “_allow_error_simulation” scope=spfile;
SQL> startup force;
📣 3.恢复后确认
SQL> set linesize 300
SQL> select group#,sequence#,archived,status from v$log;
GROUP# SEQUENCE# ARC STATUS
---------- ---------- --- ----------------
1 4 NO CURRENT
2 2 NO INACTIVE
3 3 NO INACTIVE
📣 4.总结
需要使用open resetlogs来强制打开数据库的恢复:
1、对于open resetlogs时数据文件中有不一致数据的情况,可以设置
_allow_resetlogs_corruption=TRUE
2、如果出现ORA-01555错误,导致数据库无法open,可以设置
_CORRUPTED_ROLLBACK_SEGMENTS
_OFFLINE_ROLLBACK_SEGMENTS
undo_management = ‘MANUAL’
3、出现ORA-600[2662]错误时,先通过多次重启open的方法来观察Current SCN BASE增长速度。
如果Current SCN BASE和Current SCN BASE相差不远,重启几次数据库就可以打开。
4、如果Current SCN BASE和Current SCN BASE相差很远需要推SCN
alter session set events ‘10015 trace name adjust_scn level 10’;并且设置隐含参数_allow_error_simulation = TRUE才能使10015事件生效
alter system set “_minimum_giga_scn”=n; SCN向前推进到n10241024*1024,只有Current SCN和dependent SCN相差nG时这个参数才起效,否则无效
alter system set _smu_debug_mode= 268435456,增长SCN WRAP,该参数需要和_allow_error_simulation=true同时使用
oradebug poke修改scn
gdb修改scn
bbed修改数据文件对应的数据块scn
5、如果SCN号一致以后报错ORA-600[6006],ORA-600[4137]的话,需要添加参数
event=“10513 trace name context forever,level 2”
db_block_checking=false
6、出现ORA-600[4193][4194]错误时,尝试设置undo手动管理
7、对于open resetlogs打开以后的数据库,最好将业务用户导出以后重建数据库,以防止数据库出现不可预知的错误。ORACLE官方建议是open resetlogs以后需要重建数据库