从你提供的 Data Guard 配置信息来看,核心问题是物理备库 ycjj2dg 的重做应用(Redo Apply)被停止,报出ORA-16766: Redo Apply is stopped错误,同时传输延迟、应用延迟等指标均为(unknown),配置状态和数据库状态均为 ERROR。以下是逐步排查和解决的方案:
错误分析
该错误表示备库的介质恢复(Redo Apply)进程未运行,可能的原因包括:
备库的恢复进程(MRP)被手动停止;
备库无法读取归档日志 / 重做日志(文件缺失、权限不足、路径错误);
主备库之间的重做传输中断,备库无新日志可应用;
备库处于只读打开状态但未启动实时应用,或恢复进程异常终止;
主备库的数据文件不一致(如备库数据文件损坏、主库执行了备库未同步的操作如NOLOGGING)。
解决问题
步骤 1:检查备库的归档日志和重做传输状态
1. 登录备库的 SQL*Plus(以 sysdba 身份)
sqlplus / as sysdba
2. 检查主备库的重做传输状态(备库执行)
-- 查看归档日志的接收情况
SELECT SEQUENCE#, APPLIED, STATUS FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;
-- 查看重做传输的状态(主库执行更准确,备库可看V$LOGSTDBY_PROCESS)
-- 备库查看是否有MRP进程
SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS
FROM V$MANAGED_STANDBY;
如果V$MANAGED_STANDBY中没有MRP0或MRP进程,说明恢复进程未启动;
如果有MRP0进程但状态为ERROR,说明恢复过程中出现异常。
3. 检查主库的重做日志发送状态(主库执行)
-- 查看日志传输状态
SELECT DESTINATION, STATUS, ARCHIVED_THREAD#, ARCHIVED_SEQ#
FROM V$ARCHIVE_DEST_STATUS WHERE STATUS != 'INACTIVE';
-- 检查是否有传输错误
SELECT MESSAGE FROM V$ARCHIVE_DEST_STATUS WHERE ERROR != '';
步骤 2:重启备库的 Redo Apply(核心解决步骤)
如果恢复进程未启动,手动启动 Redo Apply(备库执行),分两种场景:
场景 1:非实时应用(传统归档应用)
-- 启动介质恢复(基于归档日志)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
场景 2:实时应用(Active Data Guard,备库只读打开时)
-- 先确保备库处于只读模式(如果之前是关闭的,先启动到mount再打开只读)
ALTER DATABASE MOUNT;
ALTER DATABASE OPEN READ ONLY;
-- 启动实时应用(实时应用主库的在线重做日志)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
若启动时出现错误(如日志缺失)
检查缺失的日志序列:
-- 备库查看缺失的日志序列
SELECT SEQUENCE#, STATUS FROM V$ARCHIVED_LOG WHERE STATUS = 'DELETED' OR APPLIED = 'NO';
从主库手动传输缺失的归档日志到备库,并注册:
-- 备库注册手动传输的归档日志
ALTER DATABASE REGISTER PHYSICAL LOGFILE '/u01/app/oracle/oradata/archivelog/xxx.dbf';
重新执行恢复命令。
步骤 3:排查深层问题(若重启后仍报错)
如果启动 Redo Apply 后仍然报ORA-16766,需排查以下问题:
Protection Mode: MaxPerformance
Databases:
ycjj - Primary database
ycjj2dg - Physical standby database
Warning: ORA-16826: apply service state is inconsistent with the DelayMins property
Fast-Start Failover: DISABLED
Configuration Status:
WARNING
DGMGRL>
DGMGRL> show database ycjj2dg;
Database - ycjj2dg
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 1 minute 33 seconds (computed 12 seconds ago)
Apply Lag: 1 day(s) 20 hours 20 minutes 7 seconds (computed 9 seconds ago)
Apply Rate: 9.49 MByte/s
Real Time Query: OFF
Instance(s):
ycjj
Database Warning(s):
ORA-16826: apply service state is inconsistent with the DelayMins property
Database Status:
WARNING
DGMGRL>
现在的问题从之前的ORA-16766(Redo Apply 停止)变为了ORA-16826: apply service state is inconsistent with the DelayMins property(应用服务状态与DelayMins属性不一致),同时备库出现 ** 传输延迟较小(1 分 33 秒)但应用延迟极大(1 天 20 小时)** 的情况。以下是问题分析和解决步骤:
解决问题
步骤 1:明确DelayMins配置和当前 Redo Apply 的运行模式
1. 查看 Data Guard 的DelayMins属性(DGMGRL 执行)
DGMGRL> show database ycjj2dg DelayMins;
从之前的输出可知DelayMins = '0'(无延迟),这是默认的正常配置。
2. 检查备库当前 Redo Apply 的运行状态(备库 SQL*Plus 执行,sysdba 身份)
-- 查看备库的恢复进程状态(关键看是否有DELAY配置)
SELECT PROCESS, STATUS, CLIENT_PROCESS, COMMAND
FROM V$MANAGED_STANDBY
WHERE PROCESS IN ('MRP0', 'MRP');
-- 查看备库的恢复配置(是否启用延迟应用)
SELECT NAME, VALUE FROM V$PARAMETER WHERE NAME LIKE '%delay%';
-- 查看归档日志的应用情况(重点看最新序列的APPLIED状态)
SELECT SEQUENCE#, APPLIED, FIRST_TIME, NEXT_TIME
FROM V$ARCHIVED_LOG
ORDER BY SEQUENCE# DESC FETCH FIRST 20 ROWS ONLY;
如果V$MANAGED_STANDBY中MRP0进程的COMMAND列显示RECOVER但带有DELAY,说明实际运行了延迟应用;
如果归档日志的最新序列APPLIED为NO,且序列号远小于已接收的序列,说明应用被阻塞或延迟。
步骤 2:解决配置冲突并重启无延迟的 Redo Apply(核心步骤)
1. 先停止当前的 Redo Apply 进程(备库执行)
-- 停止所有正在运行的介质恢复进程
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
2. 启动无延迟的 Redo Apply(匹配DelayMins=0的配置)
根据备库的运行模式选择对应命令:
场景 1:备库处于 MOUNT 状态(非 Active Data Guard
-- 启动无延迟的介质恢复,直接应用已接收的归档日志(无延迟)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
DISCONNECT FROM SESSION; -- 无DELAY参数,对应DelayMins=0
场景 2:备库处于 READ ONLY 状态(Active Data Guard,实时应用)
-- 启动实时应用(使用主库当前的在线日志,无延迟)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
USING CURRENT LOGFILE
DISCONNECT FROM SESSION; -- 实时应用,彻底消除延迟
关键说明:
上述命令未指定DELAY参数,与 Data Guard 的DelayMins=0属性完全一致,会直接消除ORA-16826警告;
如果之前启动 Redo Apply 时误加了DELAY [n] MINUTES参数(如ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DELAY 60 DISCONNECT;),会导致实际运行状态与DelayMins=0冲突,这是引发ORA-16826的最常见原因。
步骤 3:解决应用延迟极大的问题(传输正常但应用慢)
如果重启 Redo Apply 后,应用延迟仍很大(如 1 天 20 小时),需排查以下原因:
1. 检查备库的 Redo Apply 并行度(提升应用速度)
-- 查看当前并行度配置
SHOW PARAMETER APPLY_PARALLEL; -- Data Guard属性:ApplyParallel = 'AUTO'
SHOW PARAMETER RECOVERY_PARALLELISM;
-- 手动提升并行度(备库执行,停止恢复后设置)
ALTER SYSTEM SET RECOVERY_PARALLELISM = 4 SCOPE=BOTH; -- 根据CPU核心数调整,如4/8/16
-- 重启Redo Apply时指定并行度(可选)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
PARALLEL 4
DISCONNECT FROM SESSION;
2. 检查备库是否存在日志应用阻塞
-- 查看备库的等待事件(是否有IO/锁阻塞)
SELECT EVENT, COUNT(*) FROM V$SESSION
WHERE STATUS = 'WAITING' AND PROGRAM LIKE '%ora_mrp%'
GROUP BY EVENT;
-- 主库执行:检查是否有NOLOGGING操作
SELECT TABLE_NAME, LOGGING FROM DBA_TABLES WHERE LOGGING = 'NO';
SELECT INDEX_NAME, LOGGING FROM DBA_INDEXES WHERE LOGGING = 'NO';
-- 查看主库的日志历史中是否有NOLOGGING记录
SELECT TIMESTAMP, OPERATION, LOGGING FROM V$NOLOGGING_OPERATION;
如果存在NOLOGGING操作,需通过RMAN 增量备份同步备库数据文件,再重启 Redo Apply。
步骤 4:验证问题是否解决
1. 回到 DGMGRL 检查配置状态
DGMGRL> show configuration;
DGMGRL> show database ycjj2dg;
正常情况下,ORA-16826警告会消失,配置状态变为SUCCESS;
传输延迟和应用延迟会逐渐缩小(实时应用时应用延迟接近 0)。
2. 备库 SQL*Plus 验证应用状态
-- 查看MRP进程状态(应为APPLYING_LOG)
SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
-- 查看最新日志的应用情况
SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE# DESC FETCH FIRST 1 ROW ONLY;
总结
当前ORA-16826的本质是Redo Apply 的实际运行状态(可能带延迟)与 Data Guard 的DelayMins=0配置冲突,同时备库应用延迟极大是因为日志应用未以无延迟 / 实时模式运行。解决步骤的核心是:停止现有恢复进程,启动无延迟的 Redo Apply(或实时应用),并根据情况提升并行度、排查阻塞因素。如果是生产环境,建议启用 Active Data Guard 的实时应用(USING CURRENT LOGFILE),彻底避免延迟问题。
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Dec 21 22:32:14 2025
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/ycjj2dg/ycjj/trace/ycjj_pr00_100415.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery interrupted!
Recovered data files to a consistent state at change 11191119625
Sun Dec 21 22:32:17 2025
MRP0: Background Media Recovery process shutdown (ycjj)
Sun Dec 21 22:32:17 2025
Managed Standby Recovery Canceled (ycjj)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT PARALLEL 8 USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (ycjj)
Sun Dec 21 22:32:17 2025
MRP0 started with pid=30, OS id=101257
MRP0: Background Managed Standby Recovery process started (ycjj)
started logmerger process
Sun Dec 21 22:32:22 2025
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 8 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /u01/app/oracle/oradata/archivelog1_43228_1196150430.dbf
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT PARALLEL 8 USING CURRENT LOGFILE
Sun Dec 21 22:32:27 2025
RFS[4]: Selected log 5 for thread 1 sequence 43958 dbid 523858460 branch 1196150430
Media Recovery Log /u01/app/oracle/oradata/archivelog1_43229_1196150430.dbf
Sun Dec 21 22:32:37 2025
Archived Log entry 778 added for thread 1 sequence 43958 ID 0x2095edec dest 1:
Sun Dec 21 22:33:01 2025
RFS[3]: Selected log 5 for thread 1 sequence 43959 dbid 523858460 branch 1196150430
Sun Dec 21 22:33:01 2025
Archived Log entry 779 added for thread 1 sequence 43957 ID 0x2095edec dest 1:
Sun Dec 21 22:33:17 2025
Media Recovery Log /u01/app/oracle/oradata/archivelog1_43230_1196150430.dbf
Sun Dec 21 22:33:28 2025
Media Recovery Log /u01/app/oracle/oradata/archivelog1_43231_1196150430.dbf
2. 启动无延迟的实时应用(匹配 DelayMins=0,消除 ORA-16826)
根据备库的运行模式选择命令,推荐使用 Active Data Guard 的实时应用模式(备库只读打开,直接应用主库在线日志):
场景 1:备库当前处于 MOUNT 状态
-- 先打开备库为只读模式(Active Data Guard)
ALTER DATABASE OPEN READ ONLY;
DGMGRL> show database ycjj2dg
Database - ycjj2dg
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 4 minutes 15 seconds (computed 0 seconds ago)
Apply Lag: 1 day(s) 14 hours 33 minutes 53 seconds (computed 0 seconds ago)
Apply Rate: 3.53 MByte/s
Real Time Query: OFF
Instance(s):
ycjj
Database Warning(s):
ORA-16770: Redo Apply not started since physical standby database is opening
-- 启动实时应用(无延迟,直接应用主库当前在线日志)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
USING CURRENT LOGFILE -- 实时应用主库在线日志,消除延迟
PARALLEL 8 -- 高并行加速应用
DISCONNECT FROM SESSION; -- 后台运行,不占用当前会话
场景 2:备库已处于 READ ONLY 状态
直接执行上述RECOVER MANAGED STANDBY DATABASE命令即可。
场景 3:无法开启 Active Data Guard(仅 MOUNT 模式)
-- 启动无延迟的归档日志应用(高并行)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
PARALLEL 8
DISCONNECT FROM SESSION;
步骤 3:排查应用延迟极大的根因(若重启后仍有大延迟)
如果启动后应用延迟仍高达 1 天 20 小时,说明日志应用被阻塞,需排查以下关键点:
1. 检查备库是否缺失归档日志(最常见原因)
-- 备库查询未应用的日志,重点看是否有GAP(日志序列不连续)
SELECT * FROM V$ARCHIVE_GAP;
-- 查看所有归档日志的接收和应用状态
SELECT SEQUENCE#, APPLIED, STATUS, FIRST_TIME
FROM V$ARCHIVED_LOG
ORDER BY SEQUENCE# DESC FETCH FIRST 50 ROWS ONLY;
如果V$ARCHIVE_GAP有记录,说明存在日志间隙,需从主库拷贝缺失的日志到备库并注册:
-- 主库查找缺失的日志文件(假设GAP的序列是100-105)
SELECT NAME FROM V$ARCHIVED_LOG WHERE SEQUENCE# BETWEEN 100 AND 105;
查看当前值
-- 备库SQL*Plus执行(sysdba身份)
SHOW PARAMETER RECOVERY_PARALLELISM;
-- 或查询数据字典
SELECT NAME, VALUE, DESCRIPTION FROM V$PARAMETER WHERE NAME = 'recovery_parallelism';
默认值为0:Oracle 会根据系统 CPU 核心数自动分配并行度(如 4 核 CPU 分配 2 个并行进程)。
修改方式
(1)动态修改(即时生效,重启后失效)
-- 临时设置并行度为8(根据CPU核心数调整,如4/8/16)
ALTER SYSTEM SET RECOVERY_PARALLELISM = 8 SCOPE=MEMORY;
(2)永久修改(重启后生效,需修改 spfile)
-- 永久设置并行度为8
ALTER SYSTEM SET RECOVERY_PARALLELISM = 8 SCOPE=SPFILE;