本帖最后由 郑全 于 2020-6-22 00:39 编辑
问题现象:
物理备库转换为逻辑备库后,主库报警文件文件一直报以下错误:
Sun Jun 21 11:13:23 2020
Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
------------------------------------------------------------
问题分析:
经测试,在物理备库没有问题,一经转换成逻辑备库,立马就会报这个问题。
检查发现,主备库的口令都一样,再次把主库的口令拷贝到备库,问题依旧。排除口令文件本身的问题。
remote_login_passwordfile 也是 EXCLUSIVE,排除参数设置的问题。
sys用户 状态为OPEN.排除用户状态的问题。
使用SQLPLUS 登陆主备库,都没有问题。排除密码不一致的问题。
但登陆备库,发现备库有以下错误:
ORA-00604: error occurred at recursive SQL level 1
ORA-06550: line 8, column 133:
PL/SQL: ORA-16224: Database Guard is enabled
ORA-06550: line 8, column 10:
PL/SQL: SQL Statement ignored
ORA-06550: line 13, column 140:
PL/SQL: ORA-16224: Database Guard is enabled
ORA-06550: line 13, column 13:
PL/SQL: SQL Statement ignored
备库没有接收到一个来自主库的归档日志文件,
在转换成逻辑备库后,备库总是出现以下一段错误:
LOGSTDBY: Attempting to pre-register dictionary build logfiles
LOGMINER: Error 308 encountered, failed to read logfile 1_145917_873478377.dbf
LOGMINER: Encountered error 1291 while adding logfile 1_145917_873478377.dbf to session 11
LOGMINER: Error 308 encountered, failed to read logfile 2_143561_873478377.dbf
LOGMINER: Encountered error 1291 while adding logfile 2_143561_873478377.dbf to session 11
LOGMINER: Error 308 encountered, failed to read logfile 1_145916_873478377.dbf
LOGMINER: Encountered error 1291 while adding logfile 1_145916_873478377.dbf to session 11
LOGMINER: Error 308 encountered, failed to read logfile 2_143562_873478377.dbf
LOGMINER: Encountered error 1291 while adding logfile 2_143562_873478377.dbf to session 11
LOGSTDBY: Unable to register recovery logfiles, will resend
ALTER DATABASE START LOGICAL STANDBY APPLY (xxx)
关于这个错误的相关信息,可以参见:文档 ID 1368946.1,文档 ID 2315256.1
但是在备库对应的位置,又发现有这些日志文件,感觉是在注册这个日志文件时,报错了。
检查备库应用状态,
SQL> SELECT TYPE, STATUS_CODE, STATUS FROM V$LOGSTDBY_PROCESS;
TYPE STATUS_CODE STATUS
-----------------------------------------------------------------------------
COORDINATOR 16111 ORA-16111: log mining and apply setting up
检查备库日志:
SELECT thread#,SEQUENCE#, to_char(first_time,'yyyy-mm-dd hh24:mi:ss'),to_char(NEXT_TIME,'yyyy-mm-dd hh24:mi:ss') ,applied
FROM DBA_LOGSTDBY_LOG ORDER BY thread#,SEQUENCE#
记录为 空 。
感觉就是内部没有认到主库传来的归档日志,后面主库的归档日志也不再向主库传。
后面找到MOS上的帖子:
ORA-16191 ORA-01017 on logical standby (文档 ID 2497091.1)
1. Copy the password file from primary to logical standby. If issue still exist...
2. Check cksum value of primary and standby password file. The cksum value should match in both primary and standby.
Ex:
cksum $ORACLE_HOME/<dbs/orapwSID
md5sum $ORACLE_HOME/dbs/orapw${ORACLE_SID}
If issue still exist...
3. Check for the current GAP in the logical standby. If any gap exist, copy the missing archive logs from primary to standby, register it, then start the apply process.
Once apply process completes applying all the gap, the primary should no longer throw ORA-16191 error.
SQL> select thread#, sequence# from dba_logstdby_log l where next_change# not in (select first_change# from dba_logstdby_log where l.thread#=thread#) order by thread#, sequence#;
检查排除,1,2,第3项,dba_logstdby_log 数据字典中,一条记录都有没有 ,也应该排除,但我还是手工去把主库的归档日志拷贝到备库,从 前面308报错的SEQUENCE开始拷贝过来,并再逻辑备库注册这些日志
,注册后,发现 select type,status, STATUS_CODE from v$LOGSTDBY_PROCESS; 记录在发生变化,于是手工把主库后面产生的归档日志都拷贝过来,但都应用完了,后面主库再产生新的日志,主库依然产生报错,看来没有找到问题原因。
后来,再次去分析登陆是否报错,再次发现
Sun Jun 21 19:28:13 2020
Errors in file /u01/app/oracle/diag/rdbms/hisodb/hisodb/trace/hisodb_ora_29745.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-06550: line 8, column 133:
PL/SQL: ORA-16224: Database Guard is enabled
ORA-06550: line 8, column 10:
PL/SQL: SQL Statement ignored
ORA-06550: line 13, column 140:
PL/SQL: ORA-16224: Database Guard is enabled
ORA-06550: line 13, column 13:
PL/SQL: SQL Statement ignored
猜测有触发器,
看文件:
[oracle@bigdata trace]$ cat /u01/app/oracle/diag/rdbms/hisodb/hisodb/trace/hisodb_ora_29745.trc
Trace file /u01/app/oracle/diag/rdbms/hisodb/hisodb/trace/hisodb_ora_29745.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1
System name: Linux
Node name:
Release: 3.10.0-957.el7.x86_64
Version: #1 SMP Thu Oct 4 20:48:51 UTC 2018
Machine: x86_64
Instance name: hisodb
Redo thread mounted by this instance: 1
Oracle process number: 21
Unix process pid: 29745, image: oracle@bigdata (TNS V1-V3)
*** 2020-06-21 19:28:13.283
*** SESSION ID:(421.97) 2020-06-21 19:28:13.283
*** CLIENT ID:() 2020-06-21 19:28:13.283
*** SERVICE NAME:(SYS$USERS) 2020-06-21 19:28:13.283
*** MODULE NAME:(sqlplus@xx (TNS V1-V3)) 2020-06-21 19:28:13.283
*** ACTION NAME:() 2020-06-21 19:28:13.283
Error in executing triggers on connect internal
*** 2020-06-21 19:28:13.283
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 1
ORA-06550: line 8, column 133:
PL/SQL: ORA-16224: Database Guard is enabled
ORA-06550: line 8, column 10:
PL/SQL: SQL Statement ignored
ORA-06550: line 13, column 140:
PL/SQL: ORA-16224: Database Guard is enabled
ORA-06550: line 13, column 13:
PL/SQL: SQL Statement ignored
*** 2020-06-21 19:36:28.330
LOGMINER: Encountered error 1291 while adding logfile /hisodb/standby_log/thread_1_seq_143561 to session 11
[oracle@ trace]$
看来备库确实有登陆触发器,按常规的检查,主要检查了sys用户的一个触发器,确实有一个触发器,但这个触发器,只是记录IP的
begin
dbms_application_info.set_client_info(sys_context('userenv','ip_address'));
end;
关闭该触发器后,依然报错,难道不是这个问题,于是把备库的触发器全部关闭,
SELECT 'alter trigger ' || owner || '."' || trigger_name || '" disable;' from dba_triggers where status='ENABLED';
奇迹发生了,报警不再报错,备库自动开始挖掘日志,看来确实是触发器的问题。
再次回头去分析系统中的触发器,发现另一用户建立了一个单独的登陆触发器,不是在固定的IP中登陆,会记录IP地址,并报错,如果是固定的IP地址登陆,会记录一条,正好两个 INSERT INTO语句,
declare
ip_addr varchar2(20);
begin
if sys_context('userenv', 'MODULE') in ('xxx') then
begin
select ip into ip_addr from ip_pbtab where ip=sys_context('userenv', 'ip_address');
insert into log_meg values(sys_context('userenv', 'ip_address') || ' allow ' || sys_context('userenv', 'MODULE') || ' ' || user );
commit;
exception
when no_data_found then
insert into log_meg values(sys_context('userenv', 'ip_address') || ' nologin ' || sys_context('userenv', 'MODULE') || ' ' || user );
commit;
RAISE_APPLICATION_ERROR(-20001, 'you dont login ,please contact to dba!');
end;
end if;
end;
其实,只要关闭这个触发器即可,毕竟备库不需要这个。
当然,备库不需要触发器,也可以直接把备库的触发器从系统层直接关闭,所有的触发器都不能激活,也可以达到目的。
ALTER SYSTEM SET "_system_trig_enabled"=FALSE;
参考MOS上的一篇文档:
2129339.1
ORA-16191 ORA-01017 and ORA-16000 shipping the logs to Standby database (文档 ID 2129339.1)
Symptoms
Cause
Solution
References
Applies to: Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
Symptoms
NOTE: In the images and/or the document content below, the user information and environment data used represents fictitious data
from the Oracle sample schema(s), Public Documentation delivered with an Oracle database product or other training material.
Any similarity to actual environments, actual persons, living or dead, is purely coincidental and not intended in any manner.
For the purposes of this document, the following fictitious environment is used as an example to describe the procedure:
Standby Database name: test_dg
Primary database errors out when shipping the archived logs to the Physical Standby:
PING[ARC2]: Heartbeat failed to connect to standby 'test_dg'. Error is 16191.
Alert log shows:
Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
------------------------------------------------------------
PING[ARC2]: Heartbeat failed to connect to standby 'test_dg'.
After tracing is enabled, trace log shows ORA-16000:
OCISessionBegin failed -1
.. Detailed OCI error val is 604 and errmsg is 'ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 5
ORA-16000: database open for read-only access
CauseStandby database is using a logon trigger
SolutionDisable the trigger.
or
Set the below parameter in the Standby database and restart:
_system_trig_enabled=false
Alternatively:
ALTER SYSTEM SET "_system_trig_enabled"=FALSE;
A value of FALSE stops system triggers from firing (i.e. triggers on various DDL or database events are disabled).
|