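I. Incident background
In one project, opening the page query module failed with the error "end-of-file on communication channel" (the message text of ORA-03113), and production business was affected.
II. Investigation
1. The other modules of the system opened normally; only one functional module was affected. We then checked the Oracle alert log and found a large number of errors like the following: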
Wed Aug 29 07:28:04 2018
Errors in file e:\app\administrator\diag\rdbms\gqdb\gqdb\trace\gqdb_ora_28448.trc (incident=16177):
ORA-03137: TTC protocol internal error : [12333] [64] [0] [98] [] [] [] []
Incident details in: e:\app\administrator\diag\rdbms\gqdb\gqdb\incident\incdir_16177\gqdb_ora_28448_i16177.trc
Wed Aug 29 07:28:08 2018
Trace dumping is performing id=[cdmp_20180829072808]
Wed Aug 29 07:28:38 2018
Errors in file e:\app\administrator\diag\rdbms\gqdb\gqdb\trace\gqdb_ora_45000.trc (incident=16089):
ORA-03137: TTC protocol internal error : [12333] [64] [0] [98] [] [] [] []
Incident details in: e:\app\administrator\diag\rdbms\gqdb\gqdb\incident\incdir_16089\gqdb_ora_45000_i16089.trc
Wed Aug 29 07:28:41 2018
Trace dumping is performing id=[cdmp_20180829072841]
Wed Aug 29 07:29:43 2018
Thread 1 cannot allocate new log, sequence 21462
Private strand flush not complete
Current log# 2 seq# 21461 mem# 0: E:\APP\ADMINISTRATOR\ORADATA\GQDB\REDO02.LOG
Thread 1 advanced to log sequence 21462 (LGWR switch)
Current log# 3 seq# 21462 mem# 0: E:\APP\ADMINISTRATOR\ORADATA\GQDB\REDO03.LOG
Wed Aug 29 07:30:00 2018
Errors in file e:\app\administrator\diag\rdbms\gqdb\gqdb\trace\gqdb_j000_40352.trc:
ORA-12012: error on auto execute of job 12692
ORA-06550: line 1, column 729:
PLS-00905: object GQZWFW.ASP_REFRESHLEFTTIME is invalid
ORA-06550: line 1, column 729:
PL/SQL: Statement ignored
Wed Aug 29 07:33:10 2018
Errors in file e:\app\administrator\diag\rdbms\gqdb\gqdb\trace\gqdb_ora_36984.trc (incident=16178):
ORA-03137: TTC protocol internal error : [12333] [64] [0] [98] [] [] [] []
Incident details in: e:\app\administrator\diag\rdbms\gqdb\gqdb\incident\incdir_16178\gqdb_ora_36984_i16178.trc
2. Following the trace files referenced in the alert log, we examined the error details:
*** 2018-08-29 14:04:34.940
*** SESSION ID:(69.59095) 2018-08-29 14:04:34.940
*** CLIENT ID:() 2018-08-29 14:04:34.940
*** SERVICE NAME:(gqdb) 2018-08-29 14:04:34.940
*** MODULE NAME:(w3wp.exe) 2018-08-29 14:04:34.940
*** ACTION NAME:() 2018-08-29 14:04:34.940
Dump continued from file: e:\app\administrator\diag\rdbms\db\db\trace\db_ora_47708.trc
ORA-03137: TTC protocol internal error : [12333] [254] [64] [0] [] [] [] []
========= Dump for incident 16305 (ORA 3137 [12333]) ========
*** 2018-08-29 14:04:34.946
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=gy425phpm9wq8) -----
SELECT AUDIT_PROJECT.RowGuid from AUDIT_PROJECT,AUDIT_TASK where AUDIT_PROJECT.TASKGUID=AUDIT_TASK.RowGuid and ITEM_ID in (select AUDIT_TASK.ITEM_ID from AUDIT_WINDOW_TASK,AUDIT_WINDOW_USER,AUDIT_TASK where AUDIT_WINDOW_TASK.WINDOWGUID=AUDIT_WINDOW_USER.WINDOWGUID and AUDIT_WINDOW_TASK.TASKGUID=AUDIT_TASK.RowGuid and USERGUID=:UserGuid ) and STATUS=:status
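The trace shows that the error is triggered by a single SQL statement, and it is exactly the statement behind the failing query page.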
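To inspect the statement from inside the instance, one option is to look it up by the sql_id printed in the trace. The following is only a sketch and was not part of the original investigation: it assumes the cursor is still in the shared pool and that the session can read the V$ views, and the column selection is illustrative.

```sql
-- Child cursors and plans recorded for the statement named in the trace.
-- Returns no rows if the cursor has already aged out of the shared pool.
SELECT sql_id, child_number, plan_hash_value, executions, module
  FROM v$sql
 WHERE sql_id = 'gy425phpm9wq8';

-- Bind values captured for :UserGuid and :status, if any were sampled.
SELECT child_number, name, position, datatype_string, value_string, last_captured
  FROM v$sql_bind_capture
 WHERE sql_id = 'gy425phpm9wq8'
 ORDER BY child_number, position;
```

If v$sql returns several child cursors with different plans, that is at least consistent with bind peeking being active for this statement.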
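III. Research
Searching My Oracle Support (MOS) by the error code led to the following document:
Troubleshooting ORA-3137 [12333] Errors Encountered When Using Oracle JDBC Driver (Doc ID 1361107.1)
According to the note, the error stems from a bug present in 11.2.0.1: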
Unpublished Bug 9703463 - ORA-3137 [12333] or ORA-600 [kpobav-1] When Using Bind Peeking
This bug affects versions 11.1.0.6, 11.1.0.7, and 11.2.0.1 of the RDBMS. It is fixed in version 11.2.0.2 of the database.
It can also occur intermittently; similarly to unpublished Bug:8625762, this is a bind peeking bug.
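Because the note ties the bug to specific releases (11.1.0.6, 11.1.0.7 and 11.2.0.1), it is worth confirming the exact version of the instance before choosing a fix. A minimal sketch, assuming the session can read the V$ views:

```sql
-- Full banner, including edition and platform.
SELECT banner FROM v$version;

-- Version string only, easy to compare against the affected releases.
SELECT version FROM v$instance;
```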
IV. Solutions
1. Disable bind peeking
SQL> alter system set "_optim_peek_user_binds"=false;
This is an Oracle hidden (underscore) parameter, which is why its name must be enclosed in double quotes.
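The statement above does not specify a scope. The following variants are a hedged sketch of how the change could be scoped, tested or rolled back. They are not taken from the original incident, and hidden parameters are normally changed only on the advice of Oracle Support.

```sql
-- Check whether the instance was started with an spfile. Without an explicit
-- SCOPE, ALTER SYSTEM defaults to BOTH with an spfile and to MEMORY with a pfile.
SHOW PARAMETER spfile

-- Try the workaround in a single session first.
ALTER SESSION SET "_optim_peek_user_binds" = FALSE;

-- Apply it instance-wide and persist it across restarts (requires an spfile).
ALTER SYSTEM SET "_optim_peek_user_binds" = FALSE SCOPE = BOTH;

-- Roll back later, for example after upgrading to a fixed release:
-- removes the entry from the spfile, effective at the next restart.
ALTER SYSTEM RESET "_optim_peek_user_binds" SCOPE = SPFILE SID = '*';
```

Cursors that were already parsed may keep their existing plans until they are parsed again, so the effect might not be immediate for every statement.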
2. Upgrade the database
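The MOS note lists this bug as fixed in 11.2.0.2, so upgrading to 11.2.0.2 or a later release (for example 11.2.0.3) is the other option.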
SQL> col ksppinm for a30
SQL> col ksppstvl for a30
SQL> col ksppdesc for a30
SQL> SELECT ksppinm, ksppstvl, ksppdesc
FROM x$ksppi x, x$ksppcv y
WHERE x.indx = y.indx AND ksppinm = '_optim_peek_user_binds';
KSPPINM                        KSPPSTVL                       KSPPDESC
------------------------------ ------------------------------ ------------------------------
_optim_peek_user_binds         TRUE                           enable peeking of user binds
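Querying the hidden parameter shows that it was enabled (TRUE) before the change.
In the end we chose to disable the hidden parameter. Once the feature was turned off, the affected module recovered and the alert log no longer showed the errors.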