重庆思庄 Oracle and Red Hat Certification Study Forum

Title: EXADATA: Invalid command: Command name UNINITIALIZED COMMAND CODE, refid 0

Author: 郑全    Time: 2021-9-13 18:57
Exadata: Node or Instance evictions due to RDS connectivity issues after the DB node upgrade to 12.2.1.1.0 or higher (Doc ID 2270319.1)


In this Document

Symptoms

Changes

Cause

Solution

References


APPLIES TO:

Oracle Database - Enterprise Edition - Version 12.1.0.1 to 12.1.0.1 [Release 12.1]

Oracle Exadata Storage Server Software - Version 12.2.1.1.0 to 12.2.1.1.1 [Release 12.2]

Oracle Database Cloud Schema Service - Version N/A and later

Oracle Database Exadata Cloud Machine - Version N/A and later

Oracle Cloud Infrastructure - Database Service - Version N/A and later

Information in this document applies to any platform.

SYMPTOMS

Node or DB instance evictions occur due to RDS connectivity issues after the Exadata storage software is upgraded to 12.2.1.1.0 or newer.


diskmon.trc:

2017-04-19 15:23:23.985215*: osswait failed: context 0x7ff1d03d97d0 childctx 0x7ff1d03d97d0 timeout 5000 errorcode 38

2017-04-19 15:23:24.343159 : oss_wait done, no request to return. rcode=Process timedout when waiting for I/O completions (46)

2017-04-19 15:23:24.343192 : oss_wait called for request: 0x7ff1d03fa780

2017-04-19 15:24:24.343510 : ossnet_wait_all: WAITED TOO LONG for network request completion: 60000. init_timeout: 4294967295 remaining_timeout: 4294907295


Box 0x7ff1d03f9f80 my_box_refid: 0 source_id: 82127097 (box inc: 10)

Request Flags - 0000000c Callback Context - 0x7ff1c4040e80 Reconnect Time - 0 msec Num. Reconnects - 0

.

Message 0x7ff1c40413d8 with flags 0000000c

RQ_Tag_82127097_7249: RefId - 0, Last Reply Frag - 0

Reply PTR - (nil), expected size - 0, actual size - 0

Message has not been reaped

Command 0x7ff1c4041418 with flags 80000001

Payload Ptr - 0x7ff17c022d00, payload size - 32

RQ_Tag_82127097_7249: Command name IOCTL, refid 7249

Ioctl arguments fd 0 opcode 184 size 32

Reply 0x7ff1c40414d0 with flags 80000000

Payload Ptr - (nil), payload size - 0

Invalid command: Command name UNINITIALIZED COMMAND CODE, refid 0

Number of pollfds - 0

Poll list is not dirty

QOS level requested = 0

QOS support is available

Num pending netmsg - 1

ossnet_setup_connection failed - 0
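When scanning a large diskmon.trc for this symptom, the `ossnet_wait_all` timeout line is the easiest marker to search for. The following is a minimal sketch, assuming the line format is exactly as in the excerpt above; the function name `parse_wait_line` is hypothetical and not part of any Oracle tool. It also checks two observations about the sample line: `init_timeout` equals 2^32 - 1, and `init_timeout` minus `remaining_timeout` equals the reported wait of 60000.

```python
import re

# Pattern for the ossnet_wait_all timeout line as it appears in the trace excerpt.
# Assumption: the real trace always uses this exact wording and field order.
WAIT_RE = re.compile(
    r"ossnet_wait_all: WAITED TOO LONG for network request completion: (\d+)\. "
    r"init_timeout: (\d+) remaining_timeout: (\d+)"
)


def parse_wait_line(line):
    """Return (waited, init_timeout, remaining_timeout) as ints, or None if no match."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


# Sample line copied from the trace excerpt.
sample = (
    "2017-04-19 15:24:24.343510 : ossnet_wait_all: WAITED TOO LONG for "
    "network request completion: 60000. init_timeout: 4294967295 "
    "remaining_timeout: 4294907295"
)

waited, init_timeout, remaining = parse_wait_line(sample)
# The initial timeout is the largest unsigned 32-bit value, and the
# difference between the two timeouts matches the reported wait.
print(waited, init_timeout == 2**32 - 1, init_timeout - remaining == waited)
```

The 60000 figure also agrees with the timestamps in the excerpt, where the `oss_wait called` entry and the `WAITED TOO LONG` entry are 60 seconds apart.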




The /var/log/messages file on the OS side shows several RDS reconnect messages with vendor error 0xd7 or 0x8a.




/var/log/messages (DB Node):

Apr 19 15:23:19 dbnode01 kernel: [478147.056379] RDS/IB: send completion <10.nnn.nnn.49,10.nnn.nnn.85,4> status 9 vendor_err 0x8a, disconnecting and reconnecting

Apr 19 15:23:19 dbnode01 kernel: [478147.056567] RDS/IB: connection <10.nnn.nnn.49,10.nnn.nnn.85,4> dropped due to 'DISCONNECTED event'


/var/log/messages (Cell):


Apr 19 15:23:19 cel01 kernel: [1062478.077722] RDS/IB: recv completion <10.nnn.nnn.85,10.nnn.nnn.49,4> had status 1 vendor_err 0xd7, disconnecting and reconnecting

Apr 19 15:23:19 cel01 kernel: [1062478.077730] RDS/IB: connection <10.nnn.nnn.85,10.nnn.nnn.49,4> dropped due to 'recv completion error'
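To see how often each connection hits these completion errors, the kernel messages can be counted per connection. The following is a rough sketch, assuming the message format matches the lines quoted above; the helper name `count_rds_errors` is hypothetical, and the third field inside the angle brackets is captured without being interpreted.

```python
import re
from collections import Counter

# Matches the "send completion" / "recv completion" lines quoted above.
# The optional "had " covers the wording difference between the two samples.
RDS_RE = re.compile(
    r"RDS/IB: (send|recv) completion <([^,>]+),([^,>]+),(\d+)> "
    r"(?:had )?status (\d+) vendor_err (0x[0-9a-fA-F]+)"
)


def count_rds_errors(lines, vendor_errs=("0x8a", "0xd7")):
    """Count completion errors per (source, destination, direction, vendor_err)."""
    counts = Counter()
    for line in lines:
        m = RDS_RE.search(line)
        if m is None:
            continue
        direction, src, dst, _third, _status, err = m.groups()
        if err.lower() in vendor_errs:
            counts[(src, dst, direction, err.lower())] += 1
    return counts


# Sample lines copied from the excerpts (addresses are masked in the source).
sample_lines = [
    "Apr 19 15:23:19 dbnode01 kernel: [478147.056379] RDS/IB: send completion "
    "<10.nnn.nnn.49,10.nnn.nnn.85,4> status 9 vendor_err 0x8a, disconnecting and reconnecting",
    "Apr 19 15:23:19 dbnode01 kernel: [478147.056567] RDS/IB: connection "
    "<10.nnn.nnn.49,10.nnn.nnn.85,4> dropped due to 'DISCONNECTED event'",
    "Apr 19 15:23:19 cel01 kernel: [1062478.077722] RDS/IB: recv completion "
    "<10.nnn.nnn.85,10.nnn.nnn.49,4> had status 1 vendor_err 0xd7, disconnecting and reconnecting",
]

for key, n in count_rds_errors(sample_lines).items():
    print(key, n)
```

On a real system the lines would come from reading /var/log/messages on each DB node and cell rather than from an in-memory list.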




Other symptoms include nodes not rejoining the cluster after CRS restarts.


CHANGES

Upgrade to Storage software version 12.2.1.1.0 or newer from a previous release.


CAUSE

Bug 25920916


This has been identified as a rolling upgrade issue when moving to 12.2.1.1.0 or newer, which ships with the UEK4 kernel.


The UEK4 kernel uses a 16K fragment size for RDS connections, whereas the UEK2 kernel uses a 4K fragment size. During the rolling upgrade, a 4KB buffer can end up in the RDS 16KB fragment cache, which causes the RDS connection issues.
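The note does not describe the kernel code path, so the following is only a toy model of the general hazard: a buffer cache that assumes every entry has one fixed size will hand out an undersized buffer if a smaller one ever gets in. All names here (`FragCache`, `fill_fragment`) are hypothetical and do not correspond to the RDS implementation.

```python
FRAG_4K = 4 * 1024
FRAG_16K = 16 * 1024


class FragCache:
    """Toy free list that assumes every cached buffer is `frag_size` bytes."""

    def __init__(self, frag_size):
        self.frag_size = frag_size
        self._free = []

    def put(self, buf):
        # No size check on return: this is the hazard being illustrated.
        self._free.append(buf)

    def get(self):
        # Reuse a cached buffer if one exists, otherwise allocate a full-size one.
        return self._free.pop() if self._free else bytearray(self.frag_size)


def fill_fragment(cache, payload):
    """Copy one fragment into a buffer from the cache; fail if it is too small."""
    buf = cache.get()
    if len(buf) < len(payload):
        raise BufferError(
            f"buffer is {len(buf)} bytes, fragment needs {len(payload)}"
        )
    buf[: len(payload)] = payload
    return buf


cache = FragCache(FRAG_16K)
cache.put(bytearray(FRAG_4K))  # a 4K buffer slips into the 16K cache
try:
    fill_fragment(cache, bytes(FRAG_16K))
except BufferError as exc:
    print("fragment rejected:", exc)
```

In this model the mismatch is caught by an explicit length check; the point is only that a mixed-size cache breaks the assumption that the consumer relies on.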


SOLUTION

The kernel fix is included in the Exadata storage software maintenance releases 12.2.1.1.1.170605 and 12.2.1.1.2.
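A quick way to compare an installed image version against the releases named above is to treat the dotted version as a tuple of integers. This is a minimal sketch, assuming the version string is purely numeric and dot-separated; the function names and the labels returned are hypothetical, and how the version string is obtained on a given system is left out.

```python
# First release containing the fix, and first release where the issue can
# appear, both taken from the note.
FIRST_FIXED = (12, 2, 1, 1, 1, 170605)
FIRST_AFFECTED = (12, 2, 1, 1, 0)


def parse_version(text):
    """Turn '12.2.1.1.1.170605' into (12, 2, 1, 1, 1, 170605)."""
    return tuple(int(part) for part in text.strip().split("."))


def classify(version_text):
    """Label a version relative to the affected range described in the note."""
    v = parse_version(version_text)
    if v < FIRST_AFFECTED:
        return "before-change"  # older than 12.2.1.1.0
    if v >= FIRST_FIXED:
        return "fixed"
    return "affected"


# The first three strings come from the note; the last is a made-up older version.
for s in ("12.2.1.1.0", "12.2.1.1.1.170605", "12.2.1.1.2", "12.1.0.0.0"):
    print(s, classify(s))
```

Tuple comparison handles the different lengths naturally: 12.2.1.1.2 compares greater than 12.2.1.1.1.170605 because the fifth component is larger.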


A potential workaround is to reboot all the affected database nodes.







Source: 重庆思庄 Oracle and Red Hat Certification Study Forum (http://bbs.cqsztech.com/)