Grid Infrastructure (GI) startup fails because crsd fails to start in a Flex ASM environment (Doc ID 2392762.1)
In this Document
Symptoms
Cause
Solution
References
APPLIES TO:
Oracle Database - Enterprise Edition - Version 12.2.0.1 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Information in this document applies to any platform.
SYMPTOMS
In a Flex ASM environment, Grid Infrastructure (GI) startup fails on a node while GI is already running on one or more other nodes,
and the "crsctl stat res -t -init" output shows that all resources except ora.crsd are up.
The ora.crsd resource shows either OFFLINE or INTERMEDIATE state.
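To check the state of ora.crsd alone, a minimal sketch follows. It assumes the non-tabular output of "crsctl stat res ora.crsd -init" contains a line that begins with STATE=; the function name crsd_state is hypothetical and not part of any Oracle tool.

```shell
# Hypothetical helper: reads the output of
#   crsctl stat res ora.crsd -init
# on stdin and prints the first word of the STATE= value
# (for example ONLINE, OFFLINE or INTERMEDIATE).
crsd_state() {
    sed -n 's/^STATE=\([A-Z]*\).*/\1/p' | head -n 1
}
```

On a cluster node it could be used as: crsctl stat res ora.crsd -init | crsd_state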
The cluster alert.log shows the following errors:
2018-04-05 15:16:53.918 [CRSD(2697)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 2697
2018-04-05 15:17:00.608 [CRSD(2697)]CRS-1013: The OCR location in an ASM disk group is inaccessible. Details in /u01/app/grid/diag/crs/<name>/crs/trace/crsd.trc.
2018-04-05 15:17:00.615 [CRSD(2697)]CRS-0804: Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage Storage layer error [Insufficient quorum to open OCR devices] [0]]. Details at (:CRSD00111:) in /u01/app/grid/diag/crs/<name>/crs/trace/crsd.trc.
The crsd.trc shows the following errors:
2018-04-05 15:17:01.732 : CLSCRED:2919112768: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2018-04-05 15:17:01.757 : OCRRAW:2919112768: 8033 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS
2018-04-05 15:17:01.761 : OCRRAW:2919112768: 8033 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS
2018-04-05 15:17:01.798 : CLSCRED:2919112768: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.076fa97b2ac84f70ff7035254e98f38d.root not found
2018-04-05 15:17:01.798 : OCRRAW:2919112768: 7755 Error 4 opening dom root in 0x4d37e30
2018-04-05 15:17:01.816 : OCRRAW:2919112768: kgfnConnect2: kgfnGetBeqData failed
2018-04-05 15:17:01.816*:kgfn.c@4933: kgfnConnect2: kgfnGetBeqData failed
2018-04-05 15:17:01.816 : CSSCLNT:2919112768: clsssinit: initialized context: (0x4f23d50) flags 0x104
2018-04-05 15:17:01.821 : CSSCLNT:2919112768: clsssterm: terminating context (0x4f23d50)
2018-04-05 15:17:01.862 : OCRRAW:2919112768: kgfnConnect2Int: cstr=(DESCRIPTION=(TCP_USER_TIMEOUT=1)(TRANSPORT_CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=nn.nn.255.13))(PORT=1526)))(CONNECT_DATA=(SERVICE_NAME=+ASM)))
2018-04-05 15:17:01.862*:kgfn.c@6685: kgfnConnect2Int: cstr=(DESCRIPTION=(TCP_USER_TIMEOUT=1)(TRANSPORT_CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=nn.nn.255.13)(PORT=1526)))(CONNECT_DATA=(SERVICE_NAME=+ASM)))
2018-04-05 15:17:01.862*:kgfn.c@6853: kgfnConnect2Int: OCISessionBegin failed
2018-04-05 15:17:03.139 : OCRRAW:2919112768: kgfnRecordErr 1017 OCI error:
ORA-01017: invalid username/password; logon denied
2018-04-05 15:17:03.139*:kgfn.c@1707: kgfnRecordErrPriv: 1017 error=ORA-01017: invalid username/password; logon denied
2018-04-05 15:17:03.140 : default:2919112768: clsCredDomClose: Credctx deleted 0x4d45890
2018-04-05 15:17:03.140 : OCRRAW:2919112768: kgfnConnect2: failed to connect
2018-04-05 15:17:03.140*:kgfn.c@5253: kgfnConnect2: failed to connect
2018-04-05 15:17:03.140 : OCRRAW:2919112768: kgfnConnect2Retry: failed to connect connect after 1 attempts, 143s elapsed
2018-04-05 15:17:03.140 : OCRRAW:2919112768: kgfo_kge2slos error stack at kgfoAl06: ORA-01017: invalid username/password; logon denied
ORA-27300: OS system dependent operation:sslssunreghdlr failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: sskgpreset1
ORA-15077: could not locate ASM instance serving a required diskgroup
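A quick way to confirm that a trace file carries this signature is sketched below. It only searches for three strings taken from the excerpt above (ORA-01017, "kgfnConnect2: failed to connect" and ORA-15077); the function name has_asm_logon_failure is hypothetical, and the trace file path must be supplied by the caller.

```shell
# Hypothetical helper: succeeds (exit status 0) only when the given crsd.trc
# contains all three messages seen in the excerpt above.
has_asm_logon_failure() {
    trc_file=$1
    grep -q 'ORA-01017' "$trc_file" &&
        grep -q 'kgfnConnect2: failed to connect' "$trc_file" &&
        grep -q 'ORA-15077' "$trc_file"
}
```

Example: has_asm_logon_failure /path/to/crsd.trc && echo "ASM logon failure signature found"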
CAUSE
The common causes are:
1) The sqlnet.ora shows
SQLNET.AUTHENTICATION_SERVICES=none
This setting disables the OS authentication that crsd needs to connect to a remote ASM instance running on another node.
2) The ASM password is not correct.
3) The ASM listener subnet does not match the subnet configured for the private interconnect.
Issue "oifcfg getif" to get the subnet for the private interconnect (cluster interconnect).
The problem described in cause 1 is also reported when SQLNET.AUTHENTICATION_SERVICES=all is set.
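The first cause can be checked with a small script, sketched here. It assumes the parameter sits on a single uncommented line of the Grid home sqlnet.ora, optionally with spaces or parentheses around the value; both function names are hypothetical.

```shell
# Hypothetical helper: prints the value of SQLNET.AUTHENTICATION_SERVICES
# from the given sqlnet.ora in lower case, without spaces or parentheses.
# Prints nothing when the parameter is absent or commented out.
auth_services_value() {
    grep -i '^[[:space:]]*SQLNET\.AUTHENTICATION_SERVICES[[:space:]]*=' "$1" |
        tail -n 1 |
        sed 's/^[^=]*=//' |
        tr -d ' \t()' |
        tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeeds when the value is one of the two settings
# this note associates with the crsd startup failure (none or all).
auth_services_is_suspect() {
    case "$(auth_services_value "$1")" in
        none|all) return 0 ;;
        *) return 1 ;;
    esac
}
```

Example: auth_services_is_suspect "$ORACLE_HOME/network/admin/sqlnet.ora" && echo "review SQLNET.AUTHENTICATION_SERVICES"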
SOLUTION
1) If the sqlnet.ora shows SQLNET.AUTHENTICATION_SERVICES=none or SQLNET.AUTHENTICATION_SERVICES=all
   1) Remove "SQLNET.AUTHENTICATION_SERVICES=none" or "SQLNET.AUTHENTICATION_SERVICES=all" from the Grid home sqlnet.ora file (located in $ORACLE_HOME/network/admin).
   2) Restart CRS with the force option.
      crsctl stop crs -f
      crsctl start crs
   Refer to Document 1681849.1, Unable to startup CRS as ASM failed to startup with "ORA-01017: invalid username/password; logon denied".
2) The ASM password is incorrect (this is the likely cause if the sqlnet.ora is set up correctly)
   1) Recreate the ASM password file as instructed in Document 1929673.1, "How to recreate shared ASM password file in 12c GI cluster".
   2) Restart CRS with the force option.
      crsctl stop crs -f
      crsctl start crs
3) The ASM listener subnet does not match the configured interconnect
   Recreate the ASM listener as described in Step 3 of the section "C. For 12c and 18c Oracle Clusterware with Flex ASM" in Document 283684.1.
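Before recreating the listener, the subnet mismatch can be confirmed with a sketch like the following. It assumes "oifcfg getif" prints one interface per line as "name subnet scope type", with the private interconnect marked by a type containing cluster_interconnect. The function names are hypothetical, and the ASM listener subnet has to be looked up separately and passed in by the operator.

```shell
# Hypothetical helper: reads "oifcfg getif" output on stdin and prints the
# subnet of every interface whose type includes cluster_interconnect.
interconnect_subnets() {
    awk '$4 ~ /cluster_interconnect/ { print $2 }'
}

# Hypothetical helper: succeeds when the subnet given as $1 (the ASM listener
# subnet, supplied by the operator) is one of the interconnect subnets found
# in the "oifcfg getif" output read from stdin.
subnet_on_interconnect() {
    asm_subnet=$1
    interconnect_subnets | grep -qxF "$asm_subnet"
}
```

Example, with 192.168.10.0 standing in for the ASM listener subnet: oifcfg getif | subnet_on_interconnect 192.168.10.0 || echo "ASM listener subnet is not on the interconnect"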
A quick workaround is to start the ASM instance manually on the local node using sqlplus.
If ora.crsd does not come online within a couple of minutes after ASM is started manually, issue "crsctl start res ora.crsd -init" as root.
REFERENCES
NOTE:1929673.1 - How to recreate shared ASM password file in 12c GI cluster
NOTE:1681849.1 - Unable to startup CRS as ASM failed to startup with "ORA-01017: invalid username/password; logon denied"
NOTE:283684.1 - How to Modify Private Network Information in Oracle Clusterware