重庆思庄 Oracle and Red Hat Certification Study Forum

Title: 11.2.0.4 [ins-41112] specified network interface doesnt maintain connectivity

Author: 郑全    Time: 2013-10-7 13:05
Title: 11.2.0.4 [ins-41112] specified network interface doesnt maintain connectivity

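While installing 11.2.0.4 Grid Infrastructure on Linux 6.4, the installer reported the following error at the network interface confirmation step: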

 

[ins-41112] specified network interface doesnt maintain connectivity across cluster nodes

 

I checked /etc/hosts and found nothing wrong, and the nodes could ping each other without any problem:

[root@szrac1 raw]# cat /etc/hosts
127.0.0.1   localhost
192.168.0.201  szrac1
192.168.0.202  szrac2

10.0.0.201     szrac1-priv
10.0.0.202     szrac2-priv

192.168.0.203  szrac1-vip
192.168.0.204  szrac2-vip

192.168.0.205  scan-ip

However, when I ssh'd to the nodes, I found that interactive input was still required:

 

[grid@szrac2 ~]$ ssh szrac1-priv date
The authenticity of host 'szrac1-priv (10.0.0.201)' can't be established.
RSA key fingerprint is 40:1b:01:3c:98:05:83:b8:39:b8:95:09:2f:62:8c:67.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'szrac1-priv,10.0.0.201' (RSA) to the list of known hosts.
Mon Oct  7 11:30:55 CST 2013
[grid@szrac2 ~]$ ssh szrac2-priv date
The authenticity of host 'szrac2-priv (10.0.0.202)' can't be established.
RSA key fingerprint is 40:1b:01:3c:98:05:83:b8:39:b8:95:09:2f:62:8c:67.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'szrac2-priv,10.0.0.202' (RSA) to the list of known hosts.
Mon Oct  7 11:31:04 CST 2013
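
A quick way to confirm that none of the node names still stops for input is to run ssh in batch mode against every name, from every node. The following is only a sketch: the function name check_ssh_equivalence and the SSH_CMD override are made up for illustration, and the host names in the usage line are the ones from the /etc/hosts file above.

```shell
#!/bin/bash
# Sketch: report which host names still need interactive input over ssh.
# SSH_CMD is a hypothetical override so the loop can be dry-run; default is ssh.
check_ssh_equivalence() {
    local ssh_cmd="${SSH_CMD:-ssh}"
    local host
    local rc=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        # StrictHostKeyChecking=yes makes it fail on an unknown host key
        # instead of asking the yes/no question.
        if $ssh_cmd -o BatchMode=yes -o StrictHostKeyChecking=yes \
               -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "$host: ok"
        else
            echo "$host: needs attention"
            rc=1
        fi
    done
    return $rc
}
```

Run as the grid user on each node, for example: check_ssh_equivalence szrac1 szrac2 szrac1-priv szrac2-priv. Any name reported as "needs attention" would have prompted in an interactive session.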

 

After doing this, the problem persisted.

Later I came across this note:

 

[INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes. [ID 1427202.1]

--------------------------------------------------------------------------------

  Modified 05-MAR-2012     Type REFERENCE     Status MODERATED

In this Document
  Purpose
  [INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes.



--------------------------------------------------------------------------------



This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review.



Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1 and later   [Release: 11.2 and later ]
Information in this document applies to any platform.

Purpose
The note lists problems, solutions or workarounds that are related to the following 11gR2 GI OUI error:


[FATAL] [INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes.
CAUSE: Installer has detected that network interface eth1 does not maintain connectivity on all cluster nodes.
ACTION: Ensure that the chosen interface has been configured across all cluster nodes.




[INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes.

[INS-41112] is a high-level error number; the workarounds/solutions depend on the error code from the lower layer. However, [INS-41112] does tell which interface is having the issue:


CAUSE: Installer has detected that network interface eth1 does not maintain connectivity on all cluster nodes.

## >> in this case, it's eth1 that's having the connectivity issue


To find out lower layer error code, execute the following as grid user:


runcluvfy.sh comp nodecon -i <network-interface> -n <racnode1>,<racnode2>,<racnode3> -verbose

Refer to the following once CVU reports real error code:



• PRVF-7617
Refer to note 1335136.1 for details.

• PRVF-6020 : Different MTU values used across network interfaces in subnet "10.10.10.0"
Refer to note 1429104.1 for details.
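
Since the follow-up depends on which PRVF code CVU reports, it can help to pull just the codes out of the verbose output. A minimal sketch, assuming the codes always appear as PRVF- followed by digits (as in the examples in this note); the function name prvf_codes is made up for illustration:

```shell
#!/bin/bash
# Sketch: print the distinct PRVF-nnnn codes found in cluvfy output on stdin.
prvf_codes() {
    grep -oE 'PRVF-[0-9]+' | sort -u
}
```

For example: ./runcluvfy.sh comp nodecon -i eth1 -n szrac1,szrac2 -verbose 2>&1 | prvf_codes (the interface and node names here are the ones from this thread, not from the note).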

 


Author: 郑全    Time: 2013-10-7 13:08
Title: Node connectivity check reveals a problem

I immediately ran the node connectivity check:

 

[grid@szrac1 grid]$ ./runcluvfy.sh comp nodecon -i eth0 -n szrac1,szrac2 -verbose

Verifying node connectivity

Checking node connectivity...

Checking hosts config file...
  Node Name                             Status                 
  ------------------------------------  ------------------------
  szrac2                                passed                 
  szrac1                                passed                 

Verification of the hosts config file successful


Interface information for node "szrac2"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU  
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth0   192.168.0.202   192.168.0.0     0.0.0.0         192.168.0.1     08:00:27:AC:2E:55 1500 
 eth1   10.0.0.202      10.0.0.0        0.0.0.0         192.168.0.1     08:00:27:47:85:E9 1500 


Interface information for node "szrac1"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU  
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth0   192.168.0.201   192.168.0.0     0.0.0.0         192.168.0.1     08:00:27:AC:2E:55 1500 
 eth1   10.0.0.201      10.0.0.0        0.0.0.0         192.168.0.1     08:00:27:47:85:E9 1500 


Check: Node connectivity for interface "eth0"
  Source                          Destination                     Connected?     
  ------------------------------  ------------------------------  ----------------
  szrac2[192.168.0.202]           szrac1[192.168.0.201]           yes            
Result: Node connectivity passed for interface "eth0"


Check: TCP connectivity of subnet "192.168.0.0"
  Source                          Destination                     Connected?     
  ------------------------------  ------------------------------  ----------------
  szrac1:192.168.0.201            szrac2:192.168.0.202            failed         

ERROR:
PRVF-7617 : Node connectivity between "szrac1 : 192.168.0.201" and "szrac2 : 192.168.0.202" failed
Result: TCP connectivity check failed for subnet "192.168.0.0"

Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.0.0".
Subnet mask consistency check passed for subnet "10.0.0.0".
Subnet mask consistency check passed.

Result: Node connectivity check failed


Verification of node connectivity was unsuccessful on all the specified nodes.
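
In the output above the node connectivity check passes while the TCP check on the same subnet fails, which suggests that something is filtering by protocol or port rather than the link being down. A rough way to try a plain TCP connect by hand is sketched below. It assumes bash (for the /dev/tcp pseudo-device) and the coreutils timeout command; the function name tcp_check is made up, and this is not how CVU itself performs the test.

```shell
#!/bin/bash
# Sketch: attempt a plain TCP connect to host:port and report the result.
tcp_check() {
    local host="$1"
    local port="$2"
    # /dev/tcp/<host>/<port> is a bash feature; timeout(1) bounds the attempt
    # so a silently dropped connection does not hang the check.
    if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "$host:$port passed"
    else
        echo "$host:$port failed"
    fi
}
```

For example, tcp_check szrac2 22 tests the ssh port, which is known to be listening here. Testing another port needs a listener started on the remote node first; if port 22 passes and the other port fails, a port filter between the nodes is a likely suspect.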

 

Following note ID 1427202.1, we look up:

 Refer to note 1335136.1 for details.

 

 


Author: 郑全    Time: 2013-10-7 13:10
Title: PRVF-7617: TCP connectivity check failed for subnet (Doc ID 1335136.1)

In this Document
 Purpose
 Details
  Known Issues
  To verify manually
  When to ignore the error?
 References
--------------------------------------------------------------------------------

Applies to:
 Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Information in this document applies to any platform.

Purpose

This note lists problems, solutions or workarounds that are related to the following error:

PRVF-7617: TCP connectivity check failed for subnet

OR

PRVF-7617 : Node connectivity between "racnode1 : 10.10.10.148" and "racnode2 : 10.10.10.149" failed


TCP connectivity check failed for subnet "10.10.10.0"
 
OR
 
 
PRVF-7616 : Node connectivity failed for subnet "10.10.16.0" between "racnode1 - eth5 : 10.10.16.109" and "racnode2 - eth5 : 10.10.16.121"

Result: Node connectivity failed for subnet "10.10.16.0"

 

When the error happens, likely OUI will report:

 

[INS-41110] Specified network interface doesnt maintain connectivity across cluster nodes.
[INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes.

Details

 

Known Issues

 
• bug 12849377 - CVU should check only selected network interfaces (ignore "do not use")

CVU checks network interfaces that are marked "do not use", fixed in 11.2.0.3 GI PSU1


• bug 9952812 - CVU SHOULD RETURN WARNING INSTEAD OF FATAL ERROR FOR VIRBR0

Happens on Linux if network adapter virbr0 exists, fixed in 11.2.0.3.

The fix introduces a new CVU parameter (-networks) to check only the specified networks:

runcluvfy.sh stage -pre crsinst -n <racnode1>,<racnode2> -networks "eth*" -verbose

 
• bug 11903488 - affects Solaris only, fixed in 11.2.0.3

As Solaris does not support the socket option SO_RCVTIMEO, TCP server fails to start:

In this example, racnode1 is nodename and 10.1.0.11 is the IP to test connectivity:

/tmp/CVU_<version>_<user>/exectask.sh -runTCPserver racnode1 10.1.0.11

<CV_ERR>location:prvnconss1 opname:free port unavailable category:0 DepInfo: 99</CV_ERR>
<CV_LOG>Exectask:runTCPServer failed</CV_LOG>
..
<CV_ERR>Error running TCP server</CV_ERR>

The fix for bug 11903488 also removes the port range of 49900-50000 in favor of the first available port:
 exectask.sh -chkTCPclient <server> <server-IP> <server-port> <client> <client-IP>

 
• bug 12353524 - affects HP-UX only, fixed in 11.2.0.3


<CV_ERR>location:prvnconcc3 opname:client to server connection fail
 category:0 otherInfo: Client to server connection failed, errno: 227
 DepInfo: 227</CV_ERR> <CV_VAL>-1</CV_VAL>
 <CV_LOG>Exectask:chkTCPClient failed</CV_LOG> <CV_VRES>1</CV_VRES>
 <CV_ERR>Error checking TCP communication</CV_ERR>
 <CV_ERES>1</CV_ERES>

 
• bug 12608083 - affects Windows only, fixed in 11.2.0.3

When more than one network interface is on the same subnet, it is possible that the wrong interface is used to verify TCP connectivity.

 
• bug 10106374 - affects Windows only, fixed in 11.2.0.2

Refer to note 1286394.1 for details.

 
• bug 16953470 - affects Solaris only, happens when "hostmodel" is set to strong

CVU trace:
[7041@racnode1] [Thread-408] [ 2013-06-13 12:41:17.772 GMT+04:00 ] [StreamReader.run:65] OUTPUT><CV_CMD>/usr/sbin/ping -i 192.168.169.2 192.168.169.2 3 </CV_CMD><CV_VAL>/usr/sbin/ping: sendto Network is unreachable

Running the "ping -i" command manually produces the same error.

To find out current "hostmodel":

# ipadm show-prop -p hostmodel ip
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
ipv6 hostmodel rw weak weak weak strong,src-priority,weak
ipv4 hostmodel rw weak weak weak strong,src-priority,weak

To change hostmodel:

ipadm set-prop -p hostmodel=weak ipv4
ipadm set-prop -p hostmodel=weak ipv6

The workaround is to set hostmodel to weak.
In addition, Solaris bug 16827053 is open to fix this at the OS level.
 
• bug 17043435

The bug is closed as duplicate of internal bug 17070860 which is fixed in 11.2.0.4

 

To verify manually


Repeat the following for each interface as grid user:

 

runcluvfy.sh comp nodecon -i <network-interface> -n <racnode1>,<racnode2>,<racnode3> -verbose
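
Repeating the command by hand for each interface gets tedious, so a small loop can help. The sketch below keeps one log per interface and decides pass or fail by looking for the summary line "Verification of node connectivity was successful" that appears in the cluvfy output quoted in this thread. The function name nodecon_all and the CLUVFY override are made up for illustration.

```shell
#!/bin/bash
# Sketch: run the nodecon check once per interface and summarise the result.
# CLUVFY is a hypothetical override; it defaults to ./runcluvfy.sh.
nodecon_all() {
    local nodes="$1"
    shift
    local cluvfy="${CLUVFY:-./runcluvfy.sh}"
    local ifc
    for ifc in "$@"; do
        # Keep the full verbose output so the PRVF codes can be read later.
        "$cluvfy" comp nodecon -i "$ifc" -n "$nodes" -verbose \
            > "nodecon_${ifc}.log" 2>&1
        if grep -q 'Verification of node connectivity was successful' \
               "nodecon_${ifc}.log"; then
            echo "$ifc: passed"
        else
            echo "$ifc: failed, see nodecon_${ifc}.log"
        fi
    done
}
```

Run as the grid user from the directory that contains runcluvfy.sh, for example: nodecon_all szrac1,szrac2 eth0 eth1.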

 
When to ignore the error?

If the error happens on a network that is not related to Oracle Clusterware, it can be ignored; for example, if it happens on an administrative network and does not affect anything, it can be ignored.


 


Author: 郑全    Time: 2013-10-7 13:15

This note did not reveal the problem either, but then I came across another note:

 


Author: 郑全    Time: 2013-10-7 13:15
Title: PRVF-7617 Cluster Verify Fails For Private Network if Firewall Exists (Doc I

PRVF-7617 Cluster Verify Fails For Private Network if Firewall Exists (Doc ID 1357657.1)

In this Document
 Symptoms
 Cause
 Solution
 References
--------------------------------------------------------------------------------

Applies to:
 Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Information in this document applies to any platform.

Symptoms

During Cluster Verification, a part of cluster installation, the connectivity check between nodes may fail with the following errors:

 

Check: TCP connectivity of subnet "10.0.0.0"
Source                         Destination                    Connected?
------------------------------ ------------------------------ ----------------
racnode01:10.0.0.1             racnode02:10.0.0.2             failed

ERROR:
PRVF-7617 : Node connectivity between "racnode01 : 10.0.0.1" and "racnode02 : 10.0.0.2" failed
Result: TCP connectivity check failed for subnet "10.0.0.0"

This may occur on any of the interconnects

Cause

iptables (a Linux firewall) is active between the nodes, blocking network traffic on the cluster interconnect network.

Solution

A temporary solution is to disable iptables. A more permanent solution, if iptables is required, is to configure iptables so that it does not block interconnect traffic (no firewall should exist between cluster nodes).

To disable iptables, use the following commands as root:

For IPV4:

 service iptables save
 service iptables stop
 chkconfig iptables off

 

For IPV6:

service ip6tables save
service ip6tables stop
chkconfig ip6tables off

 

Note: IPV6 is not supported with Oracle Clusterware/RAC 11gR2
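
Before disabling anything, it may be worth checking on each node whether iptables is set to start at boot at all. The helper below is only a sketch: it assumes the usual one-line chkconfig --list format (service name followed by 0:off 1:off 2:on ... entries), and the function name enabled_runlevels is made up for illustration.

```shell
#!/bin/bash
# Sketch: given one line of `chkconfig --list <service>` output on stdin,
# print the runlevels in which the service is switched on.
enabled_runlevels() {
    grep -oE '[0-6]:on' | cut -d: -f1 | tr '\n' ' ' | sed 's/ $//'
}
```

For example, as root: chkconfig --list iptables | enabled_runlevels. An empty result means the service is off in every runlevel; service iptables status shows whether it is running right now.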

 


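Author: 郑全    Time: 2013-10-7 13:18

Reading this, it suddenly occurred to me that my firewall was still on: I had disabled SELinux but not the firewall. I turned the firewall off at once and re-tested network connectivity.

It passed, and back in the installer the problem was gone.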

 

[grid@szrac1 grid]$ ./runcluvfy.sh comp nodecon -i eth0 -n szrac1,szrac2 -verbose

Verifying node connectivity

Checking node connectivity...

Checking hosts config file...
  Node Name                             Status                 
  ------------------------------------  ------------------------
  szrac2                                passed                 
  szrac1                                passed                 

Verification of the hosts config file successful


Interface information for node "szrac2"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU  
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth0   192.168.0.202   192.168.0.0     0.0.0.0         192.168.0.1     08:00:27:AC:2E:55 1500 
 eth1   10.0.0.202      10.0.0.0        0.0.0.0         192.168.0.1     08:00:27:47:85:E9 1500 


Interface information for node "szrac1"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU  
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth0   192.168.0.201   192.168.0.0     0.0.0.0         192.168.0.1     08:00:27:AC:2E:55 1500 
 eth1   10.0.0.201      10.0.0.0        0.0.0.0         192.168.0.1     08:00:27:47:85:E9 1500 


Check: Node connectivity for interface "eth0"
  Source                          Destination                     Connected?     
  ------------------------------  ------------------------------  ----------------
  szrac2[192.168.0.202]           szrac1[192.168.0.201]           yes            
Result: Node connectivity passed for interface "eth0"


Check: TCP connectivity of subnet "192.168.0.0"
  Source                          Destination                     Connected?     
  ------------------------------  ------------------------------  ----------------
  szrac1:192.168.0.201            szrac2:192.168.0.202            passed         
Result: TCP connectivity check passed for subnet "192.168.0.0"

Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.0.0".
Subnet mask consistency check passed for subnet "10.0.0.0".
Subnet mask consistency check passed.

Result: Node connectivity check passed


Verification of node connectivity was successful.


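Author: 郑全    Time: 2013-10-7 13:25

With that, the problem is solved.

This problem arose simply because I installed casually, without following any document and without any preparation.

Before installing RAC, be sure to prepare thoroughly, ideally with a reference installation guide at hand; that way errors will not keep cropping up during the installation.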

 


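Author: 杨芳超    Time: 2013-10-7 13:55
Definitely bumping this...
Author: 郑全    Time: 2013-10-7 16:44

One last note: the firewall does not have to be turned off. It can stay on, but it must allow multicast traffic on the private network.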
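
As a rough illustration of keeping the firewall on, the sketch below only prints candidate iptables rules so they can be reviewed before anyone runs them as root. Everything in it is an assumption rather than something taken from an Oracle note: the function name interconnect_rules is made up, eth1 is the private interface seen in the cluvfy output above, and the /24 mask in the usage line is a guess, since the thread shows the subnet 10.0.0.0 but not its mask.

```shell
#!/bin/bash
# Sketch: print (not run) iptables rules that accept interconnect traffic.
# Arguments: private interface name, private subnet in CIDR form.
interconnect_rules() {
    local ifc="$1"
    local subnet="$2"
    # Accept everything arriving on the private interface from the private subnet.
    echo "iptables -I INPUT -i $ifc -s $subnet -j ACCEPT"
    # Accept multicast packets arriving on the private interface explicitly.
    echo "iptables -I INPUT -i $ifc -m pkttype --pkt-type multicast -j ACCEPT"
    # Persist the running rule set (same init script as used in the note above).
    echo "service iptables save"
}
```

For example: interconnect_rules eth1 10.0.0.0/24. The printed commands would need to be applied on every node, and the nodecon check re-run afterwards to confirm that the TCP connectivity test passes.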

 






Welcome to the 重庆思庄 Oracle and Red Hat Certification Study Forum (http://bbs.cqsztech.com/)