An Oracle RAC 11.2.0.4 cluster on RHEL 6 can fail to start because of the RHEL NetworkManager service. This article
walks through one such failure and how it was resolved.
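1. Problem description
The hosts of an Oracle RAC 11.2.0.4 cluster on RHEL 6 were rebooted, and afterwards the cluster would not start.
After rac2 came back up, its cluster stack failed to start. Checking the status of the cluster daemons showed cssd stuck in the STARTING state.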
[[email protected] bin]# pwd
/u01/oracle/app/grid/home/bin
[[email protected] bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE OFFLINE Instance Shutdown
ora.cluster_interconnect.haip
1 ONLINE OFFLINE
ora.crf
1 ONLINE ONLINE rac2
ora.crsd
1 ONLINE OFFLINE
ora.cssd
1 ONLINE OFFLINE STARTING
ora.cssdmonitor
1 ONLINE ONLINE rac2
ora.ctssd
1 ONLINE OFFLINE
ora.diskmon
1 OFFLINE OFFLINE
ora.drivers.acfs
1 ONLINE OFFLINE
ora.evmd
1 ONLINE OFFLINE
ora.gipcd
1 ONLINE ONLINE rac2
ora.gpnpd
1 ONLINE ONLINE rac2
ora.mdnsd
1 ONLINE ONLINE rac2
[[email protected] bin]#
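With this many resources listed, it can help to filter the output down to the ones that should be up but are not. The following is a minimal sketch, assuming the `crsctl stat res -t -init` layout shown above (a resource name line followed by an instance line). The function name `crs_not_online` is a hypothetical helper, not part of Grid Infrastructure.

```shell
# Hypothetical helper: read `crsctl stat res -t -init` output on stdin and
# print every resource whose TARGET is ONLINE but whose STATE is not ONLINE.
crs_not_online() {
    awk '
        # A line starting with "ora." names the resource.
        /^ora\./ { name = $1; next }
        # Instance line: <number> <TARGET> <STATE> [server / state details ...]
        name != "" && $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 != "ONLINE" {
            detail = ""
            for (i = 4; i <= NF; i++)
                detail = (detail == "" ? $i : detail " " $i)
            printf "%s target=%s state=%s%s\n", name, $2, $3,
                   (detail == "" ? "" : " (" detail ")")
        }
    '
}
```

A possible invocation, run from the Grid home `bin` directory: `./crsctl stat res -t -init | crs_not_online`.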
2. Problem analysis
-- Because startup stalls at cssd, check the cssd log, ocssd.log.
It shows the error: has a disk HB, but no network HB
[[email protected] bin]# tail -f /u01/oracle/app/grid/home/log/rac2/cssd/ocssd.log
2018-12-06 10:16:35.506: [ CSSD][2489263872]clssgmJoinGrock: global grock CRF- new client 0x7f268c118700 with con 0x7f2600004253, requested num -1, flags 0x4000e00
2018-12-06 10:16:35.506: [ CSSD][2489263872]clssgmJoinGrock: ignoring grock join for client not requiring fencing until group information has been received from the master; group name CRF-, member number -1, flags 0x4000e00
2018-12-06 10:16:35.506: [ CSSD][2489263872]clssgmDiscEndpcl: gipcDestroy 0x4253
2018-12-06 10:16:35.506: [ CSSD][2489263872]clssgmDeadProc: proc 0x7f268c116f40
2018-12-06 10:16:35.506: [ CSSD][2489263872]clssgmDestroyProc: cleaning up proc(0x7f268c116f40) con(0x4224) skgpid ospid 2599 with 0 clients, refcount 0
2018-12-06 10:16:35.506: [ CSSD][2489263872]clssgmDiscEndpcl: gipcDestroy 0x4224
2018-12-06 10:16:35.754: [ CSSD][2275362560]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 439510562, wrtcnt, 336931, LATS 4294670080, lastSeqNo 336928, uniqueness 1544015571, timestamp 1544025004/8530004
2018-12-06 10:16:35.992: [ CSSD][2261169920]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2018-12-06 10:16:36.288: [ CSSD][2265900800]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 439510562, wrtcnt, 336933, LATS 4294670610, lastSeqNo 336930, uniqueness 1544015571, timestamp 1544025004/8530494
2018-12-06 10:16:36.775: [ CSSD][2275362560]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 439510562, wrtcnt, 336934, LATS 4294671100, lastSeqNo 336931, uniqueness 1544015571, timestamp 1544025005/8531004
2018-12-06 10:16:36.993: [ CSSD][2261169920]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2018-12-06 10:16:36.993: [GIPCHALO][2280093440] gipchaLowerProcessNode: no valid interfaces found to node for 4294671320 ms, node 0x7f267c0bd080 { host 'rac1', haName 'CSS_raccls', srcLuid 43347245-1561779d, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [572 : 572], createTime 4294099030, sentRegister 1, localMonitor 1, flags 0x4 }
2018-12-06 10:16:37.007: [GIPCHDEM][2485143296] gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x115c700 [0000000000000010] { gipchaContext : host 'rac2', name 'CSS_raccls', luid '43347245-00000000', numNode 1, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2018-12-06 10:16:37.289: [ CSSD][2265900800]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 439510562, wrtcnt, 336936, LATS 4294671610, lastSeqNo 336933, uniqueness 1544015571, timestamp 1544025005/8531494
2018-12-06 10:16:37.784: [ CSSD][2275362560]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 439510562, wrtcnt, 336937, LATS 4294672110, lastSeqNo 336934, uniqueness 1544015571, timestamp 1544025006/8532004
2018-12-06 10:16:37.907: [ CSSD][2489263872]clssgmExecuteClientRequest: MAINT recvd from proc 2 (0x7f268c0593b0)
2018-12-06 10:16:37.907: [ CSSD][2489263872]clssgmShutDown: Received abortive shutdown request from client.
2018-12-06 10:16:37.907: [ CSSD][2489263872]###################################
2018-12-06 10:16:37.907: [ CSSD][2489263872]clssscExit: CSSD aborting from thread GMClientListener
2018-12-06 10:16:37.907: [ CSSD][2489263872]###################################
2018-12-06 10:16:37.907: [ CSSD][2489263872](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
[[email protected] bin]#
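The repeated "has a disk HB, but no network HB" lines are easy to lose among the other messages. The sketch below counts them per remote node, assuming the message layout seen in the log excerpt above. `hb_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: count "disk HB but no network HB" messages per node.
# $1 is the path to an ocssd.log file.
hb_summary() {
    grep 'has a disk HB, but no network HB' "$1" |
        # Keep only the node name that follows "node <number>, ".
        sed -n 's/.*clssnmvDHBValidateNcopy: node \([0-9]*\), \([^,]*\),.*/\2/p' |
        sort | uniq -c |
        awk '{ print $2 ": " $1 }'
}
```

For the log above this would be called as `hb_summary /u01/oracle/app/grid/home/log/rac2/cssd/ocssd.log`. A steadily growing count for a node means its disk heartbeat is still visible while its network heartbeat is not.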
-- Following the ocssd.log messages, inter-node connectivity was checked: rac1 and rac2 cannot ping each other on their public addresses
[[email protected] ~]$ ping rac2
PING rac2 (20.20.20.37) 56(84) bytes of data.
From rac1 (20.20.20.34) icmp_seq=1 Destination Host Unreachable
From rac1 (20.20.20.34) icmp_seq=2 Destination Host Unreachable
From rac1 (20.20.20.34) icmp_seq=3 Destination Host Unreachable
^C
--- rac2 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4998ms
pipe 3
[[email protected] ~]$
[[email protected] bin]# ping rac1
PING rac1 (20.20.20.34) 56(84) bytes of data.
From rac2 (20.20.20.37) icmp_seq=2 Destination Host Unreachable
From rac2 (20.20.20.37) icmp_seq=3 Destination Host Unreachable
From rac2 (20.20.20.37) icmp_seq=4 Destination Host Unreachable
From rac2 (20.20.20.37) icmp_seq=5 Destination Host Unreachable
From rac2 (20.20.20.37) icmp_seq=6 Destination Host Unreachable
From rac2 (20.20.20.37) icmp_seq=7 Destination Host Unreachable
^C
--- rac1 ping statistics ---
20 packets transmitted, 0 received, +15 errors, 100% packet loss, time 19395ms
pipe 4
[[email protected] bin]#
-- The private (priv) interfaces of rac1 and rac2, however, can reach each other
[[email protected] bin]# ping rac1priv
PING rac1priv (172.25.25.1) 56(84) bytes of data.
64 bytes from rac1priv (172.25.25.1): icmp_seq=1 ttl=64 time=0.636 ms
^C
--- rac1priv ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 727ms
rtt min/avg/max/mdev = 0.636/0.636/0.636/0.000 ms
[[email protected] bin]# ping rac2priv
PING rac2priv (172.25.25.2) 56(84) bytes of data.
64 bytes from rac2priv (172.25.25.2): icmp_seq=1 ttl=64 time=0.018 ms
64 bytes from rac2priv (172.25.25.2): icmp_seq=2 ttl=64 time=0.030 ms
^C
--- rac2priv ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1300ms
rtt min/avg/max/mdev = 0.018/0.024/0.030/0.006 ms
[[email protected] bin]#
[[email protected] ~]$ ping rac1priv
PING rac1priv (172.25.25.1) 56(84) bytes of data.
64 bytes from rac1priv (172.25.25.1): icmp_seq=1 ttl=64 time=0.016 ms
^C
--- rac1priv ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 897ms
rtt min/avg/max/mdev = 0.016/0.016/0.016/0.000 ms
[[email protected] ~]$ ping rac2priv
PING rac2priv (172.25.25.2) 56(84) bytes of data.
64 bytes from rac2priv (172.25.25.2): icmp_seq=1 ttl=64 time=0.162 ms
^C
--- rac2priv ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 819ms
rtt min/avg/max/mdev = 0.162/0.162/0.162/0.000 ms
[[email protected] ~]$
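The ping checks above can be repeated in one pass on each node. This is a minimal sketch, assuming the four hostnames used in this article and the Linux `ping` options `-c` (count) and `-W` (timeout). `check_reach` and the `PING_CMD` override are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: probe each hostname and report whether it answers.
# PING_CMD may be overridden; by default two ICMP probes with a 2 s timeout.
check_reach() {
    for host in "$@"; do
        if ${PING_CMD:-ping -c 2 -W 2} "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
        fi
    done
}
```

Example: `check_reach rac1 rac2 rac1priv rac2priv`. In the situation described here, the public names would be reported as unreachable from the other node while the priv names would still answer.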
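-- The interface and IP address information on rac2 shows that all four physical NICs (eth0 to eth3) carry 172.25.25.2, the address of bond1, the private cluster interface.
bond0 does have its IP address, but it cannot be pinged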
[[email protected] ~]# ifconfig -a
bond0 Link encap:Ethernet HWaddr 08:00:27:11:69:E3
inet addr:20.20.20.37 Bcast:20.20.20.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe11:69e3/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:355 errors:0 dropped:0 overruns:0 frame:0
TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:220099 (214.9 KiB) TX bytes:4611 (4.5 KiB)
bond1 Link encap:Ethernet HWaddr 08:00:27:72:BE:6E
inet addr:172.25.25.2 Bcast:172.25.25.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe72:be6e/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:296 errors:0 dropped:0 overruns:0 frame:0
TX packets:188 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:185467 (181.1 KiB) TX bytes:27072 (26.4 KiB)
eth0 Link encap:Ethernet HWaddr 08:00:27:11:69:E3
inet addr:172.25.25.2 Bcast:172.25.25.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:110 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:88432 (86.3 KiB) TX bytes:0 (0.0 b)
eth1 Link encap:Ethernet HWaddr 08:00:27:9D:E6:00
inet addr:172.25.25.2 Bcast:172.25.25.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:245 errors:0 dropped:0 overruns:0 frame:0
TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:131667 (128.5 KiB) TX bytes:4611 (4.5 KiB)
eth2 Link encap:Ethernet HWaddr 08:00:27:72:BE:6E
inet addr:172.25.25.2 Bcast:172.25.25.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:230 errors:0 dropped:0 overruns:0 frame:0
TX packets:188 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:127461 (124.4 KiB) TX bytes:27072 (26.4 KiB)
eth3 Link encap:Ethernet HWaddr 08:00:27:CA:0D:9C
inet addr:172.25.25.2 Bcast:172.25.25.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:66 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:58006 (56.6 KiB) TX bytes:0 (0.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:20 errors:0 dropped:0 overruns:0 frame:0
TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1141 (1.1 KiB) TX bytes:1141 (1.1 KiB)
[[email protected] ~]#
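In a healthy bond only the bond master carries an IP address, so the symptom above can be detected mechanically. The sketch below flags bonding slaves that carry an IPv4 address, assuming the legacy net-tools `ifconfig -a` format shown in this article. `slaves_with_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: read legacy `ifconfig -a` output on stdin and report
# every interface flagged SLAVE that also carries an IPv4 address.
slaves_with_ip() {
    awk '
        # Start of an interface block: remember its name, reset the address.
        / Link encap:/ { ifname = $1; addr = ""; next }
        # IPv4 line looks like "inet addr:172.25.25.2 Bcast:..."
        /inet addr:/   { sub(/^addr:/, "", $2); addr = $2; next }
        # Flags line: a SLAVE interface should not have an address of its own.
        / SLAVE / && addr != "" {
            print ifname " is a bonding slave but has address " addr
        }
    '
}
```

Usage: `ifconfig -a | slaves_with_ip`. On rac2 in the broken state this would list eth0 through eth3; once the bonds are configured correctly it should print nothing.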
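-- If service network restart is run as root, both bond interfaces lose their IP addresses and no external machine can reach rac2 over the network
-- This abnormal bonding behaviour is related to the RHEL NetworkManager service managing the network interfaces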
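3. Resolution
When a RHEL host uses bonded NICs, letting NetworkManager manage the interfaces can leave the bond interfaces with incorrect
IP address bindings. NetworkManager needs to be stopped and disabled at boot.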
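The commands for this step, collected in one place as a sketch. They are the same RHEL 6 `service` and `chkconfig` invocations that appear elsewhere in this article. The `disable_nm` wrapper and its `DRY_RUN` switch are hypothetical additions so that the sequence can be previewed before it is run as root.

```shell
# Hypothetical wrapper: stop NetworkManager, keep it from starting at boot,
# then restart the classic network service so the bonds are rebuilt.
# With DRY_RUN set to any non-empty value the commands are only printed.
disable_nm() {
    run=${DRY_RUN:+echo}
    $run service NetworkManager stop
    $run chkconfig NetworkManager off
    $run service network restart
}
```

`DRY_RUN=1 disable_nm` prints the three commands. Calling `disable_nm` as root executes them. Restarting the network on a cluster node interrupts connectivity, so this is best done from the console rather than over SSH.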
-- Stop NetworkManager and disable it at boot
-- Restart the network service and confirm that the bond interfaces look normal
[[email protected] ~]# ifconfig -a
bond0 Link encap:Ethernet HWaddr 08:00:27:11:69:E3
inet addr:20.20.20.37 Bcast:20.20.20.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe11:69e3/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:5988 errors:0 dropped:0 overruns:0 frame:0
TX packets:173 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5637836 (5.3 MiB) TX bytes:21701 (21.1 KiB)
bond1 Link encap:Ethernet HWaddr 08:00:27:72:BE:6E
inet addr:172.25.25.2 Bcast:172.25.25.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe72:be6e/64 Scope:Link
UP BROADCAST MASTER MULTICAST MTU:1500 Metric:1
RX packets:7780 errors:0 dropped:0 overruns:0 frame:0
TX packets:2532 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5549280 (5.2 MiB) TX bytes:368936 (360.2 KiB)
eth0 Link encap:Ethernet HWaddr 08:00:27:11:69:E3
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:2171 errors:0 dropped:0 overruns:0 frame:0
TX packets:73 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2021551 (1.9 MiB) TX bytes:7343 (7.1 KiB)
eth1 Link encap:Ethernet HWaddr 08:00:27:9D:E6:00
BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3817 errors:0 dropped:0 overruns:0 frame:0
TX packets:100 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3616285 (3.4 MiB) TX bytes:14358 (14.0 KiB)
eth2 Link encap:Ethernet HWaddr 08:00:27:72:BE:6E
BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
RX packets:5916 errors:0 dropped:0 overruns:0 frame:0
TX packets:2532 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3764442 (3.5 MiB) TX bytes:368936 (360.2 KiB)
eth3 Link encap:Ethernet HWaddr 08:00:27:CA:0D:9C
BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
RX packets:1864 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1784838 (1.7 MiB) TX bytes:0 (0.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:811 errors:0 dropped:0 overruns:0 frame:0
TX packets:811 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4242930 (4.0 MiB) TX bytes:4242930 (4.0 MiB)
[[email protected] ~]#
-- Check the cluster stack on rac2: this single node can now start
[[email protected] ~]$ crsctl status res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac2 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE rac2
ora.crf
1 ONLINE ONLINE rac2
ora.crsd
1 ONLINE ONLINE rac2
ora.cssd
1 ONLINE ONLINE rac2
ora.cssdmonitor
1 ONLINE ONLINE rac2
ora.ctssd
1 ONLINE ONLINE rac2 ACTIVE:0
ora.diskmon
1 OFFLINE OFFLINE
ora.drivers.acfs
1 ONLINE ONLINE rac2
ora.evmd
1 ONLINE ONLINE rac2
ora.gipcd
1 ONLINE ONLINE rac2
ora.gpnpd
1 ONLINE ONLINE rac2
ora.mdnsd
1 ONLINE ONLINE rac2
[[email protected] ~]$
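A side note: NetworkManager was also running on rac1. Unlike rac2, whose eth NICs all carried the private address, the eth NICs on rac1
all carried the public address. Even so, the cluster stack on rac1 started normally.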
[[email protected] ~]# service NetworkManager status
NetworkManager (pid 1848) is running...
[[email protected] ~]#
-- The eth NICs on rac1 all carry the public address (20.20.20.34)
[[email protected] ~]$ ifconfig -a
bond0 Link encap:Ethernet HWaddr 08:00:27:F8:54:9E
inet addr:20.20.20.34 Bcast:20.20.20.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fef8:549e/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:42512 errors:0 dropped:0 overruns:0 frame:0
TX packets:1542 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:39197489 (37.3 MiB) TX bytes:170518 (166.5 KiB)
bond0:1 Link encap:Ethernet HWaddr 08:00:27:F8:54:9E
inet addr:20.20.20.26 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond0:2 Link encap:Ethernet HWaddr 08:00:27:F8:54:9E
inet addr:20.20.20.28 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond0:3 Link encap:Ethernet HWaddr 08:00:27:F8:54:9E
inet addr:20.20.20.25 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond1 Link encap:Ethernet HWaddr 08:00:27:6F:A8:F7
inet addr:172.25.25.1 Bcast:172.25.25.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe6f:a8f7/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:27140 errors:0 dropped:0 overruns:0 frame:0
TX packets:43504 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:14877318 (14.1 MiB) TX bytes:39723988 (37.8 MiB)
bond1:1 Link encap:Ethernet HWaddr 08:00:27:6F:A8:F7
inet addr:169.254.220.217 Bcast:169.254.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
eth0 Link encap:Ethernet HWaddr 08:00:27:F8:54:9E
inet addr:20.20.20.34 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:22881 errors:0 dropped:0 overruns:0 frame:0
TX packets:1391 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:20771703 (19.8 MiB) TX bytes:158110 (154.4 KiB)
eth1 Link encap:Ethernet HWaddr 08:00:27:76:6A:BA
inet addr:20.20.20.34 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:19631 errors:0 dropped:0 overruns:0 frame:0
TX packets:151 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18425786 (17.5 MiB) TX bytes:12408 (12.1 KiB)
eth2 Link encap:Ethernet HWaddr 08:00:27:6F:A8:F7
inet addr:20.20.20.34 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:13048 errors:0 dropped:0 overruns:0 frame:0
TX packets:43504 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1914331 (1.8 MiB) TX bytes:39723988 (37.8 MiB)
eth3 Link encap:Ethernet HWaddr 08:00:27:CA:32:4E
inet addr:20.20.20.34 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:14092 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:12962987 (12.3 MiB) TX bytes:0 (0.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:79993 errors:0 dropped:0 overruns:0 frame:0
TX packets:79993 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:55828936 (53.2 MiB) TX bytes:55828936 (53.2 MiB)
[[email protected] ~]$
-- Although the eth NICs on rac1 all carry the public address, the cluster stack on rac1 starts normally
[[email protected] ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.ARCHDG.dg ora....up.type ONLINE ONLINE rac1
ora.CRSDG.dg ora....up.type ONLINE ONLINE rac1
ora.DATADG.dg ora....up.type ONLINE ONLINE rac1
ora....ER.lsnr ora....er.type ONLINE ONLINE rac1
ora....N1.lsnr ora....er.type ONLINE ONLINE rac1
ora.asm ora.asm.type ONLINE ONLINE rac1
ora.cvu ora.cvu.type ONLINE ONLINE rac1
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....network ora....rk.type ONLINE ONLINE rac1
ora.oc4j ora.oc4j.type ONLINE ONLINE rac1
ora.ons ora.ons.type ONLINE ONLINE rac1
ora.orcl.db ora....se.type ONLINE ONLINE rac1
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application OFFLINE OFFLINE
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip ora....t1.type ONLINE ONLINE rac1
ora.rac2.vip ora....t1.type ONLINE ONLINE rac1
ora....ry.acfs ora....fs.type ONLINE ONLINE rac1
ora.scan1.vip ora....ip.type ONLINE ONLINE rac1
[[email protected] ~]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac1 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE rac1
ora.crf
1 ONLINE ONLINE rac1
ora.crsd
1 ONLINE ONLINE rac1
ora.cssd
1 ONLINE ONLINE rac1
ora.cssdmonitor
1 ONLINE ONLINE rac1
ora.ctssd
1 ONLINE ONLINE rac1 ACTIVE:0
ora.diskmon
1 OFFLINE OFFLINE
ora.drivers.acfs
1 ONLINE ONLINE rac1
ora.evmd
1 ONLINE ONLINE rac1
ora.gipcd
1 ONLINE ONLINE rac1
ora.gpnpd
1 ONLINE ONLINE rac1
ora.mdnsd
1 ONLINE ONLINE rac1
[[email protected] ~]$
Back to fixing the cluster startup problem.
-- Stop the NetworkManager service on rac1, disable it at boot, then reboot the host
[[email protected] ~]# service NetworkManager status
NetworkManager (pid 1848) is running...
[[email protected] ~]# service NetworkManager stop
Stopping NetworkManager daemon: [OK]
[[email protected] ~]#
[[email protected] ~]# chkconfig NetworkManager off
[[email protected] ~]#
[[email protected] ~]# reboot
Broadcast message from [email protected]
(/dev/pts/0) at 0:16 ...
The system is going down for reboot NOW!
[[email protected] ~]#
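4. Result: after NetworkManager was disabled on both nodes and both servers were rebooted together, the Oracle RAC cluster returned to normal
-- With NetworkManager adjusted on both nodes, check the cluster status: the Oracle RAC cluster is back to normal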
[[email protected] ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.CRSDG.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.DATADG.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.registry.acfs
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2
ora.cvu
1 ONLINE ONLINE rac1
ora.oc4j
1 ONLINE ONLINE rac1
ora.orcl.db
1 ONLINE ONLINE rac2 Open
2 ONLINE ONLINE rac1 Open
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac2
[[email protected] ~]$
[[email protected] ~]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac1 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE rac1
ora.crf
1 ONLINE ONLINE rac1
ora.crsd
1 ONLINE ONLINE rac1
ora.cssd
1 ONLINE ONLINE rac1
ora.cssdmonitor
1 ONLINE ONLINE rac1
ora.ctssd
1 ONLINE ONLINE rac1 ACTIVE:0
ora.diskmon
1 OFFLINE OFFLINE
ora.drivers.acfs
1 ONLINE ONLINE rac1
ora.evmd
1 ONLINE ONLINE rac1
ora.gipcd
1 ONLINE ONLINE rac1
ora.gpnpd
1 ONLINE ONLINE rac1
ora.mdnsd
1 ONLINE ONLINE rac1
[[email protected] ~]$
[[email protected] ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.ARCHDG.dg ora....up.type ONLINE ONLINE rac1
ora.CRSDG.dg ora....up.type ONLINE ONLINE rac1
ora.DATADG.dg ora....up.type ONLINE ONLINE rac1
ora....ER.lsnr ora....er.type ONLINE ONLINE rac1
ora....N1.lsnr ora....er.type ONLINE ONLINE rac2
ora.asm ora.asm.type ONLINE ONLINE rac1
ora.cvu ora.cvu.type ONLINE ONLINE rac1
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....network ora....rk.type ONLINE ONLINE rac1
ora.oc4j ora.oc4j.type ONLINE ONLINE rac1
ora.ons ora.ons.type ONLINE ONLINE rac1
ora.orcl.db ora....se.type ONLINE ONLINE rac2
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application OFFLINE OFFLINE
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip ora....t1.type ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application OFFLINE OFFLINE
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip ora....t1.type ONLINE ONLINE rac2
ora....ry.acfs ora....fs.type ONLINE ONLINE rac1
ora.scan1.vip ora....ip.type ONLINE ONLINE rac2
[[email protected] ~]$