Occasionally you cannot avoid changing the hostnames of the two hosts in a RAC cluster, although such cases are fairly unusual.
Research:
Both the official documentation and MOS indicate that Clusterware does not directly support changing a hostname, but the change can be accomplished by deleting the node and then adding it back.
Approach:
/etc/hosts before the change:
192.168.0.234 node1 node1-public
192.168.35.8 node1-priv
192.168.0.235 node1-vip
192.168.0.236 node2 node2-public
192.168.35.9 node2-priv
192.168.0.237 node2-vip
192.168.0.238 rac-scan
Planned /etc/hosts after the change:
192.168.0.234 r1 r1-public
192.168.35.8 r1-priv
192.168.0.235 r1-vip
192.168.0.236 r2 r2-public
192.168.35.9 r2-priv
192.168.0.237 r2-vip
192.168.0.238 rac-scan
Method: to change the hostnames of the two nodes (node1, node2) to r1 and r2: delete node 2, change its hostname to r2, and add it back into the CRS; then delete node 1, change its hostname to r1, and add it back into the CRS.
This involves four stages: delete node 2, add node 2, delete node 1, add node 1.
(For readability, the node currently being deleted is referred to below as the old node.)
1. Delete node 2 (the old node)
A. Check whether node 2 is Active and Unpinned:
olsnodes -s -t
node1 Active Unpinned
node2 Active Unpinned
B. As root on the old node, run the following from GRID_HOME/crs/install to stop the CRS on that node and deconfigure it.
[root@node2 install]# pwd
/u01/app/11.2.0/grid_1/crs/install
[root@node2 install]# ./rootcrs.pl -deconfig -force
Output:
CRS-4133: Oracle High Availability Services has been stopped.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node
C. As root on node 1 (the healthy node), delete the old node from the cluster:
[root@node1 bin]# pwd
/u01/app/11.2.0/grid_1/bin
[root@node1 bin]# ./crsctl delete node -n node2
CRS-4661: Node node2 successfully deleted.
D. As grid on node 2, update the node list on the old node:
[grid@node2 bin]$ pwd
/u01/app/11.2.0/grid_1/oui/bin
[grid@node2 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid_1 "CLUSTER_NODES={node2}" CRS=TRUE -silent -local
Output: 'UpdateNodeList' was successful
E. As grid on node 2, run:
/u01/app/11.2.0/grid_1/deinstall/deinstall -local
F. As grid on node 1, run:
[grid@node1 bin]$ pwd
/u01/app/11.2.0/grid_1/oui/bin
[grid@node1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid_1 "CLUSTER_NODES={node1}" CRS=TRUE -silent -local
Output: 'UpdateNodeList' was successful
G. On node 1, verify that node 2 was removed successfully:
[grid@node1 bin]$ cluvfy stage -post nodedel -n node2 -verbose
H. After node 2 has been removed cleanly, change node 2's hostname to r2 and update /etc/hosts on both nodes as planned above.
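The hostname change itself is an OS-level step. A minimal sketch, assuming a RHEL/OEL 5/6-style system (on systemd-based systems use hostnamectl set-hostname r2 instead; adjust for your platform):
## As root on node 2 (illustrative only)
hostname r2
## Make the change persistent across reboots: set HOSTNAME=r2
vi /etc/sysconfig/network
## Apply the planned host entries shown above on BOTH nodes
vi /etc/hosts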
2. Add node 2 back into the CRS
A. As grid on node 1, check whether node 2 meets the prerequisites for node addition:
[grid@node1 bin]$ cluvfy stage -pre nodeadd -n r2 -fixup -fixupdir /tmp -verbose
Because the hostname has changed, SSH user equivalence for the grid user between the two nodes must be rebuilt; you can configure it manually or use the sshUserSetup.sh script:
[grid@node1 deinstall]$ pwd
/u01/app/11.2.0/grid_1/deinstall
[grid@node1 deinstall]$ ./sshUserSetup.sh -user grid -hosts "node1 r2" -noPromptPassphrase
B. As grid on node 1, run the following from $ORACLE_HOME/oui/bin:
[grid@node1 bin]$ ./addNode.sh "CLUSTER_NEW_NODES={r2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={r2-vip}"
When the command completes, the following warning is displayed:
WARNING:
The following configuration scripts need to be executed as the "root" user in each new cluster node. Each script in the list below is followed by a list of nodes.
/u01/app/11.2.0/grid_1/root.sh #On nodes r2
To execute the configuration scripts:
1. Open a terminal window
2. Log in as "root"
3. Run the scripts in each cluster node
The Cluster Node Addition of /u01/app/11.2.0/grid_1 was successful.
Please check '/tmp/silentInstall.log' for more details.
As instructed, run /u01/app/11.2.0/grid_1/root.sh as root on node 2.
C. Check the CRS status:
crs_stat -t
## Remove the instance still registered under the old node name
[root@r2 bin]# ./srvctl remove instance -d orcl -i orcl2 -f -y
## Add the instance on the new node
srvctl add instance -d orcl -i orcl2 -n r2 -f
## Start the instance on the new node
srvctl start instance -d orcl -i orcl2
## Check the database status across the nodes
srvctl status database -d orcl
Instance orcl1 is running on node node1
Instance orcl2 is running on node r2
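Optionally, you can also confirm that the node applications came back up under the new name. This is an extra sanity check, not part of the original procedure; both commands are standard 11.2 srvctl syntax, run from GRID_HOME/bin:
srvctl status vip -n r2
srvctl status listener -n r2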
3. Delete node 1
A. Remove the instance (run as root from any node; here it was done on node 1):
[root@node1 ~]# cd /u01/app/11.2.0/grid_1/bin/
[root@node1 bin]# ./srvctl stop instance -d orcl -i orcl1
[root@node1 bin]# ./srvctl remove instance -d orcl -i orcl1 -f -y
B. Check whether the nodes are Active:
[root@node1 bin]# ./olsnodes -s -t
node1 Active Unpinned
r2 Active Unpinned
C. As root on the old node, run the following from GRID_HOME/crs/install:
[root@node1 install]# ./rootcrs.pl -deconfig -force
...
CRS-4133: Oracle High Availability Services has been stopped.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node
As the log shows, this step stops and deconfigures the CRS stack on the old node.
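If you want to confirm that the stack really is down on the old node before moving on (an optional check, not part of the original note), crsctl run as root from GRID_HOME/bin should report that it cannot contact Oracle High Availability Services:
[root@node1 bin]# ./crsctl check crs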
D. As root on node 2, delete the old node:
[root@r2 ~]# cd /u01/app/11.2.0/grid_1/bin/
[root@r2 bin]# ./crsctl delete node -n node1
E. As grid on the old node, run:
[grid@node1 ~]$ cd /u01/app/11.2.0/grid_1/oui/bin/
[grid@node1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid_1 "CLUSTER_NODES={node1}" CRS=TRUE -silent -local
Why is the parameter node1 here? Because at this point the hostname has not yet been changed and is still node1; the same reasoning applies to the r2 parameter used below.
F. As grid on the old node, run:
[grid@node1 ~]$ cd /u01/app/11.2.0/grid_1/deinstall/
[grid@node1 deinstall]$ ./deinstall -local
G. As grid on node 2, run:
[grid@r2 ~]$ cd /u01/app/11.2.0/grid_1/oui/bin/
[grid@r2 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid_1 "CLUSTER_NODES={r2}" CRS=TRUE -silent -local
H. As grid on node 2, verify that the old node was removed successfully:
[grid@r2 ~]$ cluvfy stage -post nodedel -n node1 -verbose
I. After node 1 has been removed cleanly, change node 1's hostname to r1 and update /etc/hosts on both nodes as planned (a quick verification sketch follows the host entries):
192.168.0.234 r1 r1-public
192.168.35.8 r1-priv
192.168.0.235 r1-vip
192.168.0.236 r2 r2-public
192.168.35.9 r2-priv
192.168.0.237 r2-vip
192.168.0.238 rac-scan
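Before re-adding the node, it is worth verifying that the new name has taken effect and resolves from both sides. This is an optional sanity check, not part of the original note:
## On r1 (formerly node1): should print r1
hostname
## From r2, confirm the new public and private names resolve and answer
ping -c 2 r1
ping -c 2 r1-priv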
4. On node 2, check the CRS and prepare to add the new node r1
A. Because node1's hostname has been changed to r1, SSH user equivalence must be set up again.
Method 1, automatic setup as the grid user: /u01/app/11.2.0/grid_1/deinstall/sshUserSetup.sh -user grid -hosts "r1 r2" -noPromptPassphrase
Method 2, set up SSH user equivalence manually (see the sketch below).
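A minimal sketch of the manual setup for the grid user, assuming plain key-based SSH is acceptable in your environment; the commands are illustrative, and equivalence must work in both directions (and from each node to itself):
## As grid on r2: generate a key pair if one does not already exist
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
## Push the public key to r1, then repeat these two steps in the other direction from r1
ssh-copy-id grid@r1
## Verify that no password or host-key prompt remains
ssh r1 date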
B. Check whether the new node r1 meets the prerequisites for joining the cluster.
As grid on node 2, run:
cluvfy stage -pre nodeadd -n r1 -fixup -fixupdir /tmp -verbose
C. Add node 1 into the CRS. As grid on node 2, run:
[grid@r2 bin]$ pwd
/u01/app/11.2.0/grid_1/oui/bin
[grid@r2 bin]$ ./addNode.sh "CLUSTER_NEW_NODES={r1}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={r1-vip}"
As instructed, run the root.sh cluster initialization script on the new node r1:
[root@r1 ~]# cd /u01/app/11.2.0/grid_1
[root@r1 grid_1]# ./root.sh
Wait until it reports 'Configure Oracle Grid Infrastructure for a Cluster ... succeeded'.
Then, as before:
## Remove the instance still registered under the old node name
[root@r1 bin]# ./srvctl remove instance -d orcl -i orcl1 -f -y
## Add the instance on the new node
srvctl add instance -d orcl -i orcl1 -n r1 -f
## Start the instance on the new node
srvctl start instance -d orcl -i orcl1
Finally, check the cluster status to confirm that the hostname change was completed successfully.
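As a closing check, the commands below are standard 11.2 clusterware tools run from GRID_HOME/bin; the expected results are paraphrased rather than captured from the original environment:
## Both r1 and r2 should show as Active Unpinned
olsnodes -s -t
## All resources, including r1-vip and r2-vip, should be ONLINE
crsctl stat res -t
## orcl1 should run on r1 and orcl2 on r2
srvctl status database -d orcl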