Installing Oracle RAC 11.2.0.4 on CentOS 7.4: errors and how they were handled
$ cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
1 Binding the shared disks with udev
The /sbin/scsi_id command used on CentOS 6 does not exist on CentOS 7; use /usr/lib/udev/scsi_id instead.
-- Disks without partitions
for i in b c d e f g;
do
echo "KERNEL==\"sd*\", SUBSYSTEM==\"block\", PROGRAM==\"/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/\$name\", RESULT==\"`/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i`\", NAME=\"asm-disk$i\", OWNER=\"grid\", GROUP=\"asmadmin\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
done
[root@rac01 ~]# cat /etc/udev/rules.d/99-oracle-asmdevices.rules
KERNEL=="sd*", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="36000c29ea85262d4a23086fbce428b09", NAME="asm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660"
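CentOS 6 and 7 differ here. On CentOS 7 the rule has to use SYMLINK+= instead of NAME=, otherwise udev logs the errors shown below.
CentOS 7: SYMLINK+=\"asm-disk$i\"
CentOS 6: NAME=\"asm-disk$i\"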
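The CentOS 7 form of the rule can be sketched as a small helper. This is only an illustration under assumptions: the function name make_asm_rule is hypothetical, and the rule text is the one from this post with NAME= swapped for SYMLINK+=.

```shell
#!/bin/bash
# Build one CentOS 7 style udev rule line for an ASM disk.
# make_asm_rule is a hypothetical helper name, not part of udev or Oracle.
# $1: disk suffix (b for /dev/sdb), $2: WWID as printed by scsi_id.
make_asm_rule() {
    local suffix=$1 wwid=$2
    # Single quotes keep $name literal so udev, not the shell, expands it.
    printf 'KERNEL=="sd*", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="%s", SYMLINK+="asm-disk%s", OWNER="grid", GROUP="asmadmin", MODE="0660"\n' "$wwid" "$suffix"
}

# Example, run as root:
# for i in b c d e f g; do
#     make_asm_rule "$i" "$(/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i)"
# done > /etc/udev/rules.d/99-oracle-asmdevices.rules
```

The example redirects with > rather than >> so that rules left over from earlier attempts do not stay in the file; whether that is wanted depends on what else the file holds.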
[root@rac01 ~]# ls -l /dev/asm*
Jul 31 16:31:04 rac01 systemd-udevd[664]: unknown key 'BUS' in /etc/udev/rules.d/99-oracle-asmdevices.rules:11
Jul 31 16:31:04 rac01 systemd-udevd[664]: invalid rule '/etc/udev/rules.d/99-oracle-asmdevices.rules:11'
Jul 31 16:31:04 rac01 systemd-udevd[664]: unknown key 'BUS' in /etc/udev/rules.d/99-oracle-asmdevices.rules:12
Jul 31 16:31:04 rac01 systemd-udevd[664]: invalid rule '/etc/udev/rules.d/99-oracle-asmdevices.rules:12'
Jul 31 16:44:37 rac01 systemd-udevd[7121]: NAME="asm-diskb" ignored, kernel device nodes can not be renamed; please fix it in /etc/udev/rules.d/99-oracle-asmdevices.rules:1
Jul 31 16:44:41 rac01 systemd-udevd[7133]: NAME="asm-diskc" ignored, kernel device nodes can not be renamed; please fix it in /etc/udev/rules.d/99-oracle-asmdevices.rules:2
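Both messages in this log (unknown key 'BUS', and NAME= ignored) can be caught before reloading udev. The following is a rough sketch, assuming the rules file only describes block devices; find_legacy_udev_keys is a hypothetical name, and the pattern covers only the two keys that appear in the log here.

```shell
#!/bin/bash
# Print rule lines that use keys systemd-udevd complained about in this post:
# BUS== (unknown key) and NAME="..." (kernel device nodes cannot be renamed).
# Output is "<line number>:<rule text>"; exit status is 0 if any line matched.
# find_legacy_udev_keys is a hypothetical helper name.
find_legacy_udev_keys() {
    local rules_file=$1
    # The leading (^|[ ,]) keeps SYMLINK+= and the lowercase $name from matching.
    grep -nE '(^|[ ,])(BUS==|NAME=")' "$rules_file"
}

# Example:
# find_legacy_udev_keys /etc/udev/rules.d/99-oracle-asmdevices.rules
```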
Reload the partition table
/sbin/partprobe /dev/sdb
[root@rac01 ~]# /usr/lib/udev/scsi_id -g -u /dev/sdb
36000c29ea85262d4a23086fbce428b09
Start udev
/usr/sbin/udevadm control --reload-rules
systemctl status systemd-udevd.service
systemctl enable systemd-udevd.service
[root@rac01 ~]# /sbin/udevadm trigger --type=devices --action=change
[root@rac01 ~]# ll /dev/asm-disk*
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskb -> sdb
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskc -> sdc
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskd -> sdd
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diske -> sde
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskf -> sdf
lrwxrwxrwx. 1 root root 3 Jul 31 16:57 /dev/asm-diskg -> sdg
2 Errors when running the root scripts during the Grid installation
-- Node 1
[root@rac01 ~]# /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.
[root@rac01 ~]# /u01/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME= /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2019-08-01 09:35:59.951: [client(14411)]CRS-2101:The OLR was formatted using version 3.
^CINT at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1446.
/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed
Oracle root script execution aborted!
At first I suspected a permissions problem on the shared disks.
[root@rac01 ~]# ll /dev/asm-disk*
lrwxrwxrwx 1 root root 3 Aug 1 09:07 /dev/asm-diskb -> sdb
## Change the ownership
[root@rac01 ~]# chown grid:asmadmin /dev/asm-disk*
## This made no difference
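A note on the check above: ll on a symlink shows the link itself (lrwxrwxrwx root root), not the device node it points to, and the OWNER, GROUP and MODE set in the udev rule apply to the node. A minimal sketch for looking at the node instead, assuming GNU coreutils stat as shipped on CentOS 7; target_perms is a hypothetical helper name.

```shell
#!/bin/bash
# Print "owner:group mode" for the file a path resolves to.
# -L makes stat follow symlinks, so /dev/asm-diskb reports on /dev/sdb.
# target_perms is a hypothetical helper name; -L and -c are GNU stat options.
target_perms() {
    stat -L -c '%U:%G %a' "$1"
}

# Example:
# for d in /dev/asm-disk*; do echo "$d $(target_perms "$d")"; done
```

If the rule was applied, each disk would be expected to report grid:asmadmin 660 here; that expectation comes from the rule text in section 1, not from output captured in this post.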
Reference: https://blog.csdn.net/DBAngelica/article/details/85002591
[root@rac01 ~]# touch /usr/lib/systemd/system/ohasd.service
[root@rac01 ~]# vim /usr/lib/systemd/system/ohasd.service
[Unit]
Description=Oracle High Availability Services
After=syslog.target
[Service]
ExecStart=/etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
Restart=always
[Install]
WantedBy=multi-user.target
[root@rac01 ~]# systemctl daemon-reload
[root@rac01 ~]# systemctl enable ohasd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/ohasd.service to /usr/lib/systemd/system/ohasd.service.
[root@rac01 ~]# systemctl start ohasd.service
[root@rac01 ~]# systemctl status ohasd.service
● ohasd.service - Oracle High Availability Services
   Loaded: loaded (/usr/lib/systemd/system/ohasd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-08-01 10:53:38 CST; 6s ago
 Main PID: 18621 (init.ohasd)
   CGroup: /system.slice/ohasd.service
           └─18621 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
Aug 01 10:53:38 rac01 systemd[1]: Started Oracle High Availability Services.
Aug 01 10:53:38 rac01 systemd[1]: Starting Oracle High Availability Services...
[root@rac01 ~]# /u01/app/11.2.0/grid/root.sh
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   61e053dbaca94f40bfa468e31c9c927f (/dev/asm-diskb) [OCR]
 2. ONLINE   6b25d06268b84fe9bfc6125298d94018 (/dev/asm-diskd) [OCR]
 3. ONLINE   b1fd0f59a3474f92bf0b2d3344fe91cc (/dev/asm-diskc) [OCR]
Located 3 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'rac01'
CRS-2676: Start of 'ora.asm' on 'rac01' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'rac01'
CRS-2676: Start of 'ora.OCR.dg' on 'rac01' succeeded
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
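One detail in the status output above: systemd does not interpret shell redirection in ExecStart, so ">/dev/null 2>&1 Type=simple" is handed to init.ohasd as plain arguments, which is exactly what the CGroup line shows. A possible cleaner variant is sketched here, with the caveat that it is an assumption and was not run against Grid 11.2.0.4 in this post. write_ohasd_unit is a hypothetical helper name; StandardOutput= and StandardError= are standard systemd service options.

```shell
#!/bin/bash
# Write an ohasd unit file to the path given as $1.
# Differences from the unit used in this post (untested assumptions):
#   - Type=simple sits on its own line instead of trailing ExecStart
#   - StandardOutput/StandardError=null replace the shell redirection
# write_ohasd_unit is a hypothetical helper name.
write_ohasd_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Oracle High Availability Services
After=syslog.target

[Service]
Type=simple
ExecStart=/etc/init.d/init.ohasd run
StandardOutput=null
StandardError=null
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

# Example, run as root:
# write_ohasd_unit /usr/lib/systemd/system/ohasd.service && systemctl daemon-reload
```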
-- Node 2: root.sh fails
Note: to keep the remaining nodes from hitting this error, wait during the root.sh run until the init.ohasd file has been created under /etc/init.d/, then run systemctl start ohasd.service to start the ohasd service. If /etc/init.d/init.ohasd does not exist yet, systemctl start ohasd.service fails:
[root@rac02 ~]# systemctl status ohasd.service
● ohasd.service - Oracle High Availability Services
   Loaded: loaded (/usr/lib/systemd/system/ohasd.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2019-08-01 11:03:58 CST; 3s ago
  Process: 22754 ExecStart=/etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple (code=exited, status=203/EXEC)
 Main PID: 22754 (code=exited, status=203/EXEC)
Aug 01 11:03:57 rac02 systemd[1]: Unit ohasd.service entered failed state.
Aug 01 11:03:57 rac02 systemd[1]: ohasd.service failed.
Aug 01 11:03:58 rac02 systemd[1]: ohasd.service holdoff time over, scheduling restart.
Aug 01 11:03:58 rac02 systemd[1]: start request repeated too quickly for ohasd.service
Aug 01 11:03:58 rac02 systemd[1]: Failed to start Oracle High Availability Services.
Aug 01 11:03:58 rac02 systemd[1]: Unit ohasd.service entered failed state.
Aug 01 11:03:58 rac02 systemd[1]: ohasd.service failed.
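The timing described in the note can be scripted instead of watched by hand. This is a minimal sketch: wait_for_executable is a hypothetical helper name, and the 600 second timeout in the example is an arbitrary assumption. The 203/EXEC status in the output above is what systemd reports when it cannot execute the ExecStart binary, which fits the file not existing yet.

```shell
#!/bin/bash
# Poll once per second until the given file exists and is executable.
# $1: path, $2: timeout in seconds. Returns 0 on success, 1 on timeout.
# wait_for_executable is a hypothetical helper name.
wait_for_executable() {
    local path=$1 timeout=$2 waited=0
    until [ -x "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example, run in a second root shell while root.sh is executing:
# wait_for_executable /etc/init.d/init.ohasd 600 && systemctl start ohasd.service
```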
Error logs
[root@rac02 ~]# ll /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@rac02 ~]# ll /etc/init.d/init.ohasd
-rwxr-xr-x 1 root root 8782 Aug 1 11:06 /etc/init.d/init.ohasd
[root@rac02 ~]# systemctl start ohasd.service
[root@rac02 ~]# systemctl status ohasd.service
● ohasd.service - Oracle High Availability Services
   Loaded: loaded (/usr/lib/systemd/system/ohasd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-08-01 11:06:20 CST; 4s ago
 Main PID: 24186 (init.ohasd)
   CGroup: /system.slice/ohasd.service
           ├─24186 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
           └─24211 /bin/sleep 10
Aug 01 11:06:20 rac02 systemd[1]: Started Oracle High Availability Services.
Aug 01 11:06:20 rac02 systemd[1]: Starting Oracle High Availability Services...
[root@rac01 rac01]# tail -n 100 -f /u01/app/11.2.0/grid/log/rac01/alertrac01.log
2019-08-01 14:16:30.453: [cssd(21789)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac01 rac02 .
[root@rac02 ~]# tail -n 100 -f /u01/app/11.2.0/grid/log/rac02/alertrac02.log
The execution of the script is complete.
2019-08-01 14:15:48.037: [ohasd(3604)]CRS-2112:The OLR service started on node rac02.
2019-08-01 14:15:48.059: [ohasd(3604)]CRS-1301:Oracle High Availability Service started on node rac02.
2019-08-01 14:15:48.060: [ohasd(3604)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2019-08-01 14:15:48.545: [/u01/app/11.2.0/grid/bin/oraagent.bin(6497)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:15:51.622: [/u01/app/11.2.0/grid/bin/orarootagent.bin(6501)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2019-08-01 14:15:53.823: [gpnpd(6592)]CRS-2328:GPNPD started on node rac02.
2019-08-01 14:15:56.234: [cssd(6658)]CRS-1713:CSSD daemon is started in clustered mode
2019-08-01 14:15:58.006: [ohasd(3604)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2019-08-01 14:15:58.006: [ohasd(3604)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2019-08-01 14:16:21.832: [cssd(6658)]CRS-1707:Lease acquisition for node rac02 number 2 completed
2019-08-01 14:16:23.138: [cssd(6658)]CRS-1605:CSSD voting file is online: /dev/asm-diskc; details in /u01/app/11.2.0/grid/log/rac02/cssd/ocssd.log.
2019-08-01 14:16:23.140: [cssd(6658)]CRS-1605:CSSD voting file is online: /dev/asm-diskd; details in /u01/app/11.2.0/grid/log/rac02/cssd/ocssd.log.
2019-08-01 14:16:23.146: [cssd(6658)]CRS-1605:CSSD voting file is online: /dev/asm-diskb; details in /u01/app/11.2.0/grid/log/rac02/cssd/ocssd.log.
2019-08-01 14:16:29.466: [cssd(6658)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac01 rac02 .
2019-08-01 14:16:31.434: [ctssd(7290)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac01.
2019-08-01 14:16:31.435: [ctssd(7290)]CRS-2401:The Cluster Time Synchronization Service started on host rac02.
2019-08-01 14:16:33.170: [ohasd(3604)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2019-08-01 14:16:33.171: [ohasd(3604)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2019-08-01 14:17:30.167: [/u01/app/11.2.0/grid/bin/orarootagent.bin(6603)]CRS-5818:Aborted command 'start' for resource 'ora.ctssd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/rac02/agent/ohasd/orarootagent_root/orarootagent_root.log.
2019-08-01 14:17:34.169: [ohasd(3604)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.ctssd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0/grid/log/rac02/ohasd/ohasd.log.
2019-08-01 14:17:34.183: [ohasd(3604)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2019-08-01 14:17:34.183: [ohasd(3604)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2019-08-01 14:17:34.183: [ohasd(3604)]CRS-2807:Resource 'ora.evmd' failed to start automatically.
2019-08-01 14:17:51.734: [/u01/app/11.2.0/grid/bin/oraagent.bin(6568)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:19:04.174: [ohasd(3604)]CRS-2765:Resource 'ora.ctssd' has failed on server 'rac02'.
2019-08-01 14:19:06.776: [ctssd(8408)]CRS-2401:The Cluster Time Synchronization Service started on host rac02.
2019-08-01 14:19:06.776: [ctssd(8408)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac01.
2019-08-01 14:19:07.533: [/u01/app/11.2.0/grid/bin/oraagent.bin(6568)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:19:13.266: [/u01/app/11.2.0/grid/bin/oraagent.bin(6568)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/rac02/agent/ohasd/oraagent_grid/oraagent_grid.log"
2019-08-01 14:19:36.864: [crsd(8918)]CRS-1012:The OCR service started on node rac02.
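In an alert log this long, the resources that ohasd gave up on are easy to miss. As a rough aid, the CRS-2807 messages can be filtered out. The sketch below assumes the message wording seen in this log; failed_autostart_resources is a hypothetical helper name.

```shell
#!/bin/bash
# Read alert-log text on stdin and print each resource named in a
# CRS-2807 ("failed to start automatically") message, one per line, deduplicated.
# failed_autostart_resources is a hypothetical helper name.
failed_autostart_resources() {
    # grep -o also handles several messages sitting on one physical line.
    grep -o "CRS-2807:Resource '[^']*'" | sed "s/.*Resource '\(.*\)'/\1/" | sort -u
}

# Example:
# failed_autostart_resources < /u01/app/11.2.0/grid/log/rac02/alertrac02.log
```

For the rac02 excerpt above this would list ora.asm, ora.crsd and ora.evmd.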
/u01/app/11.2.0/grid/log/rac01/agent/ohasd/oraagent_grid/oraagent_grid.log
2019-08-01 14:40:53.671: [ora.gipcd][4109874944]{0:0:156} [check] clsdmc_respget return: status=0, ecode=0
2019-08-01 14:41:18.893: [ CRSCOMM][4152755968] IpcC: IPC client connection 18 to member 0 has been removed
2019-08-01 14:41:18.893: [CLSFRAME][4152755968] Removing IPC Member:{Relative|Node:0|Process:0|Type:2}
2019-08-01 14:41:18.893: [CLSFRAME][4152755968] Disconnected from OHASD:rac01 process: {Relative|Node:0|Process:0|Type:2}
2019-08-01 14:41:18.894: [ AGENT][4142249728]{0:13:10} {0:13:10} Created alert : (:CRSAGF00117:) : Disconnected from server, Agent is shutting down.
2019-08-01 14:41:18.894: [ AGFW][4142249728]{0:13:10} Agent is exiting with exit code: 1
/u01/app/11.2.0/grid/log/rac01/agent/ohasd/oracssdagent_root/oracssdagent_root.log
2019-08-01 15:02:53.928: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
2019-08-01 15:02:58.928: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
2019-08-01 15:03:03.928: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
2019-08-01 15:03:08.929: [ USRTHRD][1509222144]{0:19:163} clsnomon_HangExit: no member
I applied the same method on node 2 as on node 1, but root.sh never completed successfully.
That included manually running:
# /bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
It still did not work...