Testing the Importance of the OLR in an Oracle 11g Cluster

Date: 2024-12-11 13:36:26


Each node in a cluster has a local registry for node-specific resources, called the Oracle Local Registry (OLR).

Test 1: simulate the OLR file going missing unexpectedly:

First, rename the OLR file:

[root@vmrac2 cdata]# mv vmrac2.olr vmrac2.olr.bak

Then try to start CRS:

[root@vmrac2 cdata]# crsctl start crs

CRS-4124: Oracle High Availability Services startup failed.

CRS-4000: Command Start failed, or completed with errors.

Next, look at what the cluster alert log reports:

[grid@vmrac2 vmrac2]$ tailf alertvmrac2.log

[ohasd(2495)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

2014-06-16 16:51:59.491

[ohasd(2506)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

2014-06-16 16:51:59.698

[ohasd(2517)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

2014-06-16 16:51:59.901

[ohasd(2528)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

2014-06-16 16:52:00.113

[ohasd(2539)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

[client(2554)]CRS-10001:CRS-10132: No msg for has:crs-10132 [10][60]

2014-06-16 16:56:00.720

[ohasd(2717)]CRS-2112:The OLR service started on node vmrac2.

2014-06-16 16:56:00.788

[ohasd(2717)]CRS-1301:Oracle High Availability Service started on node vmrac2.

2014-06-16 16:56:00.855

[ohasd(2717)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred

2014-06-16 16:56:01.836

[/u02/app/11.2.0.3/grid/bin/orarootagent.bin(2768)]CRS-5016:Process "/u02/app/11.2.0.3/grid/bin/acfsload" spawned by agent "/u02/app/11.2.0.3/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u02/app/11.2.0.3/grid/log/vmrac2/agent/ohasd/orarootagent_root/orarootagent_root.log"

2014-06-16 16:56:19.876

[ohasd(2717)]CRS-2302:Cannot get GPnP profile.Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).


2014-06-16 16:56:19.909

[gpnpd(2873)]CRS-2328:GPNPD started on node vmrac2.

2014-06-16 16:56:22.751

[cssd(2947)]CRS-1713:CSSD daemon is started in clustered mode

2014-06-16 16:56:24.073

[ohasd(2717)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE

2014-06-16 16:56:32.512

[cssd(2947)]CRS-1707:Lease acquisition for node vmrac2 number 2 completed

2014-06-16 16:56:33.798

[cssd(2947)]CRS-1605:CSSD voting file is online: ORCL:CRSVOL1; details in /u02/app/11.2.0.3/grid/log/vmrac2/cssd/ocssd.log.

2014-06-16 16:56:40.342

[cssd(2947)]CRS-1601:CSSD Reconfiguration complete. Active nodes are vmrac1 vmrac2 .

2014-06-16 16:56:42.635

[ctssd(3009)]CRS-2401:The Cluster Time Synchronization Service started on host vmrac2.

2014-06-16 16:56:42.635

[ctssd(3009)]CRS-2407:The new Cluster Time Synchronization Service reference node is host vmrac1.

2014-06-16 16:56:46.726

[ctssd(3009)]CRS-2408:The clock on host vmrac2 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

[client(3047)]CRS-10001:16-Jun-14 16:56 ACFS-9391: Checking for existing ADVM/ACFS installation.

[client(3060)]CRS-10001:16-Jun-14 16:56 ACFS-9392: Validating ADVM/ACFS installation files for operating system.

[client(3062)]CRS-10001:16-Jun-14 16:56 ACFS-9393: Verifying ASM Administrator setup.

[client(3065)]CRS-10001:16-Jun-14 16:56 ACFS-9308: Loading installed ADVM/ACFS drivers.

[client(3069)]CRS-10001:16-Jun-14 16:56 ACFS-9154: Loading 'oracleoks.ko' driver.

[client(3080)]CRS-10001:16-Jun-14 16:56 ACFS-9154: Loading 'oracleadvm.ko' driver.

[client(3096)]CRS-10001:16-Jun-14 16:56 ACFS-9154: Loading 'oracleacfs.ko' driver.

[client(3180)]CRS-10001:16-Jun-14 16:56 ACFS-9327: Verifying ADVM/ACFS devices.

[client(3183)]CRS-10001:16-Jun-14 16:56 ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.

[client(3187)]CRS-10001:16-Jun-14 16:56 ACFS-9156: Detecting control device '/dev/ofsctl'.

[client(3193)]CRS-10001:16-Jun-14 16:56 ACFS-9322: completed
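The excerpt above has two phases: a burst of CRS-0704 aborts around 16:51:59 to 16:52:00, then a normal startup from 16:56:00 onward (CRS-2112, CRS-1301). The source does not show what happened in between; presumably the OLR file was renamed back and CRS was started again. As a rough sketch, the two phases can be picked out of an alert log with `grep`. The function names below are illustrative, and only message codes that appear in the excerpt are matched.

```shell
#!/bin/sh
# Sketch: summarize OLR-related events in a Clusterware alert log.
# The helper names are hypothetical; the CRS codes come from the excerpt above.

# count_olr_aborts FILE: number of lines carrying CRS-0704
# (OHASD aborted because of an OLR error).
count_olr_aborts() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c 'CRS-0704' "$1" || true
}

# olr_started FILE: succeeds if the log contains CRS-2112
# (the OLR service started).
olr_started() {
    grep -q 'CRS-2112' "$1"
}

# summarize_alert_log FILE: print a one-line summary of both checks.
summarize_alert_log() {
    aborts=$(count_olr_aborts "$1")
    if olr_started "$1"; then
        echo "aborts=$aborts olr_started=yes"
    else
        echo "aborts=$aborts olr_started=no"
    fi
}
```

Calling `summarize_alert_log alertvmrac2.log` would then give a quick view of how many times OHASD gave up on the OLR and whether a later start succeeded. It counts matching lines across the whole file, so on a long-lived log the numbers cover more than one incident.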

Test 2: wipe the OLR contents by replacing the file with an empty one:
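The source does not show the command used to empty the OLR. One possible way to do it, purely as a sketch for a disposable test node, is to keep a copy and then truncate the file in place. The helper name `empty_with_backup` and the `.bak` suffix are assumptions made for illustration, not part of any Oracle tool.

```shell
#!/bin/sh
# Sketch: replace a file with a zero-byte file while keeping a copy.
# empty_with_backup is a hypothetical helper, not an Oracle utility.
# Only for a throwaway test node: an empty OLR prevents OHASD from starting.

empty_with_backup() {
    target=$1
    backup="$target.bak"

    if [ ! -f "$target" ]; then
        echo "no such file: $target" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        # Refuse to overwrite an earlier backup.
        echo "backup already exists: $backup" >&2
        return 1
    fi

    cp -p "$target" "$backup" || return 1   # preserve mode and timestamps
    : > "$target"                           # truncate in place to zero bytes
}
```

Run as root from the same `cdata` directory as in test 1, the call would be `empty_with_backup vmrac2.olr`. Truncating in place, rather than deleting and recreating the file, leaves the original ownership and permissions untouched, so the only thing that changes is the content.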

The alert log then shows the following:

[ohasd(5451)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

2014-06-16 17:19:02.723

[ohasd(5462)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-26: Error while accessing the physical storage]. Details at (:OHAS00106:) in /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log.

[client(5477)]CRS-10001:CRS-10132: No msg for has:crs-10132 [10][60]

The corresponding ohasd.log contains:

[grid@vmrac2 vmrac2]$ tail -300 /u02/app/11.2.0.3/grid/log/vmrac2/ohasd/ohasd.log

2014-06-16 17:19:02.722: [  OCROSD][1923920288]utread:3: Problem reading buffer 150c4000 buflen 4096 retval 0 phy_offset 102400 retry 5

2014-06-16 17:19:02.722: [  OCRRAW][1923920288]propriogid:1_1: Failed to read the whole bootblock. Assumes invalid format.

2014-06-16 17:19:02.722: [  OCRRAW][1923920288]proprioini: all disks are not OCR/OLR formatted

2014-06-16 17:19:02.722: [  OCRRAW][1923920288]proprinit: Could not open raw device


2014-06-16 17:19:02.722: [  OCRAPI][1923920288]a_init:16!: Backend init unsuccessful : [26]

2014-06-16 17:19:02.723: [  CRSOCR][1923920288] OCR context init failure.  Error: PROCL-26: Error while accessing the physical storage

2014-06-16 17:19:02.723: [ default][1923920288] Created alert : (:OHAS00106:) :  OLR initialization failed, error: PROCL-26: Error while accessing the physical storage

2014-06-16 17:19:02.723: [ default][1923920288][PANIC] OHASD exiting; Could not init OLR

2014-06-16 17:19:02.723: [ default][1923920288] Done
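In the ohasd.log excerpt above, the failure moves up through several components: a read that returns 0 bytes, a bootblock that cannot be parsed, a backend that fails to initialize, and finally the OHASD panic. A small sketch for listing the component tags in order of first appearance follows. It assumes the `TIMESTAMP: [ TAG][thread]message` layout seen in the excerpt, and `component_chain` is a made-up name.

```shell
#!/bin/sh
# Sketch: list the component tags of an ohasd.log excerpt in order of
# first appearance. Assumes lines shaped like
#   2014-06-16 17:19:02.722: [  OCRRAW][1923920288]message...
# component_chain is a hypothetical helper name.

component_chain() {
    # sed: keep only the first bracketed tag on each line (padding stripped);
    #      lines without a leading tag, such as wrapped text, are dropped.
    # awk: print each tag only the first time it is seen.
    sed -n 's/^[^[]*\[ *\([A-Za-z]\{1,\}\)\].*/\1/p' "$1" | awk '!seen[$0]++'
}
```

This only reorders what the log already says; it does not interpret the messages, so the individual lines still need to be read to see why each layer failed.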

Summary:

The tests above show that ohasd (the Oracle High Availability Services daemon) depends on the configuration stored in the OLR (Oracle Local Registry). If the OLR is corrupted or missing, the ohasd process fails to start.
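The two failure modes exercised above, a missing file and a zero-byte file, can both be spotted with plain file tests before CRS is started. The following is a minimal sketch under that assumption; `olr_file_state` is a hypothetical helper, and it looks only at the file's existence and size, not at whether the contents are a valid OLR.

```shell
#!/bin/sh
# Sketch: classify an OLR file according to the two tests in this article.
# olr_file_state is a hypothetical helper; it does NOT validate OLR format.

olr_file_state() {
    if [ ! -e "$1" ]; then
        echo "missing"    # test 1: the file was renamed away
    elif [ ! -s "$1" ]; then
        echo "empty"      # test 2: the file exists but has zero bytes
    else
        echo "present"    # exists and is non-empty; contents unchecked
    fi
}
```

For example, `olr_file_state vmrac2.olr` in the `cdata` directory would report `missing` after the rename in test 1. A `present` result says nothing about corruption inside the file; Oracle's own `ocrcheck -local` utility is the more appropriate tool for checking OLR integrity.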