Deleting the two kinds of ora.crf files in 11.2.0.4 RAC and preventing ora.crf from starting with OHAS

Date: 2020-11-28 18:25:57

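Background:
The ora.crf resource serves Cluster Health Monitor (CHM below), which automatically collects operating system resource usage (CPU, memory, swap, processes, I/O, network, and so on). Because of bug 10165314, the files that ora.crf generates (all crf*.bdb files and $HOSTNAME.ldb under $GI_HOME/crf/db/$HOSTNAME/) can grow very large, which puts pressure on the space available in $GI_HOME.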

The following describes how to delete these two kinds of files (all crf*.bdb files and $HOSTNAME.ldb under $GI_HOME/crf/db/$HOSTNAME/) and how to prevent ora.crf from starting when OHAS starts.
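Before deleting anything, it can help to see how much space these files actually take. The following is a minimal sketch, not something from the MOS notes: the function name crf_db_usage is made up for illustration, and it simply adds up the apparent size in bytes of the *.bdb and *.ldb files in a given directory.

```shell
# Hypothetical helper: print the total apparent size, in bytes, of the
# *.bdb and *.ldb files directly under the directory given as $1.
crf_db_usage() {
    local dir="$1"
    local total=0
    local size f
    for f in "$dir"/*.bdb "$dir"/*.ldb; do
        [ -f "$f" ] || continue        # skip glob patterns that matched nothing
        size=$(wc -c < "$f")           # apparent file size in bytes
        total=$((total + size))
    done
    echo "$total"
}
```

On the host used in this post, it would be called as crf_db_usage /u01/11.2.0/grid/crf/db/leihost1. Note that apparent size can differ from what du reports if a file is sparse.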

1. Steps to delete crf*.bdb and $HOSTNAME.ldb

[root@leihost1 leihost1]# /u01/11.2.0/grid/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'leihost1'
CRS-2677: Stop of 'ora.crf' on 'leihost1' succeeded
[root@leihost1 leihost1]#
[root@leihost1 leihost1]# cd /u01/11.2.0/grid/crf/db/leihost1
[root@leihost1 leihost1]# ls -lrt
total 195748
-rw-r--r-- 1 root root 120000000 Oct 24 20:20 leihost1.ldb
-rw-r--r-- 1 root root 1063396 Oct 24 20:27 24-OCT-2015-20:27:46.txt
-rw-r----- 1 root root 8192 Oct 24 20:28 crfconn.bdb
-rw-r----- 1 root root 16777216 Oct 25 11:06 log.0000000018
-rw-r----- 1 root root 24576 Oct 25 11:37 __db.001
-rw-r----- 1 root root 8192 Oct 25 11:38 repdhosts.bdb
-rw-r----- 1 root root 16777216 Oct 25 11:38 log.0000000019
-rw-r----- 1 root root 2301952 Oct 25 11:38 crfts.bdb
-rw-r----- 1 root root 3522560 Oct 25 11:38 crfloclts.bdb
-rw-r----- 1 root root 2621440 Oct 25 11:38 crfhosts.bdb
-rw-r----- 1 root root 3686400 Oct 25 11:38 crfcpu.bdb
-rw-r----- 1 root root 143720448 Oct 25 11:38 crfclust.bdb
-rw-r----- 1 root root 3244032 Oct 25 11:38 crfalert.bdb
-rw-r----- 1 root root 57344 Oct 25 11:38 __db.006
-rw-r----- 1 root root 401408 Oct 25 11:38 __db.002
-rw-r----- 1 root root 1187840 Oct 25 11:38 __db.005
-rw-r----- 1 root root 2162688 Oct 25 11:38 __db.004
-rw-r----- 1 root root 2629632 Oct 25 11:38 __db.003
[root@leihost1 leihost1]# rm *.bdb
rm: remove regular file `crfalert.bdb'? y
rm: remove regular file `crfclust.bdb'? y
rm: remove regular file `crfconn.bdb'? y
rm: remove regular file `crfcpu.bdb'? y
rm: remove regular file `crfhosts.bdb'? y
rm: remove regular file `crfloclts.bdb'? y
rm: remove regular file `crfts.bdb'? y
rm: remove regular file `repdhosts.bdb'? y
[root@leihost1 leihost1]# du -sh
40M .
[root@leihost1 leihost1]# ls -lrt
total 40200
-rw-r--r-- 1 root root 120000000 Oct 24 20:20 leihost1.ldb
-rw-r--r-- 1 root root 1063396 Oct 24 20:27 24-OCT-2015-20:27:46.txt
-rw-r----- 1 root root 16777216 Oct 25 11:06 log.0000000018
-rw-r----- 1 root root 24576 Oct 25 11:37 __db.001
-rw-r----- 1 root root 16777216 Oct 25 11:38 log.0000000019
-rw-r----- 1 root root 57344 Oct 25 11:38 __db.006
-rw-r----- 1 root root 401408 Oct 25 11:38 __db.002
-rw-r----- 1 root root 1187840 Oct 25 11:38 __db.005
-rw-r----- 1 root root 2162688 Oct 25 11:38 __db.004
-rw-r----- 1 root root 2629632 Oct 25 11:38 __db.003
[root@leihost1 leihost1]# mv leihost1.ldb back_leihost1.ldb
[root@leihost1 leihost1]# ls -lrt
total 40200
-rw-r--r-- 1 root root 120000000 Oct 24 20:20 back_leihost1.ldb
-rw-r--r-- 1 root root 1063396 Oct 24 20:27 24-OCT-2015-20:27:46.txt
-rw-r----- 1 root root 16777216 Oct 25 11:06 log.0000000018
-rw-r----- 1 root root 24576 Oct 25 11:37 __db.001
-rw-r----- 1 root root 16777216 Oct 25 11:38 log.0000000019
-rw-r----- 1 root root 57344 Oct 25 11:38 __db.006
-rw-r----- 1 root root 401408 Oct 25 11:38 __db.002
-rw-r----- 1 root root 1187840 Oct 25 11:38 __db.005
-rw-r----- 1 root root 2162688 Oct 25 11:38 __db.004
-rw-r----- 1 root root 2629632 Oct 25 11:38 __db.003
[root@leihost1 leihost1]# /u01/11.2.0/grid/bin/crsctl status res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=OFFLINE
STATE=OFFLINE
[root@leihost1 leihost1]#

The steps above are taken from:

ODA Nodes Lacking Space Due to Large Cluster Health Monitor File Crfclust.Bdb (Doc ID 1616910.1)

 

The steps above can be performed while the clusterware and the databases are running. The reason:

Is stop/start ora.crf affecting clusterware function or cluster database function?

No, stop/start ora.crf resource will stop and start Cluster Health Monitor and its data collection, it will not affect clusterware or database functionality.


The above is quoted from:
Cluster Health Monitor (CHM) FAQ (Doc ID 1328466.1)

Repeat the steps above on the other nodes of the cluster.
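Since the same steps have to be repeated on every node, they could be wrapped in a small script. The sketch below rests on some assumptions: the function names purge_chm_files and cleanup_chm_node are hypothetical, GRID_HOME defaults to the path seen in the transcript, and rm -f is used so the deletion does not prompt for each file. As in the transcript, the .ldb file is renamed to back_<host>.ldb rather than deleted.

```shell
# Sketch only. GRID_HOME defaults to the path used in the transcript above;
# set it to your own Grid Infrastructure home before use.
GRID_HOME="${GRID_HOME:-/u01/11.2.0/grid}"

# Remove the *.bdb files and set the .ldb file aside.
# $1 = CHM data directory, $2 = host name used in the .ldb file name.
purge_chm_files() {
    local dir="$1"
    local host="$2"
    rm -f "$dir"/*.bdb
    if [ -f "$dir/$host.ldb" ]; then
        mv "$dir/$host.ldb" "$dir/back_$host.ldb"
    fi
}

# Hypothetical wrapper: stop ora.crf first, then clean up its files.
# Run as root on each node; $1 = host name of the local node.
cleanup_chm_node() {
    local host="$1"
    "$GRID_HOME/bin/crsctl" stop res ora.crf -init || return 1
    purge_chm_files "$GRID_HOME/crf/db/$host" "$host"
}
```

On the first node in this post that would be cleanup_chm_node leihost1. Keeping the file operations in their own function makes it possible to try them against a scratch directory before touching a real Grid home.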

 

2. Steps to prevent ora.crf from starting when OHAS starts

2.1 Disable the ora.crf resource:

[root@leihost1 leihost1]# /u01/11.2.0/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

2.2 Check the result of the change:

[root@leihost1 leihost1]# /u01/11.2.0/grid/bin/crsctl  status res ora.crf -init -f
NAME=ora.crf
TYPE=ora.crf.type
STATE=OFFLINE
TARGET=ONLINE
ACL=owner:root:rw-,pgrp:oinstall:rw-,other::r--,user:grid:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=always
CARDINALITY=1
CARDINALITY_ID=0
CHECK_ARGS=
CHECK_COMMAND=
CHECK_INTERVAL=30
CLEAN_ARGS=
CLEAN_COMMAND=
CREATION_SEED=16
DAEMON_LOGGING_LEVELS=CRFMOND=0,CRFLDREP=0,CRFLOGD=0,CRFPROXY=0,OCLUMON=0,OCRAPI=0,OCRCLI=0,OCRMSG=0,CSSCLNT=0,CRFM=0
DAEMON_TRACING_LEVELS=CRFMOND=0,CRFLDREP=0,CRFLOGD=0,CRFPROXY=0,OCLUMON=0,OCRAPI=0,OCRCLI=0,OCRMSG=0,CSSCLNT=0,CRFM=0
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION="Resource type for Crf Agents"
DETACHED=true
ENABLED=0 --------------->here
FAILOVER_DELAY=0
FAILURE_INTERVAL=3
FAILURE_THRESHOLD=5
HOSTING_MEMBERS=
ID=ora.crf
LOAD=1
LOGGING_LEVEL=1
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
ORA_VERSION=11.2.0.4.0
PID_FILE=
PLACEMENT=balanced
PROCESS_TO_MONITOR=
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SERVER_POOLS=
START_ARGS=
START_COMMAND=
START_DEPENDENCIES=hard(ora.gpnpd)
START_TIMEOUT=120
STATE_CHANGE_TEMPLATE=
STOP_ARGS=
STOP_COMMAND=
STOP_DEPENDENCIES=hard(shutdown:ora.gipcd)
STOP_TIMEOUT=120
UNRESPONSIVE_TIMEOUT=180
UPTIME_THRESHOLD=1m
USR_ORA_ENV=

[root@leihost1 leihost1]#
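The full attribute listing is long, and only one line matters here. A small filter can pull out a single attribute from NAME=VALUE output like the above. This is only a sketch: the function name crs_attr is hypothetical, and it assumes one attribute per line in the format shown.

```shell
# Print the value of one attribute from NAME=VALUE lines read on stdin.
# $1 = attribute name, for example ENABLED.
crs_attr() {
    # Split on "=", match the attribute name exactly, then strip everything
    # up to and including the first "=" so that values containing "=" survive.
    awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }'
}
```

It could be used as /u01/11.2.0/grid/bin/crsctl status res ora.crf -init -f | crs_attr ENABLED, which should print 0 once the resource has been disabled.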

2.3 Restart CRS

# $GRID_ORACLE_HOME/bin/crsctl stop crs
# $GRID_ORACLE_HOME/bin/crsctl start crs

2.4 Check the status of ora.crf:

[root@leihost1 leihost1]# /u01/11.2.0/grid/bin/crsctl  status res ora.crf -init 
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=OFFLINE ---> still OFFLINE after the restart

2.5 Check $GI_HOME/crf/db/$HOSTNAME/: the crf*.bdb files and $HOSTNAME.ldb are no longer generated.

Repeat the steps above on the other nodes of the cluster.
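The check in 2.5 can also be scripted, which is handy when several nodes have to be verified. A minimal sketch, with the function name chm_files_exist made up for illustration: it succeeds if any crf*.bdb file or <host>.ldb is present in the given directory, so after disabling ora.crf it is expected to fail.

```shell
# Return 0 (true) if any crf*.bdb file or $host.ldb exists in the directory.
# $1 = CHM data directory, $2 = host name used in the .ldb file name.
chm_files_exist() {
    local dir="$1"
    local host="$2"
    local f
    for f in "$dir"/crf*.bdb "$dir/$host.ldb"; do
        [ -f "$f" ] && return 0
    done
    return 1
}
```

For example: if chm_files_exist /u01/11.2.0/grid/crf/db/leihost1 leihost1; then echo "CHM files are back"; fi. A renamed copy such as back_leihost1.ldb is deliberately not counted.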


 

Reference (MOS note):
How to Move/Recreate GI Management Repository to Different Shared Storage (Diskgroup, CFS or NFS etc) (Doc ID 1589394.1)

That note contains the following:
1. Stop and disable ora.crf resource.

On each node, as root user:
# <GI_HOME>/bin/crsctl stop res ora.crf -init
# <GI_HOME>/bin/crsctl modify res ora.crf -attr ENABLED=0 -init