ORA-15025 while building a Data Guard environment: restore controlfile fails because oracle cannot access ASM storage

Date: 2023-03-08 15:59:21

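Environment:

# Primary: RAC

# Standby: RAC, operating system AIX 6.1, database version 11.2.0.3

Symptom:

# The control file was backed up on the primary and copied to the standby; the restore on the standby failed

This write-up has two stages. Stage 1: the error appears and the relevant logs are checked.

Stage 2: the attempts made to fix the error.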

Stage 1: the error and the logs

# The failing command:

RMAN> restore standby controlfile from '/tmp/con.ctl';

Starting restore at 03-JUL-18
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1 instance=rac11g770a device type=DISK

channel ORA_DISK_1: restoring control file
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 07/03/2018 19:36:27
RMAN-10038: database session for channel ORA_DISK_1 terminated unexpectedly

# Alert log:
NOTE: MARK has subscribed
ORA-15025: could not open disk "/dev/rhdisk6"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
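
Error 13 (Permission denied) on the raw device suggests comparing the group that owns the device with the groups of the user that runs the database. The sketch below shows that comparison. The helper names `has_group` and `report_access` are hypothetical, and the group lists are illustrative values modeled on this incident; on a live host they would come from `id -Gn oracle` and `ls -l` on the device. The owning group is assumed to be asmadmin, as shown for /dev/rhdisk2 later in this write-up.

```shell
# Succeed when the group in $2 appears in the space-separated list in $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# $1 = user name, $2 = that user's groups, $3 = group that owns the device.
report_access() {
    if has_group "$2" "$3"; then
        echo "$1 is in $3"
    else
        echo "$1 is NOT in $3"
    fi
}

# Illustrative input, not captured from the real host.
report_access oracle "oinstall dba asmdba" asmadmin
report_access grid "oinstall asmadmin dba asmdba asmoper" asmadmin
```

With these sample lists the first call reports that oracle is not in asmadmin, which matches the permission error in the alert log.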

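Stage 2: fixing the problem

# The fix went through three attempts: A, B and C. A and B were both wrong. For the quick answer, go straight to C, which fixes the users' group membership.

A: Change the owner and group of the oracle binary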

chown oracle.asmadmin $ORACLE_HOME/bin/oracle

# Checking shows that the oracle binary is owned by oracle:oinstall. Next, look at the OS-level ASM disks behind the OCR disk group.
-bash-4.2# cd /picclife/app/oracle/product/11.2.0/dbhome_1/bin
-bash-4.2# ls -l oracle
-rwsr-s--x    1 oracle   oinstall  300820923 Jul 03 15:42 oracle

# At the OS level the ASM disks belong to group asmadmin. The oracle user is not in that group, so it has no permission to use the ASM disks.
-bash-4.2# ls -l /dev/rhdisk2
crw-rw----    1 grid     asmadmin     13,  3 Jul 03 19:26 /dev/rhdisk2

# Change the group of the oracle binary directly to asmadmin
-bash-4.2# chown oracle:asmadmin oracle
-bash-4.2# ls -l oracle               
-rwsr-s--x    1 oracle   asmadmin  300820923 Jul 03 15:42 oracle

# SQL*Plus now fails, reporting that the group ID does not match
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Not owner
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 501 (oinstall), current egid = 503 (asmadmin)

# Fix the permissions on the oracle binary and revert the change
-bash-4.2# chmod 6755 oracle
-bash-4.2# ls -l oracle
-rwsr-sr-x    1 oracle   asmadmin  300820923 Jul 03 15:42 oracle

-bash-4.2# ls -l oracle
-rwsrwsr-x    1 oracle   asmadmin  300820923 Jul 03 15:42 oracle
-bash-4.2# chown oracle.oinstall oracle
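
The listing taken before the change shows -rwsr-s--x, while chmod 6755 produces -rwsr-sr-x, so the revert above does not bring back exactly the original mode. The helper below is a sketch for a POSIX shell, and the function name `mode_to_octal` is hypothetical. It converts an `ls -l` mode string to octal so that the two modes can be compared.

```shell
# Convert a 10-character ls -l mode string (e.g. -rwsr-s--x) to octal.
mode_to_octal() {
    mode_string=$1
    special=0
    digits=""
    weight=4    # 4 = setuid, 2 = setgid, 1 = sticky
    for start in 2 5 8; do
        triad=$(printf '%s' "$mode_string" | cut -c"$start"-"$((start + 2))")
        digit=0
        case "$triad" in r??) digit=$((digit + 4)) ;; esac
        case "$triad" in ?w?) digit=$((digit + 2)) ;; esac
        case "$triad" in
            ??x)    digit=$((digit + 1)) ;;
            ??[st]) digit=$((digit + 1)); special=$((special + weight)) ;;
            ??[ST]) special=$((special + weight)) ;;
        esac
        digits="$digits$digit"
        weight=$((weight / 2))
    done
    printf '%s%s\n' "$special" "$digits"
}

mode_to_octal -rwsr-s--x    # mode before any change
mode_to_octal -rwsr-sr-x    # mode after chmod 6755
```

The first call prints 6751 and the second 6755, so `chmod 6751 oracle` would be the command that matches the original listing.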
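# Changing the owner and group of the binary is not very safe, so add the group to the oracle user directly instead:

B: Add the oracle user to the asmadmin group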

chgroup users=oracle asmadmin

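-- This command turned out to be a major pitfall: it replaced the member list of asmadmin instead of appending to it, which is what later left grid without that group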

-bash-4.2# chgroup users=oracle asmadmin      
-bash-4.2# id oracle
uid=501(oracle) gid=501(oinstall) groups=503(asmadmin),502(dba),504(asmdba)
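
A safer form of this step keeps the existing members of asmadmin and appends oracle to them. The sketch below only builds and prints such a command and does not run it. The function name `chgroup_append_cmd` is hypothetical, and the input line follows the `asmadmin users=grid` format that `lsgroup -a users asmadmin` is assumed to print on AIX. The member list used here is illustrative.

```shell
# $1 = one line in the assumed "GROUP users=a,b" format
# $2 = user to add to the group
# Prints a chgroup command that keeps the current members.
chgroup_append_cmd() {
    group=${1%% *}
    case "$1" in
        *users=*) members=${1#*users=} ;;
        *)        members="" ;;
    esac
    case ",$members," in
        *",$2,"*) ;;                        # already a member, keep the list as is
        ,,)       members=$2 ;;             # the group had no members
        *)        members="$members,$2" ;;  # append to the existing members
    esac
    echo "chgroup users=$members $group"
}

# Illustrative input; on the real host it would come from lsgroup.
chgroup_append_cmd "asmadmin users=grid" oracle
```

For the sample input this prints `chgroup users=grid,oracle asmadmin`, which leaves grid in the group.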

# After restarting the database the restore works, so the oracle user can now use ASM storage

RMAN> restore standby controlfile from '/tmp/con.ctl';

rac11g770a:/picclife/app/grid$ id oracle
uid=501(oracle) gid=501(oinstall) groups=503(asmadmin),502(dba),504(asmdba)
rac11g770a:/picclife/app/grid$ id grid
uid=502(grid) gid=501(oinstall) groups=502(dba),504(asmdba),505(asmoper)

# After the restart, however, the cluster state looked wrong

GRID: asmcmd connects but cannot be used, and sqlplus / as sysasm fails with insufficient privileges!

[grid@rac1 ~]$ . oraenv

ORACLE_SID = [+ASM1] ?

The Oracle base remains unchanged with value /u01/app/oracle

[grid@rac1 ~]$ asmcmd
Connected to an idle instance.
ASMCMD> ls
ASMCMD-8102: no connection to ASM; command requires ASM to run

# Cluster status:
-bash-4.2# while true ; do  /picclife/app/11.2.0/grid/bin/crsctl stat res -t -init ; sleep 1; done

ora.asm
      1        ONLINE  OFFLINE                                                  
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac11g770a                                  
ora.crf
      1        ONLINE  ONLINE       rac11g770a                                  
ora.crsd
      1        ONLINE  OFFLINE                                                  
ora.cssd
      1        ONLINE  ONLINE       rac11g770a                                  
ora.cssdmonitor
      1        ONLINE  ONLINE       rac11g770a                                  
ora.ctssd
      1        ONLINE  ONLINE       rac11g770a               OBSERVER           
ora.diskmon
      1        OFFLINE OFFLINE                                                  
ora.drivers.acfs
      1        ONLINE  ONLINE       rac11g770a                                  
ora.evmd
      1        ONLINE  INTERMEDIATE rac11g770a        --- clearly abnormal: neither ASM nor CRSD has started!

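# Starting and stopping the cluster and checking the various cluster logs did not turn up any obvious ORA- error

C: Fix the users' group membership with usermod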
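
Instead of reading the repeating status output by eye, the listing above can be filtered down to the resources whose target is ONLINE but whose state is not. This is a minimal sketch that assumes the two-line layout shown above (a resource name line followed by an instance line). The function name `unhealthy_resources` is hypothetical, and the sample input is a shortened copy of the output in this write-up.

```shell
# Read "crsctl stat res -t -init" style text on stdin and print each
# resource whose TARGET is ONLINE while its STATE is something else.
unhealthy_resources() {
    awk '
        /^ora\./ { name = $1; next }
        name != "" && $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 != "ONLINE" {
            print name, $3
        }
    '
}

# Shortened sample taken from the status output above.
unhealthy_resources <<'EOF'
ora.asm
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       rac11g770a
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  INTERMEDIATE rac11g770a
EOF
```

On a live node the same function could be fed with `crsctl stat res -t -init | unhealthy_resources`. For the sample it lists ora.asm, ora.crsd and ora.evmd, and skips ora.diskmon because its target is OFFLINE.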

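### After a midday nap and with a clearer head, checking the users revealed something surprising. I almost thought I was seeing things!

Grid: the OS user grid had lost its asmadmin group!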

-bash-4.2# id grid
uid=502(grid) gid=501(oinstall) groups=502(dba),504(asmdba),505(asmoper)

-bash-4.2# id oracle
   uid=501(oracle) gid=501(oinstall) groups=502(dba),503(asmadmin),504(asmdba)

# Add the group back to the grid user

-bash-4.2# usermod -G asmadmin grid           -- note: -G replaces the whole supplementary group list, so every group must be listed
-bash-4.2# id grid
uid=502(grid) gid=501(oinstall) groups=503(asmadmin)
-bash-4.2# id oracle
uid=501(oracle) gid=501(oinstall) groups=502(dba),504(asmdba)

-bash-4.2# usermod -G dba,asmdba,asmoper,asmadmin grid
-bash-4.2# usermod -G dba,asmdba,asmoper,asmadmin  oracle
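
Because -G replaces the list, the full argument can be built from the groups the user already has plus the one to add, rather than typed by hand. The sketch below only prints the resulting command. The function name `groups_with` is hypothetical, and the input list is the set of supplementary groups that `id grid` showed before the fix.

```shell
# $1 = current supplementary groups, space-separated
# $2 = group to add
# Prints a comma-separated list with duplicates removed, usable after -G.
groups_with() {
    # $1 is deliberately unquoted so the shell splits it into one word per group.
    printf '%s\n' $1 "$2" | awk 'NF && !seen[$0]++' | paste -sd, -
}

# Groups taken from the earlier "id grid" output.
echo "usermod -G $(groups_with "dba asmdba asmoper" asmadmin) grid"
```

For this input the printed command is `usermod -G dba,asmdba,asmoper,asmadmin grid`, the same group list used in the fix above.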

# Restart the cluster:
   rac11g770a:/picclife/app/11.2.0/grid/bin$ ./crsctl stop has -f
   #rac11g770a:/picclife/app/11.2.0/grid/bin$ ./crsctl stop crs -f
   rac11g770a:/picclife/app/11.2.0/grid/bin$ crsctl start has
   rac11g770a:/picclife/app/11.2.0/grid/bin$ crsctl start crs
   rac11g770a:/picclife/app/11.2.0/grid/bin$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
  
   # Check the users
-bash-4.2# id oracle
uid=501(oracle) gid=501(oinstall) groups=503(asmadmin),502(dba),504(asmdba),505(asmoper)
-bash-4.2# id grid
uid=502(grid) gid=501(oinstall) groups=503(asmadmin),502(dba),504(asmdba),505(asmoper)

OK!!!!!