Troubleshooting ORA-12547 TNS: Lost Contact (Doc ID 555565.1)

Time: 2022-04-12 06:01:34

This error can occur in the following scenarios:

Bequeath (Local) :

The following are common scenarios in which ORA-12547 occurs while making a BEQ connection.

Problem:- 
BEQ connection fails when connecting with / as sysdba
AND/OR
A non-Oracle OS user fails to connect while all other users connect successfully.  (See the nosuid mount cause below.)

Cause:- 
Oracle binaries have not been linked correctly

Solution :- 
Relink the Oracle binaries by running the following command:

$ $ORACLE_HOME/bin/relink all

(OR)

Cause :- 
Shared memory segments and/or semaphores were not cleaned up properly.

A system-level trace taken while connecting to the database with "/ as sysdba" reveals the following:

2654: write(5, " O R A - 0 0 6 0 0 : i".., 81) = 81 
2654: write(5, "\n", 1) = 1 
2654: write(5, " O R A - 2 7 1 0 0 : s".., 45) = 45

Solution :- 
Identify and remove the problematic shared memory segments and semaphores.

The following Knowledge Base document describes the steps in detail. 
Note: 381566.1 connect / as sysdba Fails with Ora-12547 And TNS-12514

(OR)

Cause :-
Extended Shared Memory (EXTSHM) is configured in the environment. Extended Shared Memory is not supported by Oracle and should not be set in the environment.

A truss of the sqlplus connection shows a trace file being generated in the $ORACLE_HOME/rdbms/log directory instead of the BDUMP location.

Excerpt from truss output:

569536: statx("/opt/oracle/oracle/product/11.1.0/rdbms/log/xxxx_ora_569536.trc", 0x0FFFFFFFFFFF7DF0, 176, 01) = 0
569536: statx("/opt/oracle/oracle/product/11.1.0/rdbms/log/xxxx_ora_569536.trc", 0x0FFFFFFFFFFF7DF0, 176, 0) = 0
569536: open("/opt/oracle/oracle/product/11.1.0/rdbms/log/xxxx_ora_569536.trc", O_WRONLY|O_APPEND|O_LARGEFILE) = 3
569536: kwrite(3, " D u m p f i l e ", 10) = 10
569536: kwrite(3, " / o p t / o r a c l e /".., 63) = 63
569536: kwrite(3, "\n", 1) = 1
569536: kwrite(3, "\n * * * 2 0 0 9 - 0 2".., 29) = 29
569536: kwrite(3, " O r a c l e D a t a b".., 200) = 200

Solution :-
1. Unset the EXTSHM environment variable in the session from which you connect to the database (csh/tcsh syntax shown; in sh, ksh, or bash use "unset EXTSHM"):
unsetenv EXTSHM
2. Run sqlplus:
sqlplus / as sysdba
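
For Bourne-style shells (sh, ksh, bash), where unsetenv is not available, the two steps above can be sketched as follows. The helper name clear_extshm is hypothetical and is used only for illustration.

```shell
# Sketch for sh/ksh/bash: report whether EXTSHM is set in this session,
# then remove it. clear_extshm is a hypothetical helper name.
clear_extshm() {
    if [ -n "${EXTSHM+x}" ]; then
        echo "EXTSHM is set to '${EXTSHM}'; unsetting it"
        unset EXTSHM
    else
        echo "EXTSHM is not set"
    fi
}

clear_extshm
# Afterwards, retry the local connection from the same session:
#   sqlplus / as sysdba
```

The variable is only removed from the current session; if a login profile exports EXTSHM, it will reappear in new sessions until that profile is changed.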

(OR)

Cause :-
The ulimit setting for 'open files' is inadequate.

A truss of sqlplus may look like this:

4074: shmat(9, 0x380000000, 040000) = 0x380000000
4074: Incurred fault #6, FLTBOUNDS %pc = 0x100E59824
4074: siginfo: SIGSEGV SEGV_MAPERR addr=0x000006C0
4074: Received signal #11, SIGSEGV [caught]
4074: siginfo: SIGSEGV SEGV_MAPERR addr=0x000006C0
4074: lwp_sigmask(SIG_SETMASK, 0x9FBEF457, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
4074: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
4074: Incurred fault #6, FLTBOUNDS %pc = 0x100E5BD3C
4074: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008 

Solution:-

Increase the open files resource limit (ulimit -n) to a higher value.
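
A minimal sketch of checking the limit before raising it is shown below. The threshold MIN_NOFILE=65536 is an assumed placeholder, not a value taken from this note; use the figure from the installation guide for your platform and release. The helper name nofile_ok is hypothetical.

```shell
# Sketch: compare the soft 'open files' limit with a minimum value.
# MIN_NOFILE=65536 is an assumed placeholder, not an Oracle-documented value.
MIN_NOFILE=${MIN_NOFILE:-65536}

# nofile_ok CURRENT MINIMUM -> succeeds when CURRENT is at least MINIMUM
nofile_ok() {
    if [ "$1" = "unlimited" ]; then
        return 0
    fi
    [ "$1" -ge "$2" ]
}

current=$(ulimit -n)
if nofile_ok "$current" "$MIN_NOFILE"; then
    echo "open files limit ($current) is adequate"
else
    echo "open files limit ($current) is below $MIN_NOFILE"
    # To raise the soft limit for this session only (this fails if the
    # hard limit, shown by 'ulimit -Hn', is lower than the requested value):
    #   ulimit -n "$MIN_NOFILE"
fi
```

Run the check as the OS user that starts sqlplus, since limits are per user and per session.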

(OR)

Cause :-
Possibly the following issue:

Unpublished BUG 4335746 - STARTUP GIVES TNS LOST CONTACT , TRACES GIVE SKGM ERROR 27148

Collect a truss of the failure using:

% truss -aefo /tmp/sqlplus.out sqlplus / as sysdba

sqlplus.out 
----------- 
23: lxstat(2, "/rdbms1/ora10gr2/rdbms/log/xxxx_ora_23.trc", 0x08045AB8) Err#2 ENOENT 
23: xstat(2, "/rdbms1/ora10gr2/rdbms/log/xxxx_ora_23.trc", 0x08045AB8) Err#2 ENOENT 
23: open("/rdbms1/ora10gr2/rdbms/log/xxxx_ora_23.trc", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 
0660) = 3 
23: write(3, 0x0BED1F20, 0) = 0 
23: write(3, " / r d b m s 1 / o r a 1".., 44) = 44 
23: write(3, "\n", 1) = 1 
23: write(3, " O r a c l e D a t a b".., 122) = 122 
23: write(3, "\n", 1) = 1 
23: write(3, " O R A C L E _ H O M E ".., 31) = 31 
23: uname(0x0C4BF314) = 1 
23: write(3, " S y s t e m n a m e :".., 19) = 19 
23: write(3, " N o d e n a m e :\t o".., 19) = 19 
23: write(3, " R e l e a s e :\t 5 . 1".., 14) = 14 
23: write(3, " V e r s i o n :\t G e n".., 27) = 27 
23: write(3, " M a c h i n e :\t i 8 6".., 15) = 15 
23: write(3, " I n s t a n c e n a m".., 22) = 22 
23: write(3, " R e d o t h r e a d ".., 47) = 47 
23: write(3, " O r a c l e p r o c e".., 25) = 25 
23: write(3, " U n i x p r o c e s s".., 43) = 43 
23: write(3, "\n", 1) = 1 
23: write(3, "\n", 1) = 1 
23: brk(0x0C4DC778) = 0 
23: brk(0x0C4DE778) = 0 
23: open("/usr/share/lib/zoneinfo/xxx", O_RDONLY) = 5 
23: fstat64(5, 0x08045BD0) = 0 
23: read(5, " T Z i f\0\0\0\0\0\0\0\0".., 1017) = 1017 
23: close(5) = 0 
23: write(3, " * * * 2 0 0 8 - 1 0 -".., 27) = 27 
23: write(3, "\n", 1) = 1 
23: write(3, " s k g m e r r o r 2".., 46) = 46 
23: write(3, "\n", 1) = 1 
23: _exit(0) 
20: read(11, 0x0803FAF0, 64) = 0 
20: close(11) = 0 
20: close(10) = 0 
20: getpid() = 20 [29981]

Solution:-
Confirm the current ulimit setting for stack:

% ulimit -a

Check the installation guide for your specific platform and Oracle version and set the stack limit as stated.

Example:  ulimit -s -1 ** This is unlimited, the recommended setting on AIX **

Note 188149.1 How to Display and Change UNIX Process Resource Limits

(OR)

Cause :-
The file system that holds the Oracle home is mounted with the nosuid option (for example, through its /etc/fstab entry).  All local connections fail with ORA-12547.

Solution:-
Replace nosuid with suid in the mount options.
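
For the nosuid cause described above, the mount options can be inspected with a sketch like the following. It assumes Linux, where /proc/mounts lists one mount per line; the helper name has_nosuid and the mount point /u01 are placeholders.

```shell
# Sketch (Linux assumed): report whether a mount point carries nosuid.
# Mount-table lines look like: <device> <mount point> <fs type> <options> ...
# has_nosuid MOUNT_POINT [MOUNT_TABLE]
has_nosuid() {
    awk -v mp="$1" '
        # If a mount point appears more than once, the last entry wins.
        $2 == mp {
            found = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "nosuid") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# /u01 is a placeholder for the file system that holds the Oracle home.
if has_nosuid /u01; then
    echo "/u01 is mounted with nosuid"
else
    echo "/u01 is not mounted with nosuid (or is not a mount point)"
fi
```

The function only inspects the exact mount point given, so pass the mount point of the file system rather than a directory below it.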

(OR)

Cause :- 
SHMMAX has recently been changed.

An incident trace is generated with the following error:

ORA-00600: internal error code, arguments: [SKGMINVALID], [13], [16777216], [0], [8164864]

Solution :-
Set SHMMAX correctly, or revert it to the value it had before the change.

Remote Connections:

The following are common scenarios in which ORA-12547 occurs while making a TCP connection.

Problem :- 
Remote connections to the database server fail with ORA-12547

Cause :- 
SQLNET.INBOUND_CONNECT_TIMEOUT is set in the database server's sqlnet.ora and/or INBOUND_CONNECT_TIMEOUT_listener_name is set in its listener.ora.

If the client fails to establish a connection and complete authentication in the time specified, then the database server terminates the connection.

In 10g and higher, you may see ORA-609 or ORA-3136 errors in the alert.log.

Solution :- 
Tune the parameters SQLNET.INBOUND_CONNECT_TIMEOUT and/or INBOUND_CONNECT_TIMEOUT_listener_name to values large enough for clients to connect and complete authentication.
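
Before tuning, it helps to see what is currently configured. The sketch below lists the relevant entries from the server's sqlnet.ora and listener.ora. The helper name show_inbound_timeouts is hypothetical, and the script assumes the files live in $TNS_ADMIN or, failing that, $ORACLE_HOME/network/admin.

```shell
# Sketch: print every inbound-connect-timeout setting found in the given files.
# Matches both SQLNET.INBOUND_CONNECT_TIMEOUT and
# INBOUND_CONNECT_TIMEOUT_<listener_name>; comment lines are skipped and
# unreadable files are ignored.
show_inbound_timeouts() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        grep -i 'INBOUND_CONNECT_TIMEOUT' "$f" | grep -v '^[[:space:]]*#' || true
    done
    return 0
}

# Assumed location of the server-side Oracle Net configuration files.
admin_dir=${TNS_ADMIN:-${ORACLE_HOME:-}/network/admin}
show_inbound_timeouts "$admin_dir/sqlnet.ora" "$admin_dir/listener.ora"
```

No output means neither parameter is set explicitly in those files. The values are expressed in seconds.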

(OR)

Cause :- 
TCP.VALIDNODE_CHECKING is active on the database server and the TCP.INVITED_NODES list does not include the IP address of the failing client. 
Alternatively, the TCP.EXCLUDED_NODES list contains the IP address of the failing client.

Solution :- 
Either add the IP address of the failing client to the TCP.INVITED_NODES list or remove it from the TCP.EXCLUDED_NODES list.

It is recommended to restart (not reload) the database listener for these parameters to take effect.
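
To confirm which list a given client falls into, the server-side sqlnet.ora can be checked with a sketch like the one below. It assumes each list is written on a single line, for example TCP.INVITED_NODES = (192.0.2.10, dbhost1); multi-line entries are not handled. The helper name node_listed and the address 192.0.2.10 are placeholders.

```shell
# node_listed PARAM ADDRESS FILE
# Succeeds when ADDRESS appears in the list assigned to PARAM in FILE.
# Parameter names are compared case-insensitively; the whole entry is
# assumed to sit on one line.
node_listed() {
    awk -v param="$1" -v addr="$2" '
        BEGIN { found = 0 }
        /^[ \t]*#/ { next }
        {
            eq = index($0, "=")
            if (eq == 0) next
            name = substr($0, 1, eq - 1)
            gsub(/[ \t]/, "", name)
            if (toupper(name) != toupper(param)) next
            list = substr($0, eq + 1)
            gsub(/[() \t]/, "", list)
            n = split(list, nodes, ",")
            for (i = 1; i <= n; i++) if (nodes[i] == addr) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$3"
}

# Assumed file location and a placeholder client address.
sqlnet=${TNS_ADMIN:-${ORACLE_HOME:-}/network/admin}/sqlnet.ora
client=192.0.2.10
if [ -r "$sqlnet" ]; then
    if node_listed TCP.INVITED_NODES "$client" "$sqlnet"; then
        echo "$client is in TCP.INVITED_NODES"
    fi
    if node_listed TCP.EXCLUDED_NODES "$client" "$sqlnet"; then
        echo "$client is in TCP.EXCLUDED_NODES"
    fi
fi
```

The comparison is a plain string match, so a client listed by host name will not match its IP address and vice versa.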

Listener specific :

Problem :- 
The listener fails to start with ORA-12547 and Error: 104: Connection reset by peer

Cause :- 
The strace output for the tnslsnr executable shows a segmentation fault after the following calls:

28567      0.000054 open("/u01/home/oracle/ldap", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000021>
28567      0.000068 open("/u01/home/oracle/.ldap", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000020>
28567      0.000063 open("ldap", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000018>
28567      0.000084 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

The first reference to the ldap files is made right after the nsswitch.conf file is read.

Solution :-

LDAP is neither configured nor used for authentication, yet the nsswitch.conf file contains entries that use ldap for resolution. The solution is to modify nsswitch.conf so that it no longer references ldap.
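
The remaining ldap entries can be located with a sketch like this one. The helper name ldap_entries is hypothetical; the default path /etc/nsswitch.conf is the usual location but can be overridden.

```shell
# ldap_entries [FILE]
# Print the nsswitch.conf lines (prefixed with their line numbers) that
# mention ldap, skipping comment lines. Succeeds only if at least one
# such line exists.
ldap_entries() {
    grep -n 'ldap' "${1:-/etc/nsswitch.conf}" | grep -v '^[0-9]*:[[:space:]]*#'
}

if ldap_entries; then
    echo "nsswitch.conf still references ldap"
else
    echo "no ldap references found in nsswitch.conf"
fi
```

Editing nsswitch.conf affects name resolution for the whole host, so the change is best made together with the system administrator.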

Problem :- 
Listener is failing to start with the error ORA-12547.

Cause :- 
TCP.VALIDNODE_CHECKING is active on the database server and the TCP.INVITED_NODES list does not include the IP address of the database server specified in the address definition of the listener. 
Alternatively, the TCP.EXCLUDED_NODES list contains the IP address of the database server specified in the address definition of the listener.

Solution :- 
Add the IP address of the database server to the TCP.INVITED_NODES list, or remove it from the TCP.EXCLUDED_NODES list if present.

The listener must be restarted for these changes to take effect.

Problem :- 
A tnsping of the database listener fails with ORA-12547:

TNS Ping Utility for Linux: Version 10.1.0.5.0 - Production on 11-AUG-2016 04:01:33

Used parameter files:

/u01/oracle/product/10.1.0.5/db_1/network/admin/sqlnet.ora

Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=X.X.X.X)(PORT=10710)) (CONNECT_DATA= (SERVICE_NAME=myservice.oracle.com) (INSTANCE_NAME=myservice)))

TNS-12547: TNS:lost contact

Cause :-
TCP.VALIDNODE_CHECKING is active on the database server and the TCP.INVITED_NODES list does not include the IP address of the client that is attempting this tnsping.  Alternatively, the TCP.EXCLUDED_NODES list contains this client's IP address.

Solution :- 
Add the client's IP address to the TCP.INVITED_NODES list in the server-side sqlnet.ora file, or remove it from the TCP.EXCLUDED_NODES list if present.
The listener must be restarted for these changes to take effect.

Problem:-

After changing variables in /etc/system and rebooting, a 10g or newer listener will not start.

lsnrctl> start 
Starting /u01/oracle/product/10.2.0/Db_1/bin/tnslsnr: please wait... 
TNSLSNR for Solaris: Version 10.2.0.1.0 - Production 
System parameter file is /u01/oracle/product/10.2.0/Db_1/network/admin/listener.ora 
Log messages written to /u01/oracle/product/10.2.0/Db_1/network/log/listener.log 
Trace information written to /u01/oracle/product/10.2.0/Db_1/network/trace/listener.trc 
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=<hostname>)(PORT=1521))) 
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=extproc10)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<hostname>)(PORT=1521))) 
TNS-12547: TNS:lost contact 
TNS-12560: TNS:protocol adapter error 
TNS-00517: Lost contact 
Solaris Error: 131: Connection reset by peer 
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=extproc10))) 
TNS-12541: TNS:no listener 
TNS-12560: TNS:protocol adapter error 
TNS-00511: No listener 
Solaris Error: 146: Connection refused

Cause:-
The issue is caused by the DNS client process, 'BIND 9', not running on the Solaris 10 server.

"BIND allows DNS clients and applications to query DNS servers for the IPv4 and IPv6 networks. 
BIND includes two main components: a stub resolver API, resolver(3resolv), and the DNS name server 
with various DNS tools."

Reference: 
http://docs.sun.com/app/docs/doc/817-0547/6mgbdbsna?a=view 
Solaris 10 Release and Installation Collection --> Solaris 10 What's New --> 2. What's New in the 
Solaris 10 3/05 Release -->Freeware Enhancements

Solution:-
Contact the system administrator to start the BIND 9 DNS client on the Solaris 10 server.

Permissions:

Problem:-  
Connections via the listener are failing with ORA-12547.  It is likely in this scenario that LOCAL or BEQ connections to the instance are successful.

Cause:-
The permissions on the Oracle binary are not correct and the listener cannot spawn a server process.  This is common in environments where the listener is running in the GRID home and servicing connections to an instance in a different $ORACLE_HOME.  
See the following document for both RAC/SCAN:  
Note 1069517.1 ORA-12537 if Listener (including SCAN Listener) and Database are Owned by Different OS User

Perform the following checks:

1. Listener owner (including SCAN listener) cannot access oracle binary in database home:

As listener owner:

$ ls -l $RDBMS_HOME/bin/oracle
ls: /home/oracle/app/oracle/product/11.2/db/bin/oracle: Permission denied

2. Oracle binary in database home has wrong permission:

$ ls -l $RDBMS_HOME/bin/oracle
-rwxr-x--x 1 oracle asmadmin 184286251 Aug  9 16:25 /home/oracle/app/oracle/product/11.2/db/bin/oracle

The permission "-rwxr-x--x" is wrong because the setuid and setgid bits are missing. The oracle binary should have permission 6751:

-rwsr-s--x 1 oracle asmadmin 184286251 Aug  9 16:25 /home/oracle/app/oracle/product/11.2/db/bin/oracle

Solution:-

Make sure that the file system for the database home is mounted with setuid (suid) enabled, that the database binary ($RDBMS_HOME/bin/oracle) has the correct ownership and permissions, and that the listener owner can access the oracle binary in the database home (running "ls -l $RDBMS_HOME/bin/oracle" as the listener owner will tell).
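
The two checks above can be combined in a small sketch, to be run as the listener owner. The helper name has_setuid_setgid is hypothetical, and RDBMS_HOME is assumed to point at the database home, as in the commands above.

```shell
# has_setuid_setgid MODE_STRING
# MODE_STRING is the first column of 'ls -l', e.g. -rwsr-s--x.
# Succeeds when both the setuid and the setgid bit are shown, that is,
# an 's' in the owner-execute and in the group-execute position.
has_setuid_setgid() {
    case "$1" in
        ???s??s*) return 0 ;;
        *)        return 1 ;;
    esac
}

bin="${RDBMS_HOME:-}/bin/oracle"
if [ -e "$bin" ]; then
    mode=$(ls -l "$bin" | awk '{ print $1 }')
    if has_setuid_setgid "$mode"; then
        echo "$bin: $mode has setuid and setgid set"
    else
        echo "$bin: $mode is missing setuid/setgid; expected -rwsr-s--x (6751)"
    fi
else
    echo "$bin not found or not accessible by $(id -un)"
fi
```

The check covers the mode bits only; ownership and the mount options of the file system still need to be verified separately.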

Diagnosis :

If the scenarios discussed above did not help, then collect the following diagnostic information and provide it when logging a service request.

The diagnosis on this error depends on the context or the scenario in which the error is occurring:

1. If the error occurs during connection establishment from a remote installation, then:

a. Enable support-level client, listener, and server-side SQL*Net traces. 
Reproduce the issue or wait until the error recurs, and upload the corresponding traces. 
Note: 395525.1 How to Enable Oracle SQLNet Client , Server , Listener , Kerberos and External procedure Tracing from Net Manager

b. Upload the SQLNET.ORA, LISTENER.ORA of the database server.

c. From the client where the connection fails, check whether telnet to the listener port works (a blank screen is expected). 
$ telnet <database server hostname or IP address> <listener port>

2. If the error occurs on a remote installation after connection establishment, then:

a. Enable support-level client and server-side SQL*Net traces. 
Reproduce the issue or wait until the error recurs, and upload the corresponding traces.

b. From the client where the connection fails, check whether a telnet or PuTTY session to the database server also fails after being idle for some time. 
$ telnet <database server hostname or IP address>

3. If the error occurs while performing listener operations, then:

a. Enable support-level listener traces. 
Reproduce the issue and upload the corresponding traces.

Please see the other referenced documents for more information.

REFERENCES

NOTE:1069517.1 - ORA-12537 / ORA-12547 or TNS-12518 if Listener (including SCAN Listener) and Database are Owned by Different OS User

NOTE:372143.1 - ORA-12547 connecting to sqlplus '/ as sysdba' on AIX
NOTE:422173.1 - Local SQL*Plus Connection and DBCA Fails With: ORA-12547: TNS:Lost Contact 
NOTE:381566.1 - connect / as sysdba Fails with Ora-12547 And Tns-12514
NOTE:2206832.1 - DBCA and Local SQL*Plus Connection Fails With: ORA-12547: TNS:Lost Contact ORA-12753

NOTE:395525.1 - How to Enable Oracle SQL*Net Client , Server , Listener , Kerberos and External procedure Tracing from Net Manager
NOTE:744512.1 - Ora-12547: Tns:Lost Contact Creating Database After Clean Installation