Oracle 10.2.0.4
The listener on a production system stopped unexpectedly, and listener.log reported the following errors:
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Linux Error: 32: Broken pipe
In addition, the operating system log /var/log/messages contained an error similar to the following:
tnslsnr[5841]: segfault at 0000000000000018 rip 0000003eab66854d rsp 0000007fbfff9230 error 4
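
To read a line like this more easily, here is a minimal sketch that splits it into fields and decodes the trailing error code. It assumes the older x86_64 kernel message layout shown above (newer kernels print "ip"/"sp" and extra fields, which this pattern does not cover), and the function name parse_segfault is my own, not part of any Oracle or Linux tool. The bit meanings follow the x86 page fault error code: bit 0 is set when the page was present, bit 1 for a write access, bit 2 for a fault raised in user mode.

```python
import re

# Matches the older x86_64 kernel segfault message format, e.g.
# "tnslsnr[5841]: segfault at ... rip ... rsp ... error 4".
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-fA-F]+) "
    r"rip (?P<rip>[0-9a-fA-F]+) rsp (?P<rsp>[0-9a-fA-F]+) "
    r"error (?P<err>[0-9a-fA-F]+)"
)


def parse_segfault(line):
    """Return the decoded fields of a segfault log line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    return {
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "page_present": bool(err & 1),               # bit 0: page was present
        "access": "write" if err & 2 else "read",    # bit 1: write access
        "mode": "user" if err & 4 else "kernel",     # bit 2: user-mode fault
    }
```

Applied to the line above, this reports a user-mode read at address 0x18 on a page that was not mapped.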
Metalink has a document on this: 549932.1
Applies to:
Oracle Net Services - Version 10.2.0.1 to 11.1.0.7 [Release 10.2 to 11.1]
Generic UNIX
***Checked for relevance on 22-MAR-2013***
Symptoms:
- There may be heavy load on the CPU shooting up to 100%.
- The number of sessions in the database is well below the upper or maximum limit defined in the parameter file.
- The listener crashes suddenly during this heavy CPU load generating the core.
- (Optional) Listener.Ora has SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER=OFF.
Extensive paging/swapping activity is a clear indication that the system is running out of the physical memory.
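
One rough way to watch for the memory pressure described above is to read /proc/meminfo and see how much swap is in use. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the usual Linux "Key:   value kB" layout of that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are in kB; counters such as HugePages_Total have no unit.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def swap_used_percent(info):
    """Percentage of swap in use; 0.0 when no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - info.get("SwapFree", 0)) / total


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On the affected host, swap_used_percent(read_meminfo()) could be sampled periodically; a value that keeps climbing toward 100 matches the paging/swapping symptom in the note.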
Solution:
1. Increase the physical memory of the system.
OR
2. Apply the Patch 6139856 for unpublished Bug 6139856 if available for your platform.
OR
3. Configure Hugepages on the OS. Ref : Note 361323.1
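
For option 3, a back-of-the-envelope sizing can be sketched as below. This is my own illustration, not the procedure from Note 361323.1: it assumes the HugePages pool should be large enough to hold the whole SGA, and that the page size is 2048 kB (common on x86_64 Linux; the real value is the Hugepagesize line in /proc/meminfo). The function name is hypothetical.

```python
def hugepages_needed(sga_bytes, hugepage_kb=2048):
    """Number of huge pages needed to cover sga_bytes, rounded up.

    hugepage_kb defaults to 2048 kB, which is an assumption; check
    Hugepagesize in /proc/meminfo on the actual host.
    """
    page_bytes = hugepage_kb * 1024
    return -(-sga_bytes // page_bytes)  # ceiling division on integers
```

For example, an 8 GB SGA with 2 MB pages gives hugepages_needed(8 * 1024**3) = 4096, which would be the starting point for the vm.nr_hugepages kernel parameter.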
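--------------------------------------------------------------------------------------------------------------
So this counts as an Oracle bug: when the operating system runs short of physical memory and swap/paging space is exhausted, the listener crashes.
The operating system log also shows Linux itself killing processes (this is a write-up after the fact, and the information sits on an internal network where my access is limited, so I cannot paste the log content).
My understanding is that when physical memory usage stays high, the operating system kills some processes on its own, for example ones it regards as idle, and unluckily the one it killed happened to be the Oracle listener process, which brought the listener down.
I say "happened to be" because /var/log/messages shows that Oracle processes had been killed earlier as well, yet the listener did not stop at those times. So I suspect the processes killed then were not the listener but possibly other, non-local processes.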
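
To check which processes were actually killed, the log could be scanned along the following lines. Since I cannot reproduce the real log, the message format here is an assumption: the pattern looks for text of the form "Killed process <pid> (<name>)", and kernel versions word this message differently, so it may need adjusting. The function names are hypothetical.

```python
import re

# Assumed wording of the kernel's kill message; adjust to the real log.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)", re.IGNORECASE)


def killed_processes(lines):
    """Return (pid, name) for every kill message found in the given lines."""
    found = []
    for line in lines:
        m = KILLED_RE.search(line)
        if m:
            found.append((int(m.group(1)), m.group(2)))
    return found


def listener_was_killed(lines, listener_name="tnslsnr"):
    """True if any kill message names the listener executable."""
    return any(name == listener_name for _, name in killed_processes(lines))
```

Running killed_processes over the lines of /var/log/messages would show whether the earlier kills hit tnslsnr or other oracle processes, which is the distinction drawn above.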