I'm writing a TCP server that can take 15 seconds or more to begin generating the body of a response to certain requests. Some clients like to close the connection at their end if the response takes more than a few seconds to complete.

Since generating the response is very CPU-intensive, I'd prefer to halt the task the instant the client closes the connection. At present, I don't find this out until I send the first payload and receive various hang-up errors.

How can I detect that the peer has closed the connection without sending or receiving any data? That means for recv that all data remains in the kernel, or for send that no data is actually transmitted.

6 answers

#1

I've had a recurring problem communicating with equipment that had separate TCP links for send and receive. The basic problem is that the TCP stack doesn't generally tell you a socket is closed when you're just trying to read - you have to try and write to get told the other end of the link was dropped. Partly, that is just how TCP was designed (reading is passive).

I'm guessing Blair's answer works in the cases where the socket has been shut down nicely at the other end (i.e. they have sent the proper disconnection messages), but not in the case where the other end has impolitely just stopped listening.

Is there a fairly fixed-format header at the start of your message (an XML doctype, for example) that you could send before the whole response is ready? Alternatively, can you get away with sending some extra spaces at certain points in the message, just filler data that you can output to confirm the socket is still open?

#2

The select module contains what you'll need. If you only need Linux support and have a sufficiently recent kernel, select.epoll() should give you the information you need. Most Unix systems will support select.poll().

If you need cross-platform support, the standard way is to use select.select() to check if the socket is marked as having data available to read. If it is, but recv() returns zero bytes, the other end has hung up.

I've always found Beej's Guide to Network Programming good (note it is written for C, but is generally applicable to standard socket operations), while the Socket Programming How-To has a decent Python overview.

Edit: The following is an example of how a simple server could be written to queue incoming commands but quit processing as soon as it finds the connection has been closed at the remote end.

import select
import socket
import time

# Create the server.
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind((socket.gethostname(), 7557))
serversocket.listen(1)

# Wait for an incoming connection.
clientsocket, address = serversocket.accept()
print('Connection from', address[0])

# Control variables.
queue = []
cancelled = False

while True:
    # If nothing queued, wait for incoming request.
    if not queue:
        queue.append(clientsocket.recv(1024))

    # Receive data of length zero ==> connection closed.
    if len(queue[0]) == 0:
        break

    # Get the next request and remove the trailing line ending.
    request = queue.pop(0).decode(errors='replace').rstrip('\r\n')
    print('Starting request', request)

    # Main processing loop.
    for i in range(15):
        # Do some of the processing.
        time.sleep(1.0)

        # See if the socket is marked as having data ready.
        r, w, e = select.select((clientsocket,), (), (), 0)
        if r:
            data = clientsocket.recv(1024)

            # Length of zero ==> connection closed.
            if len(data) == 0:
                cancelled = True
                break

            # Add this request to the queue.
            queue.append(data)
            print('Queueing request', data.decode(errors='replace').rstrip('\r\n'))

    # Request was cancelled.
    if cancelled:
        print('Request cancelled.')
        break

    # Done with this request.
    print('Request finished.')

# If we got here, the connection was closed.
print('Connection closed.')
clientsocket.close()
serversocket.close()

To use it, run the script and, in another terminal, telnet to localhost on port 7557. Here is the output from an example run in which I queued three requests but closed the connection while the third was being processed:

Connection from 127.0.0.1
Starting request 1
Queueing request 2
Queueing request 3
Request finished.
Starting request 2
Request finished.
Starting request 3
Request cancelled.
Connection closed.

epoll alternative

Another edit: I've worked up another example using select.epoll to monitor events. I don't think it offers much over the original example as I cannot see a way to receive an event when the remote end hangs up. You still have to monitor the data received event and check for zero length messages (again, I'd love to be proved wrong on this statement).

import select
import socket
import time

port = 7557

# Create the server.
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind((socket.gethostname(), port))
serversocket.listen(1)
serverfd = serversocket.fileno()
print("Listening on", socket.gethostname(), "port", port)

# Make the socket non-blocking.
serversocket.setblocking(False)

# Initialise the table of clients, keyed by file descriptor.
clients = {}

# Create an epoll object and register our interest in read events on the server
# socket.
ep = select.epoll()
ep.register(serverfd, select.EPOLLIN)

while True:
    # Check for events.
    events = ep.poll(0)
    for fd, event in events:
        # New connection to server.
        if fd == serverfd and event & select.EPOLLIN:
            # Accept the connection.
            connection, address = serversocket.accept()
            connection.setblocking(False)

            # We want input notifications.
            ep.register(connection.fileno(), select.EPOLLIN)

            # Store some information about this client.
            clients[connection.fileno()] = {
                'delay': 0.0,
                'input': b"",
                'response': b"",
                'ready': False,
                'connection': connection,
                'address': address,
            }

            # Done.
            print("Accepted connection from", address)

        # A socket was closed on our end (see the shutdown() calls below).
        elif event & select.EPOLLHUP:
            print("Closed connection to", clients[fd]['address'])
            ep.unregister(fd)
            clients[fd]['connection'].close()
            del clients[fd]

        # Error on a connection.
        elif event & select.EPOLLERR:
            print("Error on connection to", clients[fd]['address'])
            ep.modify(fd, 0)
            clients[fd]['connection'].shutdown(socket.SHUT_RDWR)

        # Incoming data.
        elif event & select.EPOLLIN:
            print("Incoming data from", clients[fd]['address'])
            data = clients[fd]['connection'].recv(1024)

            # Zero length = remote closure.
            if not data:
                print("Remote close on", clients[fd]['address'])
                ep.modify(fd, 0)
                clients[fd]['connection'].shutdown(socket.SHUT_RDWR)

            # Store the input.
            else:
                print(data)
                clients[fd]['input'] += data

        # Run when the client is ready to accept some output. The processing
        # loop registers for this event when the response is complete.
        elif event & select.EPOLLOUT:
            print("Sending output to", clients[fd]['address'])

            # Write as much as we can.
            written = clients[fd]['connection'].send(clients[fd]['response'])

            # Delete what we have already written from the complete response.
            clients[fd]['response'] = clients[fd]['response'][written:]

            # When all of the response is written, shut the connection.
            if not clients[fd]['response']:
                ep.modify(fd, 0)
                clients[fd]['connection'].shutdown(socket.SHUT_RDWR)

    # Processing loop.
    for client in clients:
        clients[client]['delay'] += 0.1

        # When the 'processing' has finished. The 'ready' flag makes sure the
        # response is built only once, so a partial send is not overwritten.
        if clients[client]['delay'] >= 15.0 and not clients[client]['ready']:
            # Reverse the input to form the response.
            clients[client]['response'] = clients[client]['input'][::-1]
            clients[client]['ready'] = True

            # Register for the ready-to-send event. The network loop uses this
            # as the signal to send the response.
            ep.modify(client, select.EPOLLOUT)

    # Processing delay (once per pass, so the loop never spins when idle).
    time.sleep(0.1)

Note: This only detects proper shutdowns. If the remote end just stops listening without sending the proper messages, you won't know until you try to write and get an error. Checking for that is left as an exercise for the reader. Also, you probably want to add some error checking around the overall loop so that the server itself shuts down gracefully if something breaks inside it.
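
On the question of whether epoll can report a remote hang-up as an event of its own: Linux defines EPOLLRDHUP for stream sockets, raised when the peer closes the connection or shuts down its writing half, and Python exposes it as select.EPOLLRDHUP. The following is only a minimal sketch, assuming Linux and a Python 3 build that provides select.epoll and select.EPOLLRDHUP; the helper name wait_for_peer_close is made up for illustration. Like the examples above, it only sees orderly shutdowns.

```python
import select
import socket


def wait_for_peer_close(conn, timeout):
    """Return True if the peer closed (or half-closed) conn within timeout seconds.

    Hypothetical helper: it registers only for EPOLLRDHUP, so no data is read
    from the socket and anything the peer sent stays in the kernel buffer.
    """
    ep = select.epoll()
    try:
        ep.register(conn.fileno(), select.EPOLLRDHUP)
        for fd, event in ep.poll(timeout):
            # EPOLLHUP and EPOLLERR are always reported, even if not requested.
            if event & (select.EPOLLRDHUP | select.EPOLLHUP | select.EPOLLERR):
                return True
        return False
    finally:
        ep.close()
```

Calling wait_for_peer_close(conn, 0) between chunks of CPU-bound work would give a non-blocking check; in a long-running server it would be more natural to add EPOLLRDHUP to the event mask of an existing epoll object than to create one per call.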

#3

The socket keepalive option (SO_KEEPALIVE) lets you detect this kind of "drop the connection without telling the other end" scenario.

You should set the SO_KEEPALIVE option at SOL_SOCKET level. In Linux, you can modify the timeouts per socket using TCP_KEEPIDLE (seconds before sending keepalive probes), TCP_KEEPCNT (failed keepalive probes before declaring the other end dead) and TCP_KEEPINTVL (interval in seconds between keepalive probes).

In Python:

import socket
...
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 5)
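
The TCP_KEEP* constants are not available on every platform, so a slightly more defensive version of the snippet above may be useful. This is a sketch only: the function name enable_keepalive and its default values are illustrative assumptions, and it uses socket.IPPROTO_TCP as the option level in place of SOL_TCP.

```python
import socket


def enable_keepalive(sock, idle=1, interval=1, count=5):
    """Turn on TCP keepalive and, where supported, tune its timers.

    idle: seconds of inactivity before the first probe is sent.
    interval: seconds between probes.
    count: unanswered probes before the connection is declared dead.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific, so only set the
    # ones this Python build actually exposes.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With these defaults a silent peer should be noticed after roughly idle + interval * count seconds. The failure then surfaces as an error on the next recv() or send(), so it still has to be combined with one of the polling approaches in the other answers.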

netstat -tanop will show that the socket is in keepalive mode:

tcp        0      0 127.0.0.1:6666          127.0.0.1:43746         ESTABLISHED 15242/python2.6     keepalive (0.76/0/0)

while tcpdump will show the keepalive probes:

01:07:08.143052 IP localhost.6666 > localhost.43746: . ack 1 win 2048 <nop,nop,timestamp 848683438 848683188>
01:07:08.143084 IP localhost.43746 > localhost.6666: . ack 1 win 2050 <nop,nop,timestamp 848683438 848682438>
01:07:09.143050 IP localhost.6666 > localhost.43746: . ack 1 win 2048 <nop,nop,timestamp 848683688 848683438>
01:07:09.143083 IP localhost.43746 > localhost.6666: . ack 1 win 2050 <nop,nop,timestamp 848683688 848682438>

#4

After struggling with a similar problem I found a solution that works for me, but it does require calling recv() in non-blocking mode and trying to read data, like this:

bytecount=recv(connectionfd,buffer,1000,MSG_NOSIGNAL|MSG_DONTWAIT);

MSG_NOSIGNAL is there so that an error does not terminate the program with a signal (it is documented for send(), where it suppresses SIGPIPE, so it is probably unnecessary on recv()), and MSG_DONTWAIT tells recv() not to block. In this mode, recv() returns one of three kinds of result:

-1 if there is no data to read, or on any other error.
0 if the other end has hung up nicely.
1 or more if there was some data waiting.

So by checking the return value, if it is 0 then that means the other end hung up. If it is -1 then you have to check the value of errno. If errno is equal to EAGAIN or EWOULDBLOCK then the connection is still believed to be alive by the server's tcp stack.
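
The same check can be written in Python, where the would-block case surfaces as a BlockingIOError exception rather than a -1 return value. A rough sketch, assuming a Unix-like platform where socket.MSG_DONTWAIT is defined; the function name poll_connection and its return labels are invented for this illustration. As with the C call, any data that is waiting gets consumed.

```python
import socket


def poll_connection(conn, bufsize=1000):
    """Do one non-blocking recv() and classify the result.

    Returns ("alive", b"") if nothing is waiting, ("closed", b"") if the
    peer hung up cleanly or the connection failed, or ("data", payload) if
    bytes were read (they are consumed from the socket).
    """
    try:
        payload = conn.recv(bufsize, socket.MSG_DONTWAIT)
    except BlockingIOError:
        # errno was EAGAIN/EWOULDBLOCK: the local TCP stack still
        # believes the connection is up.
        return ("alive", b"")
    except OSError:
        # Any other errno (for example a reset) means the connection is gone.
        return ("closed", b"")
    if not payload:
        return ("closed", b"")
    return ("data", payload)
```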

This solution would require you to put the call to recv() into your intensive data processing loop -- or somewhere in your code where it would get called 10 times a second or whatever you like, thus giving your program knowledge of a peer who hangs up.

This of course will do no good for a peer who goes away without doing the correct connection shutdown sequence, but any properly implemented tcp client will correctly terminate the connection.

Note also that if the client sends a bunch of data then hangs up, recv() will probably have to read that data all out of the buffer before it'll get the empty read.

#5

-1

You can select with a timeout of zero, and read with the MSG_PEEK flag.
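
A sketch of how those two pieces might fit together in Python: select with a zero timeout reports whether the socket is readable, and a one-byte recv with MSG_PEEK then distinguishes pending data from end-of-stream without removing anything from the kernel buffer. The function name peer_has_closed is hypothetical.

```python
import select
import socket


def peer_has_closed(conn):
    """Return True if the peer has closed the connection, without consuming data.

    Caveat: if unread data is still queued, the peek sees that data rather
    than the end-of-stream marker, so this returns False until it is read.
    """
    readable, _, _ = select.select([conn], [], [], 0)
    if not readable:
        # Nothing pending: no data and no close notification has arrived.
        return False
    try:
        # MSG_PEEK leaves any pending bytes in the kernel buffer.
        return conn.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        # A reset or aborted connection also counts as closed.
        return True
```

Called periodically from the CPU-intensive loop, this gives a cheap cancellation check. It only notices orderly shutdowns and resets, not a peer that silently disappears.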

I think you really should explain precisely what you mean by "not reading", and why the other answers are not satisfactory.

#6

-2

Check out the select module.
