Writing to a closed, local TCP socket not failing

I seem to be having a problem with my sockets. Below, you will see some code which forks a server and a client. The server opens a TCP socket, and the client connects to it and then closes it. Sleeps are used to coordinate the timing. After the client-side close(), the server tries to write() to its own end of the TCP connection. According to the write(2) man page, this should give me a SIGPIPE and an EPIPE errno. However, I don't see this. From the server's point of view, the write to a local, closed socket succeeds, and absent the EPIPE I can't see how the server should be detecting that the client has closed the socket.

In the gap between the client closing its end and the server attempting to write, a call to netstat will show that the connection is in a CLOSE_WAIT/FIN_WAIT2 state, so the server end should definitely be able to reject the write.

For reference, I'm on Debian Squeeze, uname -r is 2.6.39-bpo.2-amd64.

What's going on here?

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/tcp.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>

#include <netdb.h>

#define SERVER_ADDRESS "127.0.0.7"
#define SERVER_PORT 4777


#define myfail_if( test, msg ) do { if((test)){ fprintf(stderr, msg "\n"); exit(1); } } while (0)
#define myfail_unless( test, msg ) myfail_if( !(test), msg )

int connect_client( char *addr, int actual_port )
{
    int client_fd;

    struct addrinfo hint;
    struct addrinfo *ailist, *aip;


    memset( &hint, '\0', sizeof( struct addrinfo ) );
    hint.ai_socktype = SOCK_STREAM;

    myfail_if( getaddrinfo( addr, NULL, &hint, &ailist ) != 0, "getaddrinfo failed." );

    int connected = 0;
    for( aip = ailist; aip; aip = aip->ai_next ) {
        ((struct sockaddr_in *)aip->ai_addr)->sin_port = htons( actual_port );
        client_fd = socket( aip->ai_family, aip->ai_socktype, aip->ai_protocol );

        if( client_fd == -1) { continue; }
        if( connect( client_fd, aip->ai_addr, aip->ai_addrlen) == 0 ) {
            connected = 1;
            break;
        }
        close( client_fd );
    }

    freeaddrinfo( ailist );

    myfail_unless( connected, "Didn't connect." );
    return client_fd;
}


void client(){
    sleep(1);
    int client_fd = connect_client( SERVER_ADDRESS, SERVER_PORT );

    printf("Client closing its fd... ");
    myfail_unless( 0 == close( client_fd ), "close failed" );
    fprintf(stdout, "Client exiting.\n");
    exit(0);
}


int init_server( struct sockaddr * saddr, socklen_t saddr_len )
{
    int sock_fd;

    sock_fd = socket( saddr->sa_family, SOCK_STREAM, 0 );
    if ( sock_fd < 0 ){
        return sock_fd;
    }

    myfail_unless( bind( sock_fd, saddr, saddr_len ) == 0, "Failed to bind." );
    return sock_fd;
}

int start_server( const char * addr, int port )
{
    struct addrinfo *ailist, *aip;
    struct addrinfo hint;
    int sock_fd;

    memset( &hint, '\0', sizeof( struct addrinfo ) );
    hint.ai_socktype = SOCK_STREAM;
    myfail_if( getaddrinfo( addr, NULL, &hint, &ailist ) != 0, "getaddrinfo failed." );

    for( aip = ailist; aip; aip = aip->ai_next ){
        ((struct sockaddr_in *)aip->ai_addr)->sin_port = htons( port );
        sock_fd = init_server( aip->ai_addr, aip->ai_addrlen );
        if ( sock_fd > 0 ){
            break;
        } 
    }
    freeaddrinfo( ailist );  /* free the head of the list, not the loop cursor */

    myfail_unless( listen( sock_fd, 2 ) == 0, "Failed to listen" );
    return sock_fd;
}


int server_accept( int server_fd )
{
    printf("Accepting\n");
    int client_fd = accept( server_fd, NULL, NULL );
    myfail_unless( client_fd > 0, "Failed to accept" );
    return client_fd;
}


void server() {
    int server_fd = start_server(SERVER_ADDRESS, SERVER_PORT);
    int client_fd = server_accept( server_fd );

    printf("Server sleeping\n");
    sleep(60);

    printf( "Errno before: %s\n", strerror( errno ) );
    printf( "Write result: %d\n", write( client_fd, "123", 3 ) );
    printf( "Errno after:  %s\n", strerror( errno ) );

    close( client_fd );
}


int main(void){
    pid_t clientpid;
    pid_t serverpid;

    clientpid = fork();

    if ( clientpid == 0 ) {
        client();
    } else {
        serverpid = fork();

        if ( serverpid == 0 ) {
            server();
        }
        else {
            int clientstatus;
            int serverstatus;

            waitpid( clientpid, &clientstatus, 0 );
            waitpid( serverpid, &serverstatus, 0 );

            printf( "Client status is %d, server status is %d\n", 
                    clientstatus, serverstatus );
        }
    }

    return 0;
}

5 Answers

#1

This is what the Linux man page says about write and EPIPE:

   EPIPE  fd is connected to a pipe or socket whose reading end is closed.
          When this happens the writing process will also receive a SIGPIPE
          signal.  (Thus, the write return value is seen only if the
          program catches, blocks or ignores this signal.)

When Linux is using a pipe or a socketpair, it can and will check the reading end of the pair, as these two functions demonstrate:

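#include <signal.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

void test_socketpair () {
    int pair[2];
    socketpair(PF_LOCAL, SOCK_STREAM, 0, pair);
    close(pair[0]);
    /* MSG_NOSIGNAL suppresses SIGPIPE so the EPIPE error is visible. */
    if (send(pair[1], "a", 1, MSG_NOSIGNAL) < 0) perror("send");
}

void test_pipe () {
    int pair[2];
    pipe(pair);
    close(pair[0]);
    /* write() has no flags argument, so ignore SIGPIPE around the call. */
    signal(SIGPIPE, SIG_IGN);
    if (write(pair[1], "a", 1) < 0) perror("write");
    signal(SIGPIPE, SIG_DFL);
}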

Linux is able to do so, because the kernel has innate knowledge about the other end of the pipe or connected pair. However, when using connect, the state about the socket is maintained by the protocol stack. Your test demonstrates this behavior, but below is a program that does it all in a single thread, similar to the two tests above:

#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* Listening socket on 127.0.0.1:4321. */
    int a_sock = socket(PF_INET, SOCK_STREAM, 0);
    const int one = 1;
    setsockopt(a_sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    struct sockaddr_in a_sin = {0};
    a_sin.sin_port = htons(4321);
    a_sin.sin_family = AF_INET;
    a_sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bind(a_sock, (struct sockaddr *)&a_sin, sizeof(a_sin));
    listen(a_sock, 1);

    /* Client socket: start a non-blocking connect, then switch back to blocking. */
    int c_sock = socket(PF_INET, SOCK_STREAM, 0);
    fcntl(c_sock, F_SETFL, fcntl(c_sock, F_GETFL, 0)|O_NONBLOCK);
    connect(c_sock, (struct sockaddr *)&a_sin, sizeof(a_sin));
    fcntl(c_sock, F_SETFL, fcntl(c_sock, F_GETFL, 0)&~O_NONBLOCK);

    /* Server end of the connection; s_sin receives the client's address. */
    struct sockaddr_in s_sin = {0};
    socklen_t s_sinlen = sizeof(s_sin);
    int s_sock = accept(a_sock, (struct sockaddr *)&s_sin, &s_sinlen);

    /* Wait for the connect to finish and check whether it succeeded. */
    struct pollfd c_pfd = { c_sock, POLLOUT, 0 };
    if (poll(&c_pfd, 1, -1) != 1) perror("poll");
    int erropt = -1;
    socklen_t errlen = sizeof(erropt);
    getsockopt(c_sock, SOL_SOCKET, SO_ERROR, &erropt, &errlen);
    if (erropt != 0) { errno = erropt; perror("connect"); }

    /* Show the TCP state of both ends before and after each step. */
    puts("P|Recv-Q|Send-Q|Local Address|Foreign Address|State|");
    char cmd[256];
    snprintf(cmd, sizeof(cmd), "netstat -tn | grep ':%hu ' | sed 's/  */|/g'",
             ntohs(s_sin.sin_port));
    puts("before close on client"); system(cmd);
    close(c_sock);
    puts("after close on client"); system(cmd);
    if (send(s_sock, "a", 1, MSG_NOSIGNAL) < 0) perror("send");
    puts("after send on server"); system(cmd);
    puts("end of test");
    sleep(5);
    return 0;
}

If you run the above program, you will get output similar to this:

P|Recv-Q|Send-Q|Local Address|Foreign Address|State|
before close on client
tcp|0|0|127.0.0.1:35790|127.0.0.1:4321|ESTABLISHED|
tcp|0|0|127.0.0.1:4321|127.0.0.1:35790|ESTABLISHED|
after close on client
tcp|0|0|127.0.0.1:35790|127.0.0.1:4321|FIN_WAIT2|
tcp|1|0|127.0.0.1:4321|127.0.0.1:35790|CLOSE_WAIT|
after send on server
end of test

This shows it took one write for the sockets to transition to the CLOSED states. To find out why this occurred, a TCP dump of the transaction can be useful:

16:45:28 127.0.0.1 > 127.0.0.1
 .809578 IP .35790 > .4321: S 1062313174:1062313174(0) win 32792 <mss 16396,sackOK,timestamp 3915671437 0,nop,wscale 7>
 .809715 IP .4321 > .35790: S 1068622806:1068622806(0) ack 1062313175 win 32768 <mss 16396,sackOK,timestamp 3915671437 3915671437,nop,wscale 7>
 .809583 IP .35790 > .4321: . ack 1 win 257 <nop,nop,timestamp 3915671437 3915671437>
 .840364 IP .35790 > .4321: F 1:1(0) ack 1 win 257 <nop,nop,timestamp 3915671468 3915671437>
 .841170 IP .4321 > .35790: . ack 2 win 256 <nop,nop,timestamp 3915671469 3915671468>
 .865792 IP .4321 > .35790: P 1:2(1) ack 2 win 256 <nop,nop,timestamp 3915671493 3915671468>
 .865809 IP .35790 > .4321: R 1062313176:1062313176(0) win 0

The first three lines represent the 3-way handshake. The fourth line is the FIN packet the client sends to the server, and the fifth line is the ACK from the server, acknowledging receipt. The sixth line is the server trying to send 1 byte of data to the client with the PUSH flag set. The final line is the client RESET packet, which causes the TCP state for the connection to be freed, and is why the third netstat command did not result in any output in the test above.

So, the server doesn't know the client will reset the connection until after it tries to send some data to it. The reset happens because the client called close rather than something else.

The server cannot know for certain what system call the client has actually issued; it can only follow the TCP state. For example, we could replace the close call with a call to shutdown instead.

//close(c_sock);
shutdown(c_sock, SHUT_WR);

The difference between shutdown and close is that shutdown only governs the state of the connection, while close also governs the state of the file descriptor that represents the socket. A shutdown will not close a socket.
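
To make the distinction concrete, here is a minimal sketch of a half-close. It uses an AF_UNIX socketpair rather than the TCP connection from the test program, purely to keep the example short and deterministic; the function name `half_close_still_readable` is made up for this illustration.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if, after shutdown(SHUT_WR) on one end of a socketpair, the peer
 * sees end-of-file while the shut-down end can still receive data. */
int half_close_still_readable(void)
{
    int pair[2];
    char c = 0;
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) != 0)
        return 0;

    /* Half-close: pair[0] will send no more data, but the descriptor stays open. */
    if (shutdown(pair[0], SHUT_WR) == 0
        /* The peer reads end-of-file (0 bytes)... */
        && recv(pair[1], &c, 1, 0) == 0
        /* ...yet the peer can still send in the other direction... */
        && send(pair[1], "x", 1, 0) == 1
        /* ...and the half-closed descriptor can still receive that byte. */
        && recv(pair[0], &c, 1, 0) == 1
        && c == 'x')
        ok = 1;

    close(pair[0]);
    close(pair[1]);
    return ok;
}
```

The descriptor only becomes invalid once close is called on it; shutdown merely changes what the connection will carry.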

The output will be different with the shutdown change:

P|Recv-Q|Send-Q|Local Address|Foreign Address|State|
before close on client
tcp|0|0|127.0.0.1:4321|127.0.0.1:56355|ESTABLISHED|
tcp|0|0|127.0.0.1:56355|127.0.0.1:4321|ESTABLISHED|
after close on client
tcp|1|0|127.0.0.1:4321|127.0.0.1:56355|CLOSE_WAIT|
tcp|0|0|127.0.0.1:56355|127.0.0.1:4321|FIN_WAIT2|
after send on server
tcp|1|0|127.0.0.1:4321|127.0.0.1:56355|CLOSE_WAIT|
tcp|1|0|127.0.0.1:56355|127.0.0.1:4321|FIN_WAIT2|
end of test

The TCP dump will also show something different:

17:09:18 127.0.0.1 > 127.0.0.1
 .722520 IP .56355 > .4321: S 2558095134:2558095134(0) win 32792 <mss 16396,sackOK,timestamp 3917101399 0,nop,wscale 7>
 .722594 IP .4321 > .56355: S 2563862019:2563862019(0) ack 2558095135 win 32768 <mss 16396,sackOK,timestamp 3917101399 3917101399,nop,wscale 7>
 .722615 IP .56355 > .4321: . ack 1 win 257 <nop,nop,timestamp 3917101399 3917101399>
 .748838 IP .56355 > .4321: F 1:1(0) ack 1 win 257 <nop,nop,timestamp 3917101425 3917101399>
 .748956 IP .4321 > .56355: . ack 2 win 256 <nop,nop,timestamp 3917101426 3917101425>
 .764894 IP .4321 > .56355: P 1:2(1) ack 2 win 256 <nop,nop,timestamp 3917101442 3917101425>
 .764903 IP .56355 > .4321: . ack 2 win 257 <nop,nop,timestamp 3917101442 3917101442>
17:09:23
 .786921 IP .56355 > .4321: R 2:2(0) ack 2 win 257 <nop,nop,timestamp 3917106464 3917101442>

Notice that the reset at the end comes 5 seconds after the last ACK packet. This reset is due to the program exiting without properly closing the sockets. What differs from the previous trace is the ACK packet from the client to the server before the reset: it indicates that the client did not use close. In TCP, a FIN really only indicates that there is no more data to be sent. But since a TCP connection is bi-directional, the server that receives the FIN assumes the client can still receive data. In the case above, the client does in fact accept the data.

Whether the client uses close or SHUT_WR to issue a FIN, in either case you can detect the arrival of the FIN by polling on the server socket for a readable event. If after calling read the result is 0, then you know the FIN has arrived, and you can do what you wish with that information.

struct pollfd s_pfd = { s_sock, POLLIN|POLLOUT, 0 };
if (poll(&s_pfd, 1, -1) != 1) perror("poll");
if (s_pfd.revents & POLLIN) {   /* test the bit with &, not | */
    char c;
    int r;
    /* Drain any pending bytes one at a time without blocking. */
    while ((r = recv(s_sock, &c, 1, MSG_DONTWAIT)) == 1) {}
    if (r == 0) { /*...FIN received...*/ }
    else if (errno == EAGAIN) { /*...no more data to read for now...*/ }
    else { /*...some other error...*/ perror("recv"); }
}

Now, it is trivially true that if the server issues SHUT_WR with shutdown before it tries to do a write, it will in fact get the EPIPE error.

shutdown(s_sock, SHUT_WR);
if (send(s_sock, "a", 1, MSG_NOSIGNAL) < 0) perror("send");

If, instead, you want the client to indicate an immediate reset to the server, you can force that to happen on most TCP stacks by enabling the linger option, with a linger timeout of 0 prior to calling close.

struct linger lo = { 1, 0 };
setsockopt(c_sock, SOL_SOCKET, SO_LINGER, &lo, sizeof(lo));
close(c_sock);

With the above change, the output of the program becomes:

P|Recv-Q|Send-Q|Local Address|Foreign Address|State|
before close on client
tcp|0|0|127.0.0.1:35043|127.0.0.1:4321|ESTABLISHED|
tcp|0|0|127.0.0.1:4321|127.0.0.1:35043|ESTABLISHED|
after close on client
send: Connection reset by peer
after send on server
end of test

The send gets an immediate error in this case, but it is not EPIPE, it is ECONNRESET. The TCP dump reflects this as well:

17:44:21 127.0.0.1 > 127.0.0.1
 .662163 IP .35043 > .4321: S 498617888:498617888(0) win 32792 <mss 16396,sackOK,timestamp 3919204411 0,nop,wscale 7>
 .662176 IP .4321 > .35043: S 497680435:497680435(0) ack 498617889 win 32768 <mss 16396,sackOK,timestamp 3919204411 3919204411,nop,wscale 7>
 .662184 IP .35043 > .4321: . ack 1 win 257 <nop,nop,timestamp 3919204411 3919204411>
 .691207 IP .35043 > .4321: R 1:1(0) ack 1 win 257 <nop,nop,timestamp 3919204440 3919204411>

The RESET packet comes right after the 3-way handshake completes. However, using this option has its dangers. If the other end has unread data in the socket buffer when the RESET arrives, that data will be purged, causing the data to be lost. Forcing a RESET to be sent is usually used in request/response style protocols. The sender of the request can know there can be no data lost when it receives the entire response to its request. Then, it is safe for the request sender to force a RESET to be sent on the connection.

#2

You have two sockets: one for the client and one for the server. Your client is doing the active close, which means TCP's connection termination has been started by the client (a TCP FIN segment has been sent from the client).

At this stage the client socket is in the FIN_WAIT1 state (moving to FIN_WAIT2 once the server's ACK arrives). What is the state of the server socket? It is in the CLOSE_WAIT state, so the server socket is not closed.

The FIN from the server has not been sent yet, because the application has not closed the socket. At this stage you are writing to the server socket, so you do not get an error.

If you want to see an error, just call close(client_fd) before writing to the socket.

close(client_fd);
printf( "Write result: %d\n", write( client_fd, "123", 3 ) );

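Here the server socket is no longer in the CLOSE_WAIT state, so write returns a negative value to indicate the error. Note that the error in this case is EBADF rather than EPIPE, because the file descriptor itself has been closed. I hope this clarifies things.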
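
As a small check of which errno a write reports once the process has closed its own descriptor, here is a sketch using an AF_UNIX socketpair in place of the accepted TCP socket (an assumption made only to keep it self-contained). The function name `errno_of_write_after_own_close` is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the errno produced by write() on a descriptor that this process
 * has already closed, 0 if the write unexpectedly succeeds, or -1 if the
 * socketpair could not be created. */
int errno_of_write_after_own_close(void)
{
    int pair[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) != 0)
        return -1;

    close(pair[0]);                        /* we close our own descriptor */
    errno = 0;
    ssize_t n = write(pair[0], "123", 3);  /* the descriptor number is now invalid */
    int err = (n < 0) ? errno : 0;

    close(pair[1]);
    return err;
}
```

The failure comes from the descriptor table, before the socket layer is consulted at all, so it says nothing about what the peer did.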

#3

Once you have called write() for the first time after the client close()d the socket (as coded in your example), any subsequent call to write() will give you the expected EPIPE and SIGPIPE.

Just try adding another write() to provoke the error:

...
printf( "Errno before: %s\n", strerror( errno ) );
printf( "Write result: %d\n", write( client_fd, "123", 3 ) );
printf( "Errno after:  %s\n", strerror( errno ) );

printf( "Errno before: %s\n", strerror( errno ) );
printf( "Write result: %d\n", write( client_fd, "A", 1 ) );
printf( "Errno after:  %s\n", strerror( errno ) );
...

The output will be:

Accepting
Server sleeping
Client closing its fd... Client exiting.
Errno before: Success
Write result: 3
Errno after:  Success
Errno before: Success
Client status is 0, server status is 13

The output of the last two printf()s is missing as the process terminates due to SIGPIPE being raised by the second call to write(). To avoid the termination of the process, you might like to make the process ignore SIGPIPE.
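
A minimal single-process sketch of that suggestion follows. It assumes Linux behavior over the loopback interface, picks an ephemeral port instead of the question's 4777, and uses a hypothetical function name `second_write_errno`; the two poll calls are only there to wait for the FIN and then the RST instead of sleeping.

```c
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connects a loopback TCP pair, closes the client end, then writes twice on
 * the server end with SIGPIPE ignored. Returns the errno of the second
 * write, 0 if it unexpectedly succeeded, or -1 on a setup failure. */
int second_write_errno(void)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    struct pollfd pfd;
    int result = -1;
    int s_sock = -1;

    signal(SIGPIPE, SIG_IGN);   /* a failing write() returns EPIPE instead of killing us */

    int l_sock = socket(AF_INET, SOCK_STREAM, 0);
    int c_sock = socket(AF_INET, SOCK_STREAM, 0);
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = 0;           /* let the kernel pick a free port */

    if (l_sock < 0 || c_sock < 0
        || bind(l_sock, (struct sockaddr *)&sin, sizeof(sin)) != 0
        || listen(l_sock, 1) != 0
        || getsockname(l_sock, (struct sockaddr *)&sin, &len) != 0
        || connect(c_sock, (struct sockaddr *)&sin, sizeof(sin)) != 0
        || (s_sock = accept(l_sock, NULL, NULL)) < 0)
        goto out;

    close(c_sock);              /* client closes: a FIN goes to the server */
    c_sock = -1;

    pfd.fd = s_sock;
    pfd.events = POLLIN;
    pfd.revents = 0;
    poll(&pfd, 1, 1000);        /* wait until the FIN shows up as readable EOF */

    if (write(s_sock, "123", 3) != 3)   /* first write is accepted locally */
        goto out;

    pfd.events = 0;             /* POLLERR and POLLHUP are reported regardless */
    poll(&pfd, 1, 1000);        /* wait for the peer's RST to arrive */

    errno = 0;
    result = (write(s_sock, "A", 1) < 0) ? errno : 0;

out:
    if (c_sock >= 0) close(c_sock);
    if (s_sock >= 0) close(s_sock);
    if (l_sock >= 0) close(l_sock);
    return result;
}
```

With SIGPIPE ignored, the second write is expected to return -1 with errno set to EPIPE on Linux, so the server can handle the closed connection in ordinary error-handling code.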

#4

I suspect that what's happening is the server side socket is still valid so your write call is making a valid attempt at writing to your file descriptor even though your TCP session is in a closed state. If I am completely wrong let me know.

#5

I guess that you're running into the TCP stack detecting a failed send and attempting retransmission. Do subsequent calls to write() fail silently? In other words, try writing five times to the closed socket and see if you eventually get a SIGPIPE. And when you say the write 'succeeds', do you get a return result of 3?
