c++ kafka 客户端rdkafka报Receive failed: Disconnected问题原因以及解决方法

时间:2021-12-22 06:12:01

%3|1538976114.812|FAIL|rdkafka#producer-1| [thrd:kafka-server:9092/bootstrap]: kafka-server:9092/0: Receive failed: Disconnected
%3|1538976114.812|ERROR|rdkafka#producer-1| [thrd:kafka-server:9092/bootstrap]: kafka-server:9092/0: Receive failed: Disconnected
%3|1538976114.812|ERROR|rdkafka#producer-1| [thrd:kafka-server:9092/bootstrap]: 1/1 brokers are down

原因:

Why am I seeing Receive failed: Disconnected?

If the remote peer, typically the broker (but could also be an active TCP gateway of some kind), closes the connection you'll see a log message like this:

%3|1500588440.537|FAIL|rdkafka#producer-1| 10.255.84.150:9092/1: Receive failed: Disconnected

There are a number of possible reasons, in order of how common they are:

  • Broker's idle connection reaper closes the connection due to inactivity. This is controlled by the broker configuration property connections.max.idle.ms and defaults to 10 minutes. This is by far the most common reason for spontaneous disconnects.
  • The client sent an unsupported protocol request; see Broker version compatibility. This is considered a configuration error on the client. The broker should log an exception explaining why the connection was closed, see the broker logs.
  • The client sent a malformed protocol request; this is an indication of a bug in the client. The broker should log an exception explaining why the connection was closed, see the broker logs.
  • The broker is in an invalid state. The broker should log an exception explaining why the connection was closed, see the broker logs.
  • TCP gateway/load-balancer/firewall session timeout. Try enabling TCP keep-alives on the client by setting socket.keepalive.enable=true.

Since a TCP close can't signal why the remote peer closed the connection there is no way for the client to know what went wrong. If the disconnect logs are getting annoying and the admin deems they are caused by the idle connection reaper, the log.connection.close client configuration property can be set to false to silence all spontaneous disconnect logs.

NOTE: Whenever a connection is closed for whatever reason, librdkafka will automatically reconnect after reconnect.backoff.jitter.ms (default 500ms).

简单的说,其中之一是服务器会kill掉10分钟空闲的连接,librdkafka会在连接断开后500毫秒内重连。所以,根本解决方法就是没事每分钟发个心跳信息。

参考:https://github.com/edenhill/librdkafka/wiki/FAQ#why-am-i-seeing-receive-failed-disconnected