Apache Kafka集群启动失败

I'm trying to start a spark streaming session which consumes from a Kafka queue and I'm using Zookeeper for config mgt. However, when I try to start this following exception is being thrown.

我正在尝试启动一个spark流会话，该会话使用Kafka队列，我正在使用Zookeeper进行配置管理。然而，当我尝试启动这个异常时，抛出了以下异常。

18/03/26 09:25:49 INFO ZookeeperConnection: Checking Kafka topic core-data-tickets does exists ...

18/03/26 09:25:49 INFO Broker: Kafka topic core-data-tickets exists
18/03/26 09:25:49 INFO Broker: Processing topic : core-data-tickets
18/03/26 09:25:49 WARN ZookeeperConnection: Resetting Topic Offset
org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/clt/offsets/core-data-tickets/4
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
    at kafka.utils.ZkUtils$.readData(ZkUtils.scala:443)
    at kafka.utils.ZkUtils.readData(ZkUtils.scala)
    at net.core.data.connection.ZookeeperConnection.readTopicPartitionOffset(ZookeeperConnection.java:145)

I have already created the relevant Kafka topic.

我已经创建了相关的Kafka主题。

Any insights on this would be highly appreciated.

对此的任何见解都将受到高度赞赏。

I'm using the following code to run the spark job

我正在使用以下代码运行spark作业

spark-submit --class net.core.data.compute.Broker     --executor-memory 512M     --total-executor-cores 2     --driver-java-options "-Dproperties.path=/ebs/tmp/continuous-loading-tool/continuous-loading-tool/src/main/resources/dev.properties"  --conf spark.ui.port=4045   /ebs/tmp/dev/data/continuous-loading-tool/target/continuous-loading-tool-1.0-SNAPSHOT.jar

1 个解决方案

#1

I guess that this error has to do with offsets retention. By default, offsets are stored for only 1440 minutes (i.e. 24 hours). Therefore, if the group has not committed offsets within a day, Kafka won't have information about it.

我猜这个错误与偏移保留有关。默认情况下，偏移量仅存储1440分钟(即24小时)。因此，如果该组织在一天内没有承诺补偿，Kafka就不会有关于它的信息。

A possible workaround is to set the value of offsets.retention.minutes accordingly.

一种可能的解决方法是设置off .retention的值。相应的分钟。

offsets.retention.minutes

offsets.retention.minutes

Offsets older than this retention period will be discarded

超过此保留期的偏移量将被丢弃

#1