How many producers can I use to write to one topic?

Time: 2023-01-12 18:56:50

I have a web application which puts messages into a Kafka topic. There are a lot of instances of this application (200), and each of them contains its own Kafka Producer.

Questions:

  1. Is there any upper bound on the number of Producers per topic?

  2. Does the number of Producers affect Kafka performance? If yes, how?

  3. What is the best practice for Producers? One synchronous producer per application, an asynchronous producer, or a custom pool of sync producers?

1 answer

#1


Is there any upper bound on the number of Producers per topic?

The only limit I am aware of is the number of available IP addresses. You are unlikely to hit any practical limit with the application you describe.

Does the number of Producers affect Kafka performance? If yes, how?

No, all other things being equal (traffic volume, asynchronous vs. synchronous sending, including batch size and time constraints, etc.).

Presumably there is some overhead somewhere for each connection, but it's small enough that I've never managed to notice it.
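To put a rough number on that connection overhead, here is a back-of-the-envelope sketch. It assumes each producer ends up holding roughly one TCP connection to every broker it talks to, which is a simplification, and the broker count of 5 is purely hypothetical (the question does not say how large the cluster is). The function name is made up for illustration.

```python
def estimate_connections(producers, brokers):
    """Rough upper bound on producer-to-broker TCP connections.

    Assumption (not from the original post): every producer keeps about
    one connection open to every broker in the cluster.
    """
    total = producers * brokers   # connections across the whole cluster
    per_broker = producers        # connections each single broker sees
    return total, per_broker


# 200 application instances (from the question), 5 brokers (hypothetical).
total, per_broker = estimate_connections(200, 5)
print(f"cluster-wide: {total}, per broker: {per_broker}")
```

Under these assumptions each broker sees about 200 producer connections, which is small next to typical OS file-descriptor limits and fits the observation that the overhead is hard to notice.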

What is the best practice for Producers (one sync producer per application, an async producer, or a custom pool of sync producers)?

That depends a lot on your use case, which I am not clear on. For the most part, asynchronous beats synchronous. If you choose asynchronous, you have to deal with the risks of batching on the producers (i.e. data loss if a producer dies with unsent messages still buffered), and with the delays from building up enough messages for a batch or waiting for the batch timeout to trigger. Those delays could be significant if your use case is sufficiently demanding.
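The batching trade-off described above can be illustrated with a toy model. This is only a sketch in plain Python, not a Kafka client: the class `ToyBatchingProducer` and its parameters are invented for illustration, it counts messages rather than bytes, and time is passed in by hand as integer milliseconds. In the Java client the analogous settings are `batch.size` (in bytes) and `linger.ms`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBatchingProducer:
    """Toy model of client-side batching (hypothetical, not a Kafka API)."""
    batch_size: int   # flush once this many messages are buffered
    linger_ms: int    # or once the oldest buffered message is this old
    buffer: list = field(default_factory=list)     # (enqueue_time, payload)
    delivered: list = field(default_factory=list)  # (payload, delay_ms)

    def send(self, now, payload):
        self.tick(now)  # an expired batch goes out before we buffer more
        self.buffer.append((now, payload))
        if len(self.buffer) >= self.batch_size:
            self._flush(now)

    def tick(self, now):
        """Fire the linger timer if the oldest message has waited long enough."""
        if self.buffer and now - self.buffer[0][0] >= self.linger_ms:
            # The timer would have fired exactly linger_ms after the oldest message.
            self._flush(self.buffer[0][0] + self.linger_ms)

    def _flush(self, flush_time):
        for enqueued_at, payload in self.buffer:
            self.delivered.append((payload, flush_time - enqueued_at))
        self.buffer.clear()

    def crash(self):
        """Simulate the process dying: whatever is still buffered is lost."""
        lost = [payload for _, payload in self.buffer]
        self.buffer.clear()
        return lost


def demo():
    p = ToyBatchingProducer(batch_size=3, linger_ms=50)
    p.send(0, "a")
    p.send(10, "b")
    p.send(20, "c")   # batch is full, so it is flushed at t=20
    p.send(30, "d")
    p.tick(100)       # linger for "d" expired at t=80
    p.send(110, "e")  # still buffered when the process dies
    return p.delivered, p.crash()


delivered, lost = demo()
print(delivered)  # [('a', 20), ('b', 10), ('c', 0), ('d', 50)]
print(lost)       # ['e']
```

In this run "a" waits 20 ms for its batch to fill, "d" waits the full 50 ms linger because no other traffic arrives, and "e" is lost in the simulated crash. A synchronous send would avoid both the wait and the loss window, at the cost of one round trip per message.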
