Are there any tools for optimizing the number of consumer and producer threads on a JMS queue?

I'm working on an application that is distributed over two JBoss instances and that produces/consumes JMS messages on several JMS queues.

When we configured the application we had to choose a threading model, in particular the number of producing and consuming threads per queue. We did this in a rather ad hoc fashion, but after reading Herb Sutter's most recent columns in Dr. Dobb's (in particular this one) I would like to size our thread pools more rigorously.

Are there any methods/tools to measure the throughput of JMS queues (in particular JBoss Messaging queues) as a function of the number of producing/consuming threads?

2 answers

#1

This is not really about a specific tool, but it may be helpful.

Consumers:

I'm not sure what your internal architecture is, but let's assume an MDB is reading the messages. I'd assert that the only thing you need for rigorous thread count sizing here is a maximum cap. If your MDB draws on a finite supplier such as a JDBC connection pool, set the cap to the highest number of concurrent instances of that resource you can tolerate it taking. If the MDB's queue is remote, you probably want to treat remote connections (or, technically, JMS sessions) as a finite resource too. If the MDB has no such hard limits (and the queue is local), the cap is set by the number of threads, the memory used and/or the raw CPU consumed by the working threads. The reasoning is that the JBoss MDB container will simply keep allocating more MDB instances (and therefore threads) until the queue is empty or the maximum cap is reached. The only reason I can think of to agonize over the minimum is if the time or overhead the container needs to create new instances exceeds your tolerance, and those operations are usually pretty small potatoes.
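
To make the "pick a maximum cap" reasoning concrete, here is a minimal sketch that treats the cap as the smallest budget across whatever finite resources the consumers share. The class name, resource names and numbers are all made up for illustration; none of this is a JBoss API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: the consumer cap is the tightest of the resource budgets.
final class ConsumerCap {

    /** Returns the smallest budget, i.e. the resource that runs out first. */
    static int maxConsumers(Map<String, Integer> budgets) {
        if (budgets.isEmpty()) {
            throw new IllegalArgumentException("at least one resource budget is required");
        }
        int cap = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : budgets.entrySet()) {
            if (e.getValue() < 1) {
                throw new IllegalArgumentException("budget for " + e.getKey() + " must be >= 1");
            }
            cap = Math.min(cap, e.getValue());
        }
        return cap;
    }

    public static void main(String[] args) {
        // Assumed numbers: how many of each resource the consumers may hold at once.
        Map<String, Integer> budgets = new LinkedHashMap<>();
        budgets.put("jdbcConnections", 20);
        budgets.put("remoteJmsSessions", 15);
        budgets.put("workerThreads", 50);
        System.out.println("max concurrent consumers: " + maxConsumers(budgets));
    }
}
```

The resulting number would then be applied through whatever MDB pool or session limit your JBoss version exposes.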

Producers:

A general axiom of messaging is that producers nearly always outperform consumers. That may sound arbitrary, but it is a pattern I see recurring all the time, even in widely different messaging scenarios. Anyway, it's tough to say how the threading should work for the producer without knowing a bit about the application. Can you keep increasing the number of producer threads [indefinitely] and get proportionally more messages, or is there some cap beyond which additional threads simply do not generate more messages? I would guess the latter, since most useful work depends on some limited supplier of data or computation. As I see it, the two drivers here are ordering and persistence.

First off, if you have strict message ordering, where messages must be processed strictly First Produced First Processed (FPFP), then you're in a bit of a bind: you almost have to drop down to single-threaded throughput unless you can devise some form of logical message demarcation. For example, use a client number so that any given client's messages are always sent to the same queue; you may then have multiple queues, each serviced by one thread, so each client is effectively FPFP.
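
The client-number idea can be sketched roughly as follows. This is only an in-process illustration: single-thread executors stand in for the "one queue, one thread" pairs, and the class and method names are hypothetical. With real JMS you would route to one of several queues by the same rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical dispatcher: each lane is a FIFO served by exactly one thread,
// so work submitted for the same client runs in submission order.
final class PartitionedDispatcher {
    private final List<ExecutorService> lanes = new ArrayList<>();

    PartitionedDispatcher(int laneCount) {
        if (laneCount < 1) {
            throw new IllegalArgumentException("laneCount must be >= 1");
        }
        for (int i = 0; i < laneCount; i++) {
            lanes.add(Executors.newSingleThreadExecutor());
        }
    }

    /** The same client number always maps to the same lane (also for negative numbers). */
    int laneFor(long clientNumber) {
        return (int) Math.floorMod(clientNumber, (long) lanes.size());
    }

    void submit(long clientNumber, Runnable work) {
        lanes.get(laneFor(clientNumber)).execute(work);
    }

    /** Stops accepting work and waits (up to 10 s per lane) for queued work to finish. */
    void shutdownAndWait() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                lane.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        PartitionedDispatcher dispatcher = new PartitionedDispatcher(4);
        for (int seq = 0; seq < 3; seq++) {
            for (long client = 100; client < 103; client++) {
                final int s = seq;
                final long c = client;
                dispatcher.submit(c, () -> System.out.println("client " + c + " message " + s));
            }
        }
        dispatcher.shutdownAndWait();
    }
}
```

Output from different clients may interleave in any order, but each client's messages come out in the order they were submitted, which is the per-client FPFP guarantee described above.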

Ordering aside, persistence is the next consideration. If you have reliable and extensive message persistence (or a very high tolerance for message loss), just let the producer threads go to town. The messages will queue up reliably and eventually the consumers will [hopefully] catch up. However, if the number of persisted messages or the plain queue depth gives you the willies when it gets too high, here's where a tool might come in useful. If your producer thread count can be modified dynamically (which it can in many Java thread pool implementations), you could sample the queue depths and raise or lower the producer thread count according to the queue depth ranges you define, optionally to the point where, if the consumers basically stall, so do the producers. I do not know of a specific tool that does this, but between two JBoss servers it is fairly simple to whip up. Picking your mapping from queue depth to producer thread count will be trickier.
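
A rough sketch of that feedback loop, assuming the producers run as tasks on a `java.util.concurrent.ThreadPoolExecutor`. The watermarks, thread counts and sampled depths are invented for illustration, and how you actually read the queue depth (for example from the provider's management interface) is left out.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical throttle: maps a sampled queue depth to a producer thread count.
final class ProducerThrottle {

    /**
     * Linear mapping: at or below lowWater use maxThreads, at or above highWater
     * use minThreads, and interpolate in between.
     */
    static int targetThreads(long depth, long lowWater, long highWater,
                             int minThreads, int maxThreads) {
        if (lowWater >= highWater || minThreads < 1 || minThreads > maxThreads) {
            throw new IllegalArgumentException("inconsistent limits");
        }
        if (depth <= lowWater) {
            return maxThreads;
        }
        if (depth >= highWater) {
            return minThreads;
        }
        double fraction = (double) (highWater - depth) / (highWater - lowWater);
        return minThreads + (int) Math.round(fraction * (maxThreads - minThreads));
    }

    /** Resizes the pool, keeping core <= max at every step. */
    static void resize(ThreadPoolExecutor pool, int target) {
        if (target > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(target); // grow: raise the ceiling first
            pool.setCorePoolSize(target);
        } else {
            pool.setCorePoolSize(target);    // shrink: lower the floor first
            pool.setMaximumPoolSize(target);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor producers = new ThreadPoolExecutor(
                8, 8, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        long[] sampledDepths = {50, 400, 900, 5000, 200}; // made-up samples
        for (long depth : sampledDepths) {
            int target = targetThreads(depth, 100, 1000, 1, 8);
            resize(producers, target);
            System.out.println("depth=" + depth + " -> producer threads=" + target);
        }
        producers.shutdown();
    }
}
```

A `ThreadPoolExecutor` cannot be sized to zero threads, so this sketch bottoms out at one producer thread. Stalling the producers completely would need something extra, such as a pause flag the producer tasks check before sending.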

Having said all that, I am going to actually read the article you linked to.....

#2

I've got the perfect thing for you: IBM provide a free command line tool called perfharness.

It's aimed at benchmarking JMS providers, i.e. measuring the throughput of queues (single or multiple) given different numbers of producing or consuming threads.

Some features:

Send and consume messages at a fixed rate (msg/s) or at maximum rate possible on the queue

Use a specific number of threads

Use either JMS or native MQ

Can use data either generated randomly or taken from a file

Generates statistics telling you exactly how fast your queue is performing

The only downside is that it's not super intuitive, given the number of operations it supports. And IBM haven't open sourced it, which is a shame. However, it sounds perfect for your purposes.
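
If you want a feel for what "throughput as a function of thread count" looks like before setting up a real harness, here is a toy sketch. It is not perfharness and does not touch JMS: an in-memory `ArrayBlockingQueue` stands in for the queue, and the class name, message counts and thread counts are arbitrary assumptions. The numbers it prints say nothing about JBoss Messaging; only the shape of the experiment carries over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Toy probe: N producers and M consumers share one bounded in-memory queue.
final class ThroughputProbe {
    private static final String STOP = "__stop__"; // poison pill that ends a consumer

    /** Runs one experiment and returns the number of messages consumed. */
    static long run(int producers, int consumers, int perProducer) {
        if (producers < 1 || consumers < 1) {
            throw new IllegalArgumentException("need at least one producer and one consumer");
        }
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong consumed = new AtomicLong();
        List<Thread> producerThreads = new ArrayList<>();
        List<Thread> consumerThreads = new ArrayList<>();

        for (int p = 0; p < producers; p++) {
            producerThreads.add(new Thread(() -> {
                try {
                    for (int i = 0; i < perProducer; i++) {
                        queue.put("msg"); // blocks while the queue is full
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (int c = 0; c < consumers; c++) {
            consumerThreads.add(new Thread(() -> {
                try {
                    while (!STOP.equals(queue.take())) {
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        try {
            for (Thread t : consumerThreads) t.start();
            for (Thread t : producerThreads) t.start();
            for (Thread t : producerThreads) t.join();
            // All real messages are now enqueued; send one pill per consumer.
            for (int c = 0; c < consumers; c++) queue.put(STOP);
            for (Thread t : consumerThreads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        int perProducer = 100_000;
        for (int threads : new int[] {1, 2, 4, 8}) {
            long start = System.nanoTime();
            long count = run(threads, threads, perProducer);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d producers / %d consumers: %.0f msg/s%n",
                    threads, threads, count / seconds);
        }
    }
}
```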
