Presto performance tuning: queries are much slower when executed in parallel

I have a Presto cluster with 12 workers that is queried by Java applications. The cluster can run 30 concurrent requests (any beyond that are queued).

The applications might send around 80-100 distinct queries, which I expect the cluster to handle.

Problem: When the queries are run sequentially, they complete significantly faster than when they are run in parallel.

For instance, if I run 100 queries sequentially, each of them takes 1-12 seconds and they all complete in around 2 minutes. But if I start all of them in parallel, it takes around 8-12 minutes to complete them all. In extreme cases it takes up to 30 minutes.

If I look at the Presto console, I see that most of the queries are blocked and only 1-3 are actually in the Running state.

Unfortunately, I can't post any of the queries. They usually access several schemas (up to 6 in one query) and are full of joins and nested queries. That said, most of them are written following Presto best practices.

Question: How can I improve performance? Or at least, which areas should I investigate to find the root cause?

Here are some metrics for one of the slowest queries (maybe the numbers will tell you something).

Resource Utilization Summary

CPU Time            8.42m
Scheduled Time      26.04m
Blocked Time        4.77d
Input Rows          298M
Input Data          9.94GB
Raw Input Rows      323M
Raw Input Data      4.34GB
Peak Memory         10.18GB
Memory Pool         reserved
Cumulative Memory   181G seconds

Timeline

Parallelism         477
Scheduled Time/s    1.47K
Input Rows/s        281K
Input Bytes/s       9.60MB
Memory Utilization  0B
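
To put the three time figures above on one scale, here is a small Java sketch that converts them to minutes and compares them. The class and method names are made up for illustration. The interpretation is also an assumption: these totals are presumably summed over all parallel threads of the query, which would explain how Blocked Time can exceed the wall-clock duration.

```java
public class QueryTimeRatios {
    // Convert a duration in days to minutes.
    static double daysToMinutes(double days) {
        return days * 24 * 60;
    }

    // Plain ratio of two durations expressed in the same unit.
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Values copied from the Resource Utilization Summary above.
        double cpuMin = 8.42;
        double scheduledMin = 26.04;
        double blockedMin = daysToMinutes(4.77);

        System.out.printf("blocked time in minutes: %.1f%n", blockedMin);
        System.out.printf("scheduled / cpu: %.2f%n", ratio(scheduledMin, cpuMin));
        System.out.printf("blocked / scheduled: %.0f%n", ratio(blockedMin, scheduledMin));
    }
}
```

Blocked time comes out at roughly 6,869 minutes, a few hundred times the scheduled time, so this query spent almost all of its lifetime waiting rather than computing.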

1 Answer

#1

Figured out the issue myself.

Presto is a distributed SQL query engine, and the key word here is distributed. It is built so that a single query is distributed efficiently among the workers and executed at high speed.

Running many queries in parallel and expecting Presto to figure out how to parallelize them efficiently is a misuse. That is more of a relational database approach, which unfortunately doesn't work in Presto.
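
If the Java applications still need to issue many queries, one possible way to avoid flooding the cluster is to cap how many are in flight on the client side. The following is only a minimal sketch using a fixed thread pool. `BoundedQuerySubmitter` and `runBounded` are hypothetical names, the `Callable` tasks are placeholders for whatever code actually executes a query, and the cap of 3 is an arbitrary value to tune, not a recommendation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedQuerySubmitter {
    /**
     * Runs every task with at most maxConcurrent in flight and returns
     * the highest number of tasks observed running at the same time.
     */
    static int runBounded(List<Callable<Void>> queries, int maxConcurrent) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (Callable<Void> query : queries) {
                futures.add(pool.submit(() -> {
                    // Record how many tasks are running right now.
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        return query.call();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }));
            }
            for (Future<Void> future : futures) {
                future.get(); // a failed task surfaces here as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Void>> queries = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            queries.add(() -> {
                Thread.sleep(10); // placeholder for executing one Presto query
                return null;
            });
        }
        int peak = runBounded(queries, 3);
        System.out.println("peak concurrent queries: " + peak);
    }
}
```

The pool size acts as the concurrency limit: tasks beyond it wait in the executor's queue on the client instead of piling up as blocked queries on the cluster.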
