I have a query of the following shape:
A
UNION ALL
B
UNION ALL
C
UNION ALL
D
I would like to parallelize it by running A, B, C and D in parallel and then combining the results in .NET. All four queries read from the same pool of tables.
As a first test, I executed A in one SSMS tab and B in a second tab. A and B each take about 1 minute, so I have time to start A, switch to the second tab and start B.
But when I look at CPU usage, it stalls at 13% (I'm on an 8-core CPU, on Windows), which corresponds to 100% of one core.
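To make the arithmetic behind that 13% figure explicit, here is a small sketch in Python. The function name `total_cpu_percent` is made up for illustration; it only assumes that overall CPU usage is the average across all cores.

```python
def total_cpu_percent(busy_cores: float, total_cores: int) -> float:
    """Overall CPU usage (in percent) when `busy_cores` cores are fully busy.

    Assumption: the overall figure is the plain average over all cores.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return 100.0 * busy_cores / total_cores


one_core = total_cpu_percent(1, 8)   # one saturated core out of eight
two_cores = total_cpu_percent(2, 8)  # what two saturated cores would show
```

On an 8-core machine one saturated core is 12.5% of the total, which a monitoring tool would display as roughly 13%. Two queries each saturating their own core would show about 25%.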
My questions:
- Do the locking mechanisms allow two queries on a given set of tables to run in parallel?
- If parallelism is possible, would I notice it by running A and B at the same time in two SSMS tabs, or is my test flawed?
- How could the performance of that UNION ALL series be improved?
1 Answer
#1
3
You don't force parallelism at all. The engine decides whether to use it based on the estimated query cost.
See: How to run a parallel query on SQL Server 2008?
In this case, you could open 4 connections to the database, run the 4 queries and append the results. You'd use .NET parallel processing to run them and then combine the result sets.
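A minimal sketch of that client-side approach, written in Python rather than .NET so that it is self-contained. Everything here is an assumption for illustration: `sqlite3` stands in for SQL Server, the `items` table and the four `BRANCHES` queries are made-up stand-ins for A, B, C and D, and the helper names are hypothetical. The structure is the point: one connection per worker, one query per connection, results appended in branch order.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the four branches A, B, C and D.
BRANCHES = [
    f"SELECT id, grp FROM items WHERE grp = '{g}' ORDER BY id" for g in "abcd"
]


def make_demo_db(path):
    """Create a small demo table (assumption: stands in for the real tables)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (id INTEGER, grp TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(i, "abcd"[i % 4]) for i in range(20)],
    )
    conn.commit()
    conn.close()


def run_branch(path, sql):
    """Run one branch on its own connection, opened inside the worker thread."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def run_parallel(path, branches):
    """Run every branch concurrently and append the results in branch order."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # map() returns results in the order of `branches`, not completion order.
        parts = list(pool.map(lambda sql: run_branch(path, sql), branches))
    return [row for part in parts for row in part]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        make_demo_db(path)
        return run_parallel(path, BRANCHES)
```

Calling `demo()` returns the 20 demo rows, grouped by branch. Each worker gets its own connection because a single connection generally should not be shared between concurrent queries.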
However, it will almost always be simpler and more efficient to do the UNION in the database.
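To illustrate why the in-database version is simpler, here is a small sketch comparing the two. It is in Python with `sqlite3` as a stand-in for SQL Server; the `orders` table, the two branch queries and the function names are all hypothetical. It says nothing about relative speed, only that one statement replaces the client-side bookkeeping.

```python
import sqlite3

# Hypothetical branches; in the question these would be A, B, C and D.
BRANCH_A = "SELECT id, region FROM orders WHERE region = 'north'"
BRANCH_B = "SELECT id, region FROM orders WHERE region = 'south'"


def build_demo():
    """In-memory demo database (assumption: stands in for the real tables)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "north"), (2, "south"), (3, "north"), (4, "east")],
    )
    return conn


def union_in_database(conn):
    """One round trip: the database combines the branches itself."""
    return conn.execute(f"{BRANCH_A} UNION ALL {BRANCH_B}").fetchall()


def union_in_client(conn):
    """One round trip per branch, with the results appended on the client."""
    return conn.execute(BRANCH_A).fetchall() + conn.execute(BRANCH_B).fetchall()
```

Both functions return the same set of rows. Row order is not guaranteed by UNION ALL without an ORDER BY, so compare the results after sorting.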
Reading from the same pool of tables neither promotes nor prevents parallelism: readers take shared locks, which don't block other readers. The data will also be cached, so less I/O is needed.
If each UNION clause takes too long, then you have other problems, such as poor indexes, stale or missing statistics, too little RAM, badly structured queries, tempdb issues, and many other possibilities.
tl;dr
Fix the queries. Don't work around them.