I have a query with 7 inner joins (because a lot of the information is spread across other tables), and a few coworkers have been surprised. Should they be surprised, or is having 7 inner joins normal?
14 Answers
#1
25
It's not unheard of, but I would place it in a view for ease of use and maintenance.
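As a rough illustration of the view idea, here is a minimal sketch using Python's built-in `sqlite3` module. The tables (`customers`, `products`, `orders`) and the view name `order_details` are made up for the example, and it uses two joins rather than seven to stay short; the same pattern applies to a longer join chain.

```python
import sqlite3

# Hypothetical schema, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customers(id),
                            product_id  INTEGER REFERENCES products(id));

    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO products  VALUES (10, 'Keyboard'), (11, 'Monitor');
    INSERT INTO orders    VALUES (100, 1, 10), (101, 2, 11), (102, 1, 11);

    -- The joins are written once, inside the view.
    CREATE VIEW order_details AS
    SELECT o.id AS order_id, c.name AS customer, p.title AS product
    FROM orders o
    INNER JOIN customers c ON c.id = o.customer_id
    INNER JOIN products  p ON p.id = o.product_id;
""")

# Callers query the view like a table and never repeat the join logic.
rows = conn.execute(
    "SELECT order_id, product FROM order_details "
    "WHERE customer = ? ORDER BY order_id",
    ("Ada",),
).fetchall()
print(rows)  # [(100, 'Keyboard'), (102, 'Monitor')]
```

A view like this does not by itself make the query faster; it mainly keeps the join logic in one place.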
#2
17
Two questions:
- Does it work?
- Can you explain it?
If so, then seven is fine. If you can't explain the query, then seven is too many.
#3
4
Depending on what you are trying to accomplish, a large number of joins in a query is not remarkable.
Personally, I would be less concerned with the number of joins employed to return a desired result set and more concerned with whether the query is optimized and running within acceptable parameters.
If the query is fully optimized and cannot be trimmed down, but still does not execute quickly enough, then the data structure design may not be the right fit for what you're trying to do. At that point you can re-evaluate either what you're trying to accomplish or the structure of the data that feeds your business model.
#5
3
It's not at all unusual. With a system like Siebel it's common to see join counts in double figures.
#6
3
Seven joins make the query harder to read, but performance and scalability matter more. If those are OK, go for it.
#7
2
It's probably not normal but it's certainly not excessive. If you find yourself joining the same tables over and over, create some views.
#8
2
The number of joins depends on your data model; 7 joins can be in your query if that is what you are querying for. I recall having similar queries in an app I worked on a long time ago. Performance depends on many factors (table size, indexes, server load, and server configuration, to name a few), and I do not think it can be generalized that 7 joins are bad.
If it works for you, then I guess it's fine :D
#9
2
Yes, it's normal - but, really, it's not such a great idea from a performance perspective. Since query plans are built on estimated costs, estimation errors grow as you add joins (or any other operator, for that matter):
The SQL Server Query Optimizer will estimate a minimum of one row coming out of a seek operator. This is done to avoid the case when a very expensive subtree is picked due to a cardinality underestimation. If the subtree is estimated to return zero rows, many plans cost about the same and there can be errors in plan selection as a result. So, you’ll notice that the estimation is “high” for this case, and some errors could result. You also might notice that we estimate 20 executions of this branch instead of the actual 10. However, given the number of joins that have been evaluated before this operator, being off by a factor of 2 (10 rows) isn’t considered to be too bad. (Errors can increase exponentially with the number of joins.)
Also, the optimizer attempts to balance the time required to come up with a plan against the potential savings - it won't spend all day trying to find the optimal plan. The more joins there are, the more alternative plans exist, and some of the better ones may never be considered because the optimizer runs out of time.
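The remark that errors "can increase exponentially with the number of joins" can be made concrete with a deliberately simplified model. The sketch below assumes every join's row estimate is off by the same factor and that the errors multiply; that is an assumption for illustration, not a description of how SQL Server actually computes its estimates. The function name `worst_case_error` is made up.

```python
def worst_case_error(per_join_factor: float, joins: int) -> float:
    """Compounded estimation error if each join is off by the same factor.

    Simplified model: errors are assumed to multiply from join to join.
    """
    return per_join_factor ** joins


# With each join off by a factor of 2, the error doubles per join.
for joins in (1, 3, 7):
    print(joins, worst_case_error(2.0, joins))
# 1 2.0
# 3 8.0
# 7 128.0
```

In practice errors can also cancel out, so this is closer to an upper bound for the assumed factor than a prediction.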
#10
2
7 or even more is not at all unusual in data warehouses, where a fact table could easily have foreign keys to a dozen dimensions. In the data warehouse scenario, the cardinality of the dimensions is usually low compared to the fact table, so filters on the dimensions let the fact table be accessed through an index seek or scan.
For a normalized transactional schema, it is not usually a problem if the cardinality of the result set is low in the primary base table (e.g. select everything about one customer), because the joins on the foreign keys can normally be resolved with index seeks or index scans.
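One way to check whether a join is being resolved through an index is to look at the query plan. The sketch below uses SQLite through Python's `sqlite3` module purely because it is easy to run; the tables and the index name `idx_orders_customer` are hypothetical, and other engines (SQL Server, Oracle, DB2) expose their plans through their own tools and wording. In SQLite's plan output, `SEARCH` roughly corresponds to a seek and `SCAN` to a scan.

```python
import sqlite3

# Hypothetical two-table schema with an index on the foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         total REAL);
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

# "Everything about one customer": a low-cardinality filter on the base table.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT c.name, o.id, o.total
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    WHERE c.id = ?
""", (1,)).fetchall()

# The last column of each plan row is a human-readable description.
details = [row[-1] for row in plan]
for line in details:
    print(line)
```

The exact text of the plan lines varies between SQLite versions, so the output is not reproduced here; the thing to look for is the index name appearing in the line for `orders`.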
#11
1
7 is fine if your database design requires it. However, if 7 is necessary to achieve your goal, I'd re-examine the database design to make sure this level of obscurity is really necessary.
Out of curiosity, is this DB2? Just a pattern I've noticed :)
#12
1
Is this 7 inner joins on the same table, 7 inner joins on different tables, or 7 nested inner joins?
...trick question! It really doesn't matter: if that is what your database structure requires, then it is correct.
Caveat: if it is 7 nested inner joins on the same table, you probably have a poorly structured table ;-)
#13
0
I think what you want to avoid is a join depth greater than 7. Seven inner joins nested fewer than 7 levels deep certainly isn't unheard of, but sometimes people hear "7 joins" and think the no-no is the number of joins rather than the depth.
#14
0
It's certainly not unusual. However, at least in Oracle, 7 is a special number: with any more than that, the optimizer can no longer test every join order (due to factorial growth in the number of possibilities). So it would be wise to avoid going over 7 unless you're prepared to babysit your execution plan.
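To get a feel for the factorial growth mentioned here, the sketch below counts the possible orderings of n tables as n!. This is only the simple ordering count; it ignores other plan shapes and any pruning a real optimizer performs, and it makes no claim about Oracle's actual internal limits. The helper name `join_orders` is made up.

```python
from math import factorial


def join_orders(tables: int) -> int:
    """Number of ways to order `tables` tables in a join sequence (n!)."""
    return factorial(tables)


for n in (4, 7, 8, 10):
    print(n, join_orders(n))
# 4 24
# 7 5040
# 8 40320
# 10 3628800
```

Going from 7 to 8 tables multiplies the count by 8, which is why exhaustively testing every order stops being practical fairly quickly.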