SQL Server: Two COUNTs in one query multiply with one another in the output

Time: 2021-08-16 23:07:18

I have a query that is used to display information in a queue. Part of that information is the number of child entities (packages and labs) that belong to the parent entity (change). However, instead of showing the individual count of each type of child, the two counts are multiplied by one another.

In the case below, there are supposed to be 3 labs and 18 packages; instead the two counts multiply with one another and the output is 54 for each.

Below is the offending portion of the query.

SELECT cef.ChangeId, COUNT(pac.PackageId) AS 'Packages', COUNT(lab.LabRequestId) AS 'Labs'
FROM dbo.ChangeEvaluationForm cef
LEFT JOIN dbo.Lab
ON cef.ChangeId = Lab.ChangeId
LEFT JOIN dbo.Package pac 
ON (cef.ChangeId = pac.ChangeId AND pac.PackageStatus != 6 AND pac.PackageStatus !=7)
WHERE cef.ChangeId = 255
GROUP BY cef.ChangeId

I feel like this is obvious, but it's not occurring to me how to fix it so that the two counts are independent of one another, as I think they should be. None of my research has turned up a scenario like this either. Can anyone point me in the right direction?

3 solutions

#1


2  

Because each LEFT JOIN multiplies the source rows: every lab row is paired with every package row of the same change, so in effect you get a cross join between the two child tables here.

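SELECT cef.ChangeId, p.Packages, l.Labs
FROM dbo.ChangeEvaluationForm cef
-- Each APPLY counts one child table on its own and returns exactly one row per change
OUTER APPLY (
  SELECT COUNT(*) AS Labs
  FROM dbo.Lab
  WHERE cef.ChangeId = Lab.ChangeId
) l
OUTER APPLY (
  SELECT COUNT(*) AS Packages
  FROM dbo.Package pac
  WHERE cef.ChangeId = pac.ChangeId AND pac.PackageStatus != 6 AND pac.PackageStatus != 7
) p
WHERE cef.ChangeId = 255

GROUP BY is not needed now: each OUTER APPLY already returns a single row per change, and keeping GROUP BY cef.ChangeId would raise an error because p.Packages and l.Labs are neither aggregated nor grouped.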

#2


1  

From your question it is difficult to tell what result you expect from your query, so I presume you want the following result:

+----------+----------+------+
| ChangeId | Packages | Labs |
+----------+----------+------+
|      255 |       18 |    3 |
+----------+----------+------+

Try the query below if that is the result you are looking for.

SELECT cef.ChangeId, ISNULL(pac.PacCount, 0) AS 'Packages', ISNULL(Lab.LabCount, 0) AS 'Labs'
FROM dbo.ChangeEvaluationForm cef
-- Count labs per change before joining (the original GROUP BY was missing its column)
LEFT JOIN (SELECT Lab.ChangeId, COUNT(*) LabCount FROM dbo.Lab GROUP BY Lab.ChangeId) Lab
ON cef.ChangeId = Lab.ChangeId
-- Count packages per change before joining, excluding statuses 6 and 7
LEFT JOIN (SELECT pac.ChangeId, COUNT(*) PacCount FROM dbo.Package pac WHERE pac.PackageStatus != 6 AND pac.PackageStatus != 7 GROUP BY pac.ChangeId) pac
ON cef.ChangeId = pac.ChangeId
WHERE cef.ChangeId = 255

Query Explanation:

  • In your query the counts are computed only after both joins, so each COUNT returns 54, the size of the Cartesian product of the 3 labs and 18 packages.
  • In this query I group by ChangeId and compute each aggregate before joining the tables, so the 3 labs and 18 packages are counted before the join.
  • You will also notice that I have moved the PackageStatus filter inside the pac subquery, before its GROUP BY, so unwanted records won't affect the count.

#3


0  

You start with a particular ChangeId from the dbo.ChangeEvaluationForm table (ChangeId = 255 in your example), then join to the dbo.Lab table. This join takes your result from 1 row to 3, since there are 3 labs with ChangeId = 255. The problem is the next join: all 3 rows resulting from the previous join are joined with the dbo.Package table, which has 18 matching rows for ChangeId = 255. The join therefore yields 3 x 18 = 54 rows, and COUNT over pac.PackageId and lab.LabRequestId each returns 54.
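
The row multiplication described above can be reproduced in isolation. The following is a minimal sketch, not the asker's real schema: the table variables @Change, @Lab and @Package are hypothetical stand-ins that carry only the columns named in the question, and it uses 2 packages instead of 18 to keep the data short.

```sql
-- Hypothetical miniature of the schema (assumed, not from the original post)
DECLARE @Change  TABLE (ChangeId int);
DECLARE @Lab     TABLE (LabRequestId int, ChangeId int);
DECLARE @Package TABLE (PackageId int, ChangeId int);

INSERT INTO @Change  VALUES (255);
INSERT INTO @Lab     VALUES (1, 255), (2, 255), (3, 255);  -- 3 labs
INSERT INTO @Package VALUES (10, 255), (11, 255);          -- 2 packages

-- Without aggregation: every lab row is paired with every package row
SELECT c.ChangeId, l.LabRequestId, p.PackageId
FROM @Change c
LEFT JOIN @Lab l     ON l.ChangeId = c.ChangeId
LEFT JOIN @Package p ON p.ChangeId = c.ChangeId;
-- returns 3 x 2 = 6 rows

-- With aggregation: both counts see all 6 joined rows
SELECT c.ChangeId,
       COUNT(p.PackageId)    AS Packages,
       COUNT(l.LabRequestId) AS Labs
FROM @Change c
LEFT JOIN @Lab l     ON l.ChangeId = c.ChangeId
LEFT JOIN @Package p ON p.ChangeId = c.ChangeId
GROUP BY c.ChangeId;
-- Packages = 6, Labs = 6
```

Running the unaggregated SELECT first is a quick way to see which join inflates the row count before any COUNT is applied.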

To get what you want, there are 2 easy solutions:

  • Use COUNT(DISTINCT ...) instead of COUNT. This counts only the distinct values of pac.PackageId and lab.LabRequestId, not the repeated ones.
  • Split the joins into 2 subqueries and join their results (by ChangeId).
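
The first option could look roughly like the sketch below. It reuses the table and column names from the question and assumes that PackageId and LabRequestId uniquely identify rows in dbo.Package and dbo.Lab; if they do not, the distinct counts would undercount. The lab alias and the NOT IN form of the status filter are choices made here for readability, not taken from the original query.

```sql
-- Sketch: same joins as the question, but each COUNT ignores repeated ids.
-- Assumes PackageId and LabRequestId are unique keys of their tables.
SELECT cef.ChangeId,
       COUNT(DISTINCT pac.PackageId)    AS Packages,
       COUNT(DISTINCT lab.LabRequestId) AS Labs
FROM dbo.ChangeEvaluationForm cef
LEFT JOIN dbo.Lab lab
  ON cef.ChangeId = lab.ChangeId
LEFT JOIN dbo.Package pac
  ON cef.ChangeId = pac.ChangeId
 AND pac.PackageStatus NOT IN (6, 7)  -- equivalent to != 6 AND != 7
WHERE cef.ChangeId = 255
GROUP BY cef.ChangeId;
```

Note that the join still produces the 54 intermediate rows and only the counting changes, so on large tables the pre-aggregated subqueries from the second option may be the cheaper choice.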
