T-SQL GROUP BY: Best way to include other grouped columns

I'm a MySQL user who is trying to port some things over to MS SQL Server.

I'm joining a couple of tables, and aggregating some of the columns via GROUP BY.

A simple example would be employees and projects:

select empID, fname, lname, title, dept, count(projectID)
from employees E left join projects P on E.empID = P.projLeader
group by empID

...that would work in MySQL, but MS SQL is stricter and requires that everything is either enclosed in an aggregate function or is part of the GROUP BY clause.

So, of course, in this simple example, I assume I could just include the extra columns in the group by clause. But the actual query I'm dealing with is pretty complicated, and includes a bunch of operations performed on some of the non-aggregated columns... i.e., it would get REALLY ugly to try to include all of them in the group by clause.

So is there a better way to do this?

5 Solutions

#1

You can get it to work with something along these lines:

select e.empID, fname, lname, title, dept, projectIDCount
from
(
   select empID, count(projectID) as projectIDCount
   from employees E left join projects P on E.empID = P.projLeader
   group by empID
) idList
inner join employees e on idList.empID = e.empID

This way you avoid grouping by the extra columns, and you can still select any data you want from employees. You also have a better chance of making good use of indexes in some scenarios (if you are not returning the full info), and the query combines better with paging.

#2

"it would get REALLY ugly to try to include all of them in the group by clause."

Yup - that's the only way to do it * - just copy and paste the non-aggregated columns into the group by clause, remove the aliases and that's as good as it gets...

*you could wrap it in a nested SELECT but that's probably just as ugly...
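
For reference, here is a rough sketch of both options applied to the question's example. The alias projectCount, the derived-table alias X, and the fullName expression are made up for illustration (the latter assumes fname and lname are character columns); the real query will differ.

```sql
-- Option 1: repeat every non-aggregated select-list column in GROUP BY.
select E.empID, E.fname, E.lname, E.title, E.dept,
       count(P.projectID) as projectCount   -- illustrative alias
from employees E
left join projects P on E.empID = P.projLeader
group by E.empID, E.fname, E.lname, E.title, E.dept;

-- Option 2: compute the expressions once in a nested SELECT,
-- then group by their aliases in the outer query.
select X.empID, X.fullName, X.title, X.dept,
       count(P.projectID) as projectCount
from (
   select empID,
          fname + ' ' + lname as fullName,  -- hypothetical expression
          title, dept
   from employees
) X
left join projects P on X.empID = P.projLeader
group by X.empID, X.fullName, X.title, X.dept;
```

The second form keeps each expression in one place, so the GROUP BY list only repeats short aliases rather than the full operations.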

#3

MySQL is unusual - and technically not compliant with the SQL standard - in allowing you to omit items from the GROUP BY clause. In standard SQL, each non-aggregate column in the select-list must be listed in full in the GROUP BY clause (either by name or by ordinal number, but that is deprecated).

(Oh, although MySQL is unusual, it is nice that it allows the shorthand.)

#4

You do not need the join in the subquery: there is no need to group by empID from employees when you can group by the projLeader field from projects.

With the inner join (as written below), you get only the employees who lead at least one project. If you want a list of all employees, just change it to a left join.

  select e.empID, e.fname, e.lname, e.title, e.dept, p.projectIDCount
    from employees e 
   inner join ( select projLeader, count(*) as projectIDCount
                  from projects
                 group by projLeader
              ) p on p.projLeader = e.empID
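
As a sketch of the left join variant mentioned above: employees with no projects have no matching row in the subquery, so projectIDCount would come back as NULL. Wrapping it in coalesce to show 0 instead is an addition here, not part of the original query.

```sql
-- Left join variant: keeps employees that lead no projects.
select e.empID, e.fname, e.lname, e.title, e.dept,
       coalesce(p.projectIDCount, 0) as projectIDCount  -- NULL becomes 0
from employees e
left join ( select projLeader, count(*) as projectIDCount
            from projects
            group by projLeader
          ) p on p.projLeader = e.empID;
```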

#5

A subquery in the select clause might also be suitable. It would work for the example given but might not for the actual complicated query you are dealing with.

select
        e.empID, fname, lname, title, dept
        , (select count(*) from projects p where p.projLeader = e.empID) as projectCount
from employees e
