如何组合GROUP BY和ROW_NUMBER?

时间:2022-09-03 15:30:37

I hope following sample code is self-explanatory:

我希望以下示例代码不言自明:

declare @t1 table (ID int,Price money, Name varchar(10))
declare @t2 table (ID int,Orders int,  Name varchar(10))
declare @relation  table (t1ID int,t2ID int)
insert into @t1 values(1, 200, 'AAA');
insert into @t1 values(2, 150, 'BBB');
insert into @t1 values(3, 100, 'CCC');
insert into @t2 values(1,25,'aaa');
insert into @t2 values(2,35,'bbb');
insert into @relation values(1,1);
insert into @relation values(2,1);
insert into @relation values(3,2);

select T2.ID AS T2ID
,T2.Name as T2Name
,T2.Orders
,T1.ID AS T1ID
,T1.Name As T1Name
,T1Sum.Price
FROM @t2 T2
INNER JOIN (
    SELECT Rel.t2ID
        ,MAX(Rel.t1ID)AS t1ID 
-- the MAX returns an arbitrary ID, what i need is: 
--      ,ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
        ,SUM(Price)AS Price
        FROM @t1 T1 
        INNER JOIN @relation Rel ON Rel.t1ID=T1.ID
        GROUP BY Rel.t2ID
)AS T1Sum ON  T1Sum.t2ID = T2.ID
INNER JOIN @t1 T1 ON T1Sum.t1ID=T1.ID

Result:

结果:

T2ID   T2Name   Orders  T1ID    T1Name  Price     
 1      aaa       25     2       BBB    350,00     
 2      bbb       35     3       CCC    100,00

What i need is commented above, a way to get the ROW_NUMBER but also to Group By in the first place. So i need the sum of all T1-prices grouped by T2.ID in the relation-table and in the outer query the t1ID with the highest price.

我需要的是上面提到的,一种获取ROW_NUMBER的方法,也是首先获得Group By的方法。所以我需要在关系表中按T2.ID分组的所有T1价格和在外部查询中具有最高价格的t1ID的总和。

In other words: How to change MAX(Rel.t1ID)AS t1ID to somewhat returning the ID with the highest price?

换句话说:如何将MAX(Rel.t1ID)AS t1ID更改为稍微返回具有最高价格的ID?

So the desired result is(notice that first T1ID changed from 2 to 1 since it has the higher price):

所以期望的结果是(注意第一个T1ID从2变为1,因为它具有更高的价格):

T2ID   T2Name   Orders  T1ID    T1Name  Price     
 1      aaa       25     1       AAA    350,00     
 2      bbb       35     3       CCC    100,00

Note: in case you're wondering why i don't multiply Orders with Price: they are not realated(so i should have left off this column since it's a bit ambiguous, please ignore it, i've just added it to make all less abstract). Actually Orders must remain unchanged, that's the reason for the sub-query approach to join both and the reason why i need to group by in the first place.

注意:如果你想知道为什么我不会将订单与价格相乘:它们不是realated(所以我应该放弃这个专栏,因为它有点含糊不清,请忽略它,我刚刚添加它以使所有不那么抽象)。实际上订单必须保持不变,这就是子查询方法加入两者的原因以及我需要首先分组的原因。

Conclusion: obviously the core of my question can be answered by the OVER clause that can be applied to any aggregate function like SUM(see Damien's answer) what was new to me. Thank you all for your working approaches.

结论:显然我的问题的核心可以通过OVER子句来回答,该子句可以应用于任何聚合函数,例如SUM(参见Damien的答案)对我来说什么是新的。谢谢大家的工作方法。

4 个解决方案

#1


55  

Wow, the other answers look complex - so I'm hoping I've not missed something obvious.

哇,其他答案看起来很复杂 - 所以我希望我没有错过任何明显的东西。

You can use OVER/PARTITION BY against aggregates, and they'll then do grouping/aggregating without a GROUP BY clause. So I just modified your query to:

您可以对聚合使用OVER / PARTITION BY,然后它们将在没有GROUP BY子句的情况下进行分组/聚合。所以我只是将您的查询修改为:

select T2.ID AS T2ID
    ,T2.Name as T2Name
    ,T2.Orders
    ,T1.ID AS T1ID
    ,T1.Name As T1Name
    ,T1Sum.Price
FROM @t2 T2
INNER JOIN (
    SELECT Rel.t2ID
        ,Rel.t1ID
 --       ,MAX(Rel.t1ID)AS t1ID 
-- the MAX returns an arbitrary ID, what i need is: 
      ,ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
        ,SUM(Price)OVER(PARTITION BY Rel.t2ID) AS Price
        FROM @t1 T1 
        INNER JOIN @relation Rel ON Rel.t1ID=T1.ID
--        GROUP BY Rel.t2ID
)AS T1Sum ON  T1Sum.t2ID = T2.ID
INNER JOIN @t1 T1 ON T1Sum.t1ID=T1.ID
where t1Sum.PriceList = 1

Which gives the requested result set.

这给出了请求的结果集。

#2


3  

;with C as
(
  select Rel.t2ID,
         Rel.t1ID,
         t1.Price,
         row_number() over(partition by Rel.t2ID order by t1.Price desc) as rn
  from @t1 as T1
    inner join @relation as Rel
      on T1.ID = Rel.t1ID
)
select T2.ID as T2ID,
       T2.Name as T2Name,
       T2.Orders,
       T1.ID as T1ID,
       T1.Name as T1Name,
       T1Sum.Price
from @t2 as T2
  inner join (
              select C1.t2ID,
                     sum(C1.Price) as Price,
                     C2.t1ID
              from C as C1
                inner join C as C2 
                  on C1.t2ID = C2.t2ID and
                     C2.rn = 1
              group by C1.t2ID, C2.t1ID
             ) as T1Sum
    on T2.ID = T1Sum.t2ID
  inner join @t1 as T1
    on T1.ID = T1Sum.t1ID

#3


3  

Undoubtly this can be simplified but the results match your expectations.

毫无疑问,这可以简化,但结果符合您的期望。

The gist of this is to

这个的要点是

  • Calculate the maximum price in a seperate CTE for each t2ID
  • 计算每个t2ID的单独CTE的最高价格
  • Calculate the total price in a seperate CTE for each t2ID
  • 计算每个t2ID的单独CTE的总价格
  • Combine the results of both CTE's
  • 结合两个CTE的结果

SQL Statement

SQL语句

;WITH MaxPrice AS ( 
    SELECT  t2ID
            , t1ID
    FROM    (       
                SELECT  t2.ID AS t2ID
                        , t1.ID AS t1ID
                        , rn = ROW_NUMBER() OVER (PARTITION BY t2.ID ORDER BY t1.Price DESC)
                FROM    @t1 t1
                        INNER JOIN @relation r ON r.t1ID = t1.ID        
                        INNER JOIN @t2 t2 ON t2.ID = r.t2ID
            ) maxt1
    WHERE   maxt1.rn = 1                            
)
, SumPrice AS (
    SELECT  t2ID = t2.ID
            , Price = SUM(Price)
    FROM    @t1 t1
            INNER JOIN @relation r ON r.t1ID = t1.ID
            INNER JOIN @t2 t2 ON t2.ID = r.t2ID
    GROUP BY
            t2.ID           
)           
SELECT  t2.ID
        , t2.Name
        , t2.Orders
        , mp.t1ID
        , t1.ID
        , t1.Name
        , sp.Price
FROM    @t2 t2
        INNER JOIN MaxPrice mp ON mp.t2ID = t2.ID
        INNER JOIN SumPrice sp ON sp.t2ID = t2.ID
        INNER JOIN @t1 t1 ON t1.ID = mp.t1ID

#4


2  

The deduplication (to select the max T1) and the aggregation need to be done as distinct steps. I've used a CTE since I think this makes it clearer:

重复数据删除(选择最大T1)和聚合需要作为不同的步骤完成。我已经使用了CTE,因为我认为这使它更清晰:

;WITH sumCTE
AS
(
    SELECT  Rel.t2ID, SUM(Price) price
    FROM    @t1         AS T1
    JOIN    @relation   AS Rel 
    ON      Rel.t1ID=T1.ID
    GROUP 
    BY      Rel.t2ID
)
,maxCTE
AS
(
    SELECT  Rel.t2ID, Rel.t1ID, 
            ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
    FROM    @t1         AS T1
    JOIN    @relation   AS Rel 
    ON      Rel.t1ID=T1.ID
)
SELECT T2.ID AS T2ID
,T2.Name as T2Name
,T2.Orders
,T1.ID AS T1ID
,T1.Name As T1Name
,sumT1.Price
FROM    @t2 AS T2
JOIN    sumCTE AS sumT1
ON      sumT1.t2ID = t2.ID
JOIN    maxCTE AS maxT1
ON      maxT1.t2ID = t2.ID
JOIN    @t1 AS T1
ON      T1.ID = maxT1.t1ID
WHERE   maxT1.PriceList = 1

#1


55  

Wow, the other answers look complex - so I'm hoping I've not missed something obvious.

哇,其他答案看起来很复杂 - 所以我希望我没有错过任何明显的东西。

You can use OVER/PARTITION BY against aggregates, and they'll then do grouping/aggregating without a GROUP BY clause. So I just modified your query to:

您可以对聚合使用OVER / PARTITION BY,然后它们将在没有GROUP BY子句的情况下进行分组/聚合。所以我只是将您的查询修改为:

select T2.ID AS T2ID
    ,T2.Name as T2Name
    ,T2.Orders
    ,T1.ID AS T1ID
    ,T1.Name As T1Name
    ,T1Sum.Price
FROM @t2 T2
INNER JOIN (
    SELECT Rel.t2ID
        ,Rel.t1ID
 --       ,MAX(Rel.t1ID)AS t1ID 
-- the MAX returns an arbitrary ID, what i need is: 
      ,ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
        ,SUM(Price)OVER(PARTITION BY Rel.t2ID) AS Price
        FROM @t1 T1 
        INNER JOIN @relation Rel ON Rel.t1ID=T1.ID
--        GROUP BY Rel.t2ID
)AS T1Sum ON  T1Sum.t2ID = T2.ID
INNER JOIN @t1 T1 ON T1Sum.t1ID=T1.ID
where t1Sum.PriceList = 1

Which gives the requested result set.

这给出了请求的结果集。

#2


3  

;with C as
(
  select Rel.t2ID,
         Rel.t1ID,
         t1.Price,
         row_number() over(partition by Rel.t2ID order by t1.Price desc) as rn
  from @t1 as T1
    inner join @relation as Rel
      on T1.ID = Rel.t1ID
)
select T2.ID as T2ID,
       T2.Name as T2Name,
       T2.Orders,
       T1.ID as T1ID,
       T1.Name as T1Name,
       T1Sum.Price
from @t2 as T2
  inner join (
              select C1.t2ID,
                     sum(C1.Price) as Price,
                     C2.t1ID
              from C as C1
                inner join C as C2 
                  on C1.t2ID = C2.t2ID and
                     C2.rn = 1
              group by C1.t2ID, C2.t1ID
             ) as T1Sum
    on T2.ID = T1Sum.t2ID
  inner join @t1 as T1
    on T1.ID = T1Sum.t1ID

#3


3  

Undoubtly this can be simplified but the results match your expectations.

毫无疑问,这可以简化,但结果符合您的期望。

The gist of this is to

这个的要点是

  • Calculate the maximum price in a seperate CTE for each t2ID
  • 计算每个t2ID的单独CTE的最高价格
  • Calculate the total price in a seperate CTE for each t2ID
  • 计算每个t2ID的单独CTE的总价格
  • Combine the results of both CTE's
  • 结合两个CTE的结果

SQL Statement

SQL语句

;WITH MaxPrice AS ( 
    SELECT  t2ID
            , t1ID
    FROM    (       
                SELECT  t2.ID AS t2ID
                        , t1.ID AS t1ID
                        , rn = ROW_NUMBER() OVER (PARTITION BY t2.ID ORDER BY t1.Price DESC)
                FROM    @t1 t1
                        INNER JOIN @relation r ON r.t1ID = t1.ID        
                        INNER JOIN @t2 t2 ON t2.ID = r.t2ID
            ) maxt1
    WHERE   maxt1.rn = 1                            
)
, SumPrice AS (
    SELECT  t2ID = t2.ID
            , Price = SUM(Price)
    FROM    @t1 t1
            INNER JOIN @relation r ON r.t1ID = t1.ID
            INNER JOIN @t2 t2 ON t2.ID = r.t2ID
    GROUP BY
            t2.ID           
)           
SELECT  t2.ID
        , t2.Name
        , t2.Orders
        , mp.t1ID
        , t1.ID
        , t1.Name
        , sp.Price
FROM    @t2 t2
        INNER JOIN MaxPrice mp ON mp.t2ID = t2.ID
        INNER JOIN SumPrice sp ON sp.t2ID = t2.ID
        INNER JOIN @t1 t1 ON t1.ID = mp.t1ID

#4


2  

The deduplication (to select the max T1) and the aggregation need to be done as distinct steps. I've used a CTE since I think this makes it clearer:

重复数据删除(选择最大T1)和聚合需要作为不同的步骤完成。我已经使用了CTE,因为我认为这使它更清晰:

;WITH sumCTE
AS
(
    SELECT  Rel.t2ID, SUM(Price) price
    FROM    @t1         AS T1
    JOIN    @relation   AS Rel 
    ON      Rel.t1ID=T1.ID
    GROUP 
    BY      Rel.t2ID
)
,maxCTE
AS
(
    SELECT  Rel.t2ID, Rel.t1ID, 
            ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
    FROM    @t1         AS T1
    JOIN    @relation   AS Rel 
    ON      Rel.t1ID=T1.ID
)
SELECT T2.ID AS T2ID
,T2.Name as T2Name
,T2.Orders
,T1.ID AS T1ID
,T1.Name As T1Name
,sumT1.Price
FROM    @t2 AS T2
JOIN    sumCTE AS sumT1
ON      sumT1.t2ID = t2.ID
JOIN    maxCTE AS maxT1
ON      maxT1.t2ID = t2.ID
JOIN    @t1 AS T1
ON      T1.ID = maxT1.t1ID
WHERE   maxT1.PriceList = 1