SQL - How to group by ID and identify the column with highest value?

Time: 2021-02-14 09:17:39

I have a SQL challenge which I need a little help with.

Below is a simplified example; in my real case I have about 500k rows in a slow VIEW, so an efficient solution would be especially appreciated. I think I have to use GROUP BY in one way or another, but I'm not sure.

Let's say I have a table like this:

╔═════════╦══════════╦══════════╦═══════╗
║ ORDERID ║   NAME   ║   TYPE   ║ PRICE ║
╠═════════╬══════════╬══════════╬═══════╣
║       1 ║ Broccoli ║ Food     ║ 1     ║
║       1 ║ Beer     ║ Beverage ║ 5     ║
║       1 ║ Coke     ║ Beverage ║ 2     ║
║       2 ║ Beef     ║ Food     ║ 2.5   ║
║       2 ║ Juice    ║ Beverage ║ 1.5   ║
║       3 ║ Beer     ║ Beverage ║ 5     ║
║       4 ║ Tomato   ║ Food     ║ 1     ║
║       4 ║ Apple    ║ Food     ║ 1     ║
║       4 ║ Broccoli ║ Food     ║ 1     ║
╚═════════╩══════════╩══════════╩═══════╝

So what I want to do is:

For each order that contains BOTH food and beverage order lines, I want the beverage with the highest price.

So in this example I would like the following result set:

╔═════════╦═══════╦═══════╗
║ ORDERID ║ NAME  ║ PRICE ║
╠═════════╬═══════╬═══════╣
║       1 ║ Beer  ║ 5     ║
║       2 ║ Juice ║ 1.5   ║
╚═════════╩═══════╩═══════╝

How can I achieve this in an efficient way?

4 solutions

#1


2  

Since you have tagged SQL Server, you can make use of a common table expression (CTE) and window functions.

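;WITH filteredList
AS
(
  -- orders that contain at least one Food line and at least one Beverage line
  SELECT OrderID
  FROM tableName
  WHERE Type IN ('Food','Beverage')
  GROUP BY OrderID
  HAVING COUNT(DISTINCT Type) = 2
),
greatestList
AS
(
    -- rank the beverage lines of those orders, most expensive first
    SELECT  a.OrderID, a.Name, a.Type, a.Price,
            DENSE_RANK() OVER (PARTITION BY a.OrderID
                                ORDER BY a.Price DESC) rn
    FROM tableName  a
          INNER JOIN filteredList b
              ON a.OrderID = b.OrderID
    WHERE a.Type = 'Beverage'
)
-- rn = 1 keeps the top-priced beverage of each order (all of them if several tie)
SELECT  OrderID, Name, Type, Price
FROM    greatestList
WHERE   rn = 1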

#2


3  

You can use a subquery that gets the max(price) of the beverages for each order containing both food and beverage, and then join that back to your table to get the result:

select t1.orderid,
  t1.name,
  t1.price
from yourtable t1
inner join
(
  select max(price) MaxPrice, orderid
  from yourtable t
  where type = 'Beverage'
    and exists (select orderid
                from yourtable o
                where type in ('Food', 'Beverage')
                  and t.orderid = o.orderid
                group by orderid
                having count(distinct type) = 2)
  group by orderid
) t2
  on t1.orderid = t2.orderid
  and t1.price = t2.MaxPrice
  and t1.type = 'Beverage'  -- exclude food lines that happen to cost the same

See SQL Fiddle with Demo

The result is:

| ORDERID |  NAME | PRICE |
---------------------------
|       1 |  Beer |     5 |
|       2 | Juice |   1.5 |
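
If you want to try the join-back idea without a SQL Server instance at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here, and the table name order_lines and the function name are made up for the example. The EXISTS check is simplified to "the order has a Food line", which should be equivalent because the subquery already restricts itself to Beverage rows. The join also filters on type so that a food line with the same price as the top beverage is not returned.

```python
import sqlite3

# Sample rows copied from the question; "order_lines" is a hypothetical table name.
ROWS = [
    (1, "Broccoli", "Food", 1.0),
    (1, "Beer", "Beverage", 5.0),
    (1, "Coke", "Beverage", 2.0),
    (2, "Beef", "Food", 2.5),
    (2, "Juice", "Beverage", 1.5),
    (3, "Beer", "Beverage", 5.0),
    (4, "Tomato", "Food", 1.0),
    (4, "Apple", "Food", 1.0),
    (4, "Broccoli", "Food", 1.0),
]

QUERY = """
SELECT t1.orderid, t1.name, t1.price
FROM order_lines AS t1
INNER JOIN (
    -- highest beverage price per order, only for orders that also contain food
    SELECT t.orderid, MAX(t.price) AS max_price
    FROM order_lines AS t
    WHERE t.type = 'Beverage'
      AND EXISTS (SELECT 1
                  FROM order_lines AS o
                  WHERE o.orderid = t.orderid
                    AND o.type = 'Food')
    GROUP BY t.orderid
) AS t2
  ON t1.orderid = t2.orderid
 AND t1.price = t2.max_price
 AND t1.type = 'Beverage'   -- keep equally priced food rows out
ORDER BY t1.orderid
"""


def highest_beverage_per_mixed_order(rows):
    """Load rows into an in-memory SQLite table and run the join-back query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE order_lines "
            "(orderid INTEGER, name TEXT, type TEXT, price REAL)"
        )
        conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = highest_beverage_per_mixed_order(ROWS)
print(result)
```

Note that if two beverages in the same order tie for the highest price, this approach returns both of them.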

#3


2  

This is relational division: link 1, link 2.

If the divisor table (only food and beverage) is static, then you could use one of these solutions:

DECLARE @OrderDetail TABLE 
    ([OrderID] int, [Name] varchar(8), [Type] varchar(8), [Price] decimal(10,2))
;

INSERT INTO @OrderDetail
    ([OrderID], [Name], [Type], [Price])
SELECT 1, 'Broccoli', 'Food', 1.0
UNION ALL SELECT 1, 'Beer', 'Beverage', 5.0
UNION ALL SELECT 1, 'Coke', 'Beverage', 2.0
UNION ALL SELECT 2, 'Beef', 'Food', 2.5
UNION ALL SELECT 2, 'Juice', 'Beverage', 1.5
UNION ALL SELECT 3, 'Beer', 'Beverage', 5.0
UNION ALL SELECT 4, 'Tomato', 'Food', 1.0
UNION ALL SELECT 4, 'Apple', 'Food', 1.0
UNION ALL SELECT 4, 'Broccoli', 'Food', 1.0

-- Solution 1: conditional aggregation
-- (literals match the stored casing, so this also works under a case-sensitive collation)
SELECT  od.OrderID, 
        COUNT(DISTINCT od.Type) AS DistinctTypeCount, 
        MAX(CASE WHEN od.Type = 'Beverage' THEN od.Price END) AS MaxBeveragePrice
FROM    @OrderDetail od
WHERE   od.Type IN ('Food', 'Beverage')
GROUP BY od.OrderID
HAVING  COUNT(DISTINCT od.Type) = 2 -- both 'Food' and 'Beverage' are present

-- Solution 2: better performance
SELECT  pvt.OrderID,
        pvt.Food AS MaxFoodPrice,
        pvt.Beverage AS MaxBeveragePrice
FROM (
    SELECT  od.OrderID, od.Type, od.Price
    FROM    @OrderDetail od
    WHERE   od.Type IN ('Food', 'Beverage')
) src
PIVOT ( MAX(src.Price) FOR src.Type IN ([Food], [Beverage]) ) pvt
-- a NULL in either pivoted column means the order lacks that type
WHERE   pvt.Food IS NOT NULL
AND     pvt.Beverage IS NOT NULL

Results (for Solution 1 & 2):

OrderID     DistinctTypeCount MaxBeveragePrice
----------- ----------------- ---------------------------------------
1           2                 5.00
2           2                 1.50

Table 'Worktable'. Scan count 2, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#09DE7BCC'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

OrderID     MaxFoodPrice                            MaxBeveragePrice
----------- --------------------------------------- ---------------------------------------
1           1.00                                    5.00
2           2.50                                    1.50

Table '#09DE7BCC'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
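
The single-pass idea behind these two solutions can also be checked outside SQL Server. Below is a rough sketch that runs the conditional-aggregation form in SQLite through Python's sqlite3 module. The table name order_detail and the function name are assumptions for the example, and SQLite is just a stand-in. SQLite compares strings case-sensitively by default, so the literals use the same casing as the data.

```python
import sqlite3

# Sample rows copied from the question; "order_detail" is a hypothetical table name.
ROWS = [
    (1, "Broccoli", "Food", 1.0),
    (1, "Beer", "Beverage", 5.0),
    (1, "Coke", "Beverage", 2.0),
    (2, "Beef", "Food", 2.5),
    (2, "Juice", "Beverage", 1.5),
    (3, "Beer", "Beverage", 5.0),
    (4, "Tomato", "Food", 1.0),
    (4, "Apple", "Food", 1.0),
    (4, "Broccoli", "Food", 1.0),
]

QUERY = """
SELECT orderid,
       MAX(CASE WHEN type = 'Food' THEN price END)     AS max_food_price,
       MAX(CASE WHEN type = 'Beverage' THEN price END) AS max_beverage_price
FROM order_detail
WHERE type IN ('Food', 'Beverage')
GROUP BY orderid
HAVING COUNT(DISTINCT type) = 2   -- the order has both types
ORDER BY orderid
"""


def max_prices_for_mixed_orders(rows):
    """Return (orderid, max food price, max beverage price) per mixed order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE order_detail "
            "(orderid INTEGER, name TEXT, type TEXT, price REAL)"
        )
        conn.executemany("INSERT INTO order_detail VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


summary = max_prices_for_mixed_orders(ROWS)
print(summary)
```

This form reads the table once, but it only yields the prices. If you also need the beverage name, you still have to join back to the table or use a ranking function.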

#4


1  

If you're using SQL Server 2005 or later, you can use a CTE with the DENSE_RANK function:

WITH CTE 
     AS (SELECT orderid, 
                name, 
                type, 
                price, 
                RN = Dense_rank() 
                       OVER ( 
                         PARTITION BY orderid 
                         ORDER BY CASE WHEN type='Beverage' THEN 0 ELSE 1 END ASC 
                         , price DESC) 
         FROM   dbo.tablename t 
         WHERE  EXISTS(SELECT 1 
                       FROM   dbo.tablename t2 
                       WHERE  t2.orderid = t.orderid 
                              AND type = 'Food') 
         AND    EXISTS(SELECT 1 
                       FROM   dbo.tablename t2 
                       WHERE  t2.orderid = t.orderid 
                              AND type = 'Beverage')) 
SELECT orderid, 
       name, 
       price 
FROM   CTE
WHERE  rn = 1 

Use DENSE_RANK if you want all order lines that tie for the highest price, and ROW_NUMBER if you want exactly one row per order.
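
To make the difference on ties concrete, here is a small sketch that runs both ranking functions in SQLite through Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later). The data is hypothetical: a Wine line is added so that two beverages tie at the highest price, and the table and function names are made up for the example.

```python
import sqlite3

# Hypothetical data: order 1 has two beverages tied at the highest price.
ROWS = [
    (1, "Broccoli", "Food", 1.0),
    (1, "Beer", "Beverage", 5.0),
    (1, "Wine", "Beverage", 5.0),
    (1, "Coke", "Beverage", 2.0),
]


def top_beverages(rank_function):
    """Return the beverage names ranked first by the given window function."""
    # The name is interpolated into the SQL text, so only allow known values.
    if rank_function not in ("DENSE_RANK", "ROW_NUMBER"):
        raise ValueError(rank_function)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE order_lines "
            "(orderid INTEGER, name TEXT, type TEXT, price REAL)"
        )
        conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?)", ROWS)
        sql = f"""
            SELECT name
            FROM (
                SELECT name,
                       {rank_function}() OVER (PARTITION BY orderid
                                               ORDER BY price DESC) AS rn
                FROM order_lines
                WHERE type = 'Beverage'
            ) AS ranked
            WHERE rn = 1
            ORDER BY name
        """
        return [name for (name,) in conn.execute(sql)]
    finally:
        conn.close()


tied = top_beverages("DENSE_RANK")    # both tied beverages get rank 1
single = top_beverages("ROW_NUMBER")  # exactly one row gets number 1
print(tied, single)
```

With ROW_NUMBER, which of the tied rows comes back is arbitrary unless you add a tie-breaker such as the name to the ORDER BY inside the OVER clause.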

DEMO
