
Time: 2021-11-19 12:49:16

I'm using PostgreSQL and need to select the max of each group. The table represents the products sold each day, and I want to know the top-selling product of each day.


SELECT sum(detalle_orden.cantidad) as suma,detalle_orden.producto_id as producto
      ,to_char(date_trunc('day',orden.fecha AT TIME ZONE 'MST'),'DY') as dia
FROM detalle_orden
LEFT JOIN orden ON orden.id = detalle_orden.order_id
GROUP BY orden.fecha,detalle_orden.producto_id 
ORDER BY dia,suma desc

It returns:

suma  producto  dia
4     1         FRI
1     2         FRI
5     3         TUE
2     2         TUE

I want to get:


suma  producto  dia
4     1         FRI
5     3         TUE

Only the top product of each day (with the max(suma) of each group).


I tried different approaches, like subqueries, but the aggregate function makes things a bit difficult.


3 Solutions


You can still use DISTINCT ON to get this done in a single query level without subquery, because DISTINCT is applied after GROUP BY and aggregate functions (and after window functions):


SELECT DISTINCT ON (3)
       sum(d.cantidad) AS suma
     , d.producto_id AS producto
     , to_char(o.fecha AT TIME ZONE 'MST', 'DY') AS dia
FROM   detalle_orden d
LEFT   JOIN orden o ON o.id = d.order_id
GROUP  BY o.fecha, d.producto_id 
ORDER  BY 3, 1 DESC NULLS LAST, d.producto_id;


  • This solution returns exactly one row per dia (if available). If multiple products tie for top sales, my arbitrary (but deterministic and reproducible) pick is the one with the smaller producto_id.
    If you need all peers tying for one day, use rank() as suggested by @Houari.


  • The sequence of events in an SQL SELECT query is explained in this related answer:


  • date_trunc() was just noise in the calculation of dia. I removed it.


  • I added NULLS LAST to the descending sort order, since it is unclear whether suma can be NULL in the result. Without it, NULL values sort first in descending order.

  • The numbers in DISTINCT ON and ORDER BY are ordinal references to items in the SELECT list (3 is dia, 1 is suma), a syntactical shorthand for convenience. The added table aliases are shorthand as well.


  • Basics for DISTINCT ON

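To try the DISTINCT ON approach without the original schema, here is a minimal sketch. Only the table and column names come from the question; the column types and the sample rows are assumptions, made up to reproduce the output shown there. The query also spells out the expressions that the ordinal numbers stand for.

```sql
-- Hypothetical schema and data; column types are assumed, not taken from the question.
CREATE TEMP TABLE orden (
    id    int PRIMARY KEY,
    fecha timestamptz NOT NULL
);

CREATE TEMP TABLE detalle_orden (
    order_id    int,
    producto_id int NOT NULL,
    cantidad    int
);

INSERT INTO orden VALUES
    (1, '2021-11-19 18:00+00'),  -- a Friday
    (2, '2021-11-16 18:00+00');  -- a Tuesday

INSERT INTO detalle_orden VALUES
    (1, 1, 4), (1, 2, 1),
    (2, 3, 5), (2, 2, 2);

-- DISTINCT ON keeps the first row of each dia group in ORDER BY order:
-- highest suma first, NULL sums last, smaller producto_id wins a tie.
SELECT DISTINCT ON (to_char(o.fecha AT TIME ZONE 'MST', 'DY'))
       sum(d.cantidad) AS suma
     , d.producto_id   AS producto
     , to_char(o.fecha AT TIME ZONE 'MST', 'DY') AS dia
FROM   detalle_orden d
LEFT   JOIN orden o ON o.id = d.order_id
GROUP  BY o.fecha, d.producto_id
ORDER  BY to_char(o.fecha AT TIME ZONE 'MST', 'DY')
        , sum(d.cantidad) DESC NULLS LAST
        , d.producto_id;
```

With this sample data the query should return one row for FRI (product 1) and one for TUE (product 3), matching the result the question asks for. The leading ORDER BY expression has to match the DISTINCT ON expression, which is why both repeat the same to_char() call.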


You can (ab)use SELECT DISTINCT ON with the appropriate ordering clause. Assuming you made your previous query into a view:


SELECT DISTINCT ON (dia) * FROM some_view ORDER BY dia, suma DESC;

DISTINCT ON (dia) ensures you retain only one row per day, and the ORDER BY ensures it is the correct one: the row with the highest suma.

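For reference, the view assumed in this answer could be created roughly as sketched here. The name some_view comes from the answer. That it wraps the question's query unchanged (minus its ORDER BY) is an assumption, and the tables detalle_orden and orden from the question must already exist.

```sql
-- Assumption: some_view simply wraps the aggregate query from the question.
CREATE VIEW some_view AS
SELECT sum(detalle_orden.cantidad) AS suma
     , detalle_orden.producto_id   AS producto
     , to_char(date_trunc('day', orden.fecha AT TIME ZONE 'MST'), 'DY') AS dia
FROM   detalle_orden
LEFT   JOIN orden ON orden.id = detalle_orden.order_id
GROUP  BY orden.fecha, detalle_orden.producto_id;

-- One row per day: the first row of each dia group after sorting by suma, highest first.
SELECT DISTINCT ON (dia) *
FROM   some_view
ORDER  BY dia, suma DESC;
```

Listing only dia in DISTINCT ON is what limits the result to one row per day. Adding producto to the list would keep one row per day and product instead.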


You can easily get this with the window function rank():


select * from (
    select suma, producto, dia, rank() over (partition by dia order by suma desc) as ranking
    from your_query
) A
where ranking = 1

So your final query will be something like:


select * from (
    select suma, producto, dia, rank() over (partition by dia order by suma desc) as ranking
    from (
        SELECT sum(detalle_orden.cantidad) as suma, detalle_orden.producto_id as producto,
               to_char(date_trunc('day', orden.fecha AT TIME ZONE 'MST'), 'DY') as dia
        FROM detalle_orden
        LEFT JOIN orden ON orden.id = detalle_orden.order_id
        GROUP BY orden.fecha, detalle_orden.producto_id
    ) B
) A
where ranking = 1
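The difference between rank() and a single-row pick shows up only when products tie. The following sketch uses a CTE with made-up rows in place of the aggregate query; the name ventas and its values are hypothetical, chosen so that products 1 and 2 tie on FRI.

```sql
-- Hypothetical aggregated rows standing in for the inner query.
WITH ventas (suma, producto, dia) AS (
    VALUES (4, 1, 'FRI')
         , (4, 2, 'FRI')
         , (5, 3, 'TUE')
         , (2, 2, 'TUE')
)
, ranked AS (
    SELECT suma, producto, dia
           -- rank() gives tied rows the same number
         , rank()       OVER (PARTITION BY dia ORDER BY suma DESC) AS ranking
           -- row_number() numbers rows uniquely; producto breaks the tie
         , row_number() OVER (PARTITION BY dia ORDER BY suma DESC, producto) AS rn
    FROM   ventas
)
SELECT suma, producto, dia, ranking, rn
FROM   ranked
WHERE  ranking = 1   -- keeps both tied FRI products
ORDER  BY dia, producto;
```

Filtering on ranking = 1 returns every product that shares the top sum for a day. Filtering on rn = 1 instead would return exactly one row per day, like the DISTINCT ON approach.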

