I'm using PostgreSQL. I need to select the maximum of each group. The table holds the products sold each day, and I want to find the top-selling product for each day.
SELECT sum(detalle_orden.cantidad) as suma,detalle_orden.producto_id as producto
,to_char(date_trunc('day',orden.fecha AT TIME ZONE 'MST'),'DY') as dia
FROM detalle_orden
LEFT JOIN orden ON orden.id = detalle_orden.order_id
GROUP BY orden.fecha,detalle_orden.producto_id
ORDER BY dia,suma desc
This returns:
suma producto dia
4 1 FRI
1 2 FRI
5 3 TUE
2 2 TUE
I want to get:
suma producto dia
4 1 FRI
5 3 TUE
Only the top product of each day (the row with max(suma) in each group).
I tried different approaches, like subqueries, but the aggregate function makes things a bit difficult.
3 answers
#1
You can still use DISTINCT ON to get this done in a single query level without a subquery, because DISTINCT is applied after GROUP BY and aggregate functions (and after window functions):
SELECT DISTINCT ON (3)
sum(d.cantidad) AS suma
, d.producto_id AS producto
, to_char(o.fecha AT TIME ZONE 'MST', 'DY') AS dia
FROM detalle_orden d
LEFT JOIN orden o ON o.id = d.order_id
GROUP BY o.fecha, d.producto_id
ORDER BY 3, 1 DESC NULLS LAST, d.producto_id;
Notes
- This solution returns exactly one row per dia (if available). If multiple products tie for top sales, my arbitrary (but deterministic and reproducible) pick is the one with the smaller producto_id. If you need all peers tying for one day, use rank() as suggested by @Houari.
- The sequence of events in an SQL SELECT query is explained in this related answer:
  - Best way to get result count before LIMIT was applied
- date_trunc() was just noise in the calculation of dia. I removed it.
- I added NULLS LAST to the descending sort order since it is unclear whether there might be rows with NULL for suma in the result:
  - PostgreSQL sort by datetime asc, null first?
- The numbers in DISTINCT ON and ORDER BY are ordinal references to SELECT list items, just a syntactical shorthand notation for convenience. Similar:
  - PostgreSQL equivalent for MySQL GROUP BY
  The added table aliases are likewise just syntactical shorthand.
- Basics for DISTINCT ON:
  - Select first row in each GROUP BY group?
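One caveat worth checking: the query groups by o.fecha. If fecha is a timestamp rather than a date (an assumption, the question does not state the column type), orders placed at different times on the same day end up in separate groups, so suma is a per-timestamp total rather than a per-day total. The sketch below groups by the dia output column instead. The sample rows are hypothetical and inlined as CTEs that shadow the real table names, so the statement runs on its own.

```sql
-- Hypothetical sample data; the CTEs stand in for the real tables.
WITH orden (id, fecha) AS (
   VALUES (1, timestamptz '2024-01-05 09:00+00')   -- Friday
        , (2, timestamptz '2024-01-05 15:00+00')   -- same Friday, later
        , (3, timestamptz '2024-01-09 10:00+00')   -- Tuesday
), detalle_orden (order_id, producto_id, cantidad) AS (
   VALUES (1, 1, 3), (2, 1, 1), (2, 2, 1), (3, 3, 5), (3, 2, 2)
)
SELECT DISTINCT ON (dia)
       sum(d.cantidad) AS suma
     , d.producto_id   AS producto
     , to_char(o.fecha AT TIME ZONE 'MST', 'DY') AS dia
FROM   detalle_orden d
LEFT   JOIN orden o ON o.id = d.order_id
GROUP  BY dia, d.producto_id           -- group by the day label, not the raw timestamp
ORDER  BY dia, suma DESC NULLS LAST, d.producto_id;
-- suma | producto | dia
--    4 |        1 | FRI
--    5 |        3 | TUE
```

With GROUP BY o.fecha, the two Friday orders for product 1 would be summed separately (3 and 1) instead of together (4). Output column names are used here in place of the ordinal numbers; both spellings refer to the same SELECT list items.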
#2
You can (ab)use SELECT DISTINCT ON with the appropriate ordering clause. Assuming you made your previous query into a view:
SELECT DISTINCT ON (dia) * FROM some_view ORDER BY dia, suma DESC;
The DISTINCT ON ensures you retain only one row for every day, and the ORDER BY ensures it is the correct one (the row with the highest suma). Note that putting producto into the DISTINCT ON list would keep one row per day and product, which is not what the question asks for.
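To see the one-row-per-day behaviour in isolation, here is a small sketch keyed on dia alone. It inlines the four result rows from the question as a VALUES list; the CTE name ventas is hypothetical and stands in for the view.

```sql
-- ventas is a hypothetical stand-in for the view built from the original query.
WITH ventas (suma, producto, dia) AS (
   VALUES (4, 1, 'FRI'), (1, 2, 'FRI'), (5, 3, 'TUE'), (2, 2, 'TUE')
)
SELECT DISTINCT ON (dia) suma, producto, dia
FROM   ventas
ORDER  BY dia, suma DESC, producto;   -- producto breaks ties deterministically
-- suma | producto | dia
--    4 |        1 | FRI
--    5 |        3 | TUE
```

The leading ORDER BY expressions have to match the DISTINCT ON list; the remaining sort keys decide which row of each day is kept.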
#3
With the window function rank() you can easily get it:
select * from
(
select suma,producto,dia, rank() over (partition by dia order by suma desc) as ranking
from your_query
)A
where ranking = 1
So your final query will be something like:
select * from
(
select suma, producto, dia, rank() over (partition by dia order by suma desc) as ranking
from
(
SELECT sum(detalle_orden.cantidad) as suma, detalle_orden.producto_id as producto,
to_char(date_trunc('day', orden.fecha AT TIME ZONE 'MST'), 'DY') as dia
FROM detalle_orden
LEFT JOIN orden ON orden.id = detalle_orden.order_id
GROUP BY orden.fecha, detalle_orden.producto_id
) B
) A
where ranking = 1
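Unlike DISTINCT ON, rank() keeps every product that ties for first place on a day. A minimal sketch with inlined sample rows; the CTE name ventas and the tied Friday row are hypothetical, not taken from the question.

```sql
-- Hypothetical data: products 1 and 2 tie on FRI with suma = 4.
WITH ventas (suma, producto, dia) AS (
   VALUES (4, 1, 'FRI'), (4, 2, 'FRI'), (5, 3, 'TUE'), (2, 2, 'TUE')
)
SELECT suma, producto, dia
FROM  (
   SELECT suma, producto, dia
        , rank() OVER (PARTITION BY dia ORDER BY suma DESC) AS ranking
   FROM   ventas
   ) a
WHERE  ranking = 1
ORDER  BY dia, producto;
-- suma | producto | dia
--    4 |        1 | FRI
--    4 |        2 | FRI
--    5 |        3 | TUE
```

Swapping rank() for row_number() would return a single row per day again, with the pick among tied rows left to the window's ORDER BY.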