Postgres: select all columns but group by one column

Time: 2023-02-02 04:24:18

I have a simple table with three columns: unit_id oid, time timestamp, and diag bytea. The primary key is the combination of time and unit_id.

The idea behind this query is to get the latest row (largest timestamp) for each unique unit_id. However, the query below does not always return the row with the latest time for each unit_id.

I really want to group by just unit_id, but Postgres makes me group by diag as well, since I am selecting it.

SELECT DISTINCT ON(unit_id) max(time) as time, diag, unit_id 
FROM diagnostics.unit_diag_history  
GROUP BY unit_id, diag

2 answers

#1


14  

Any time you start thinking that you want a localized GROUP BY, you should start thinking about window functions instead.

I think you're after something like this:

select unit_id, time, diag
from (
    select unit_id, time, diag,
           rank() over (partition by unit_id order by time desc) as rank
    from diagnostics.unit_diag_history
) as dt
where rank = 1

You might want to add something to the ORDER BY to break ties consistently as well, but that wouldn't alter the overall technique.
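
For this particular table, the primary key on (time, unit_id) already rules out two rows with the same time for one unit_id, so rank() should return a single row per unit. As a rough sketch for a table without that guarantee, row_number() with an extra sort key would keep exactly one row per unit_id. Using diag as the tie-breaker and rn as the alias are assumptions made only for illustration; any column that gives a stable order would do.

```sql
-- Sketch only: row_number() numbers rows 1, 2, 3, ... within each unit_id,
-- so exactly one row per unit_id gets rn = 1 even if two rows share a time.
SELECT unit_id, time, diag
FROM (
    SELECT unit_id, time, diag,
           row_number() OVER (
               PARTITION BY unit_id
               ORDER BY time DESC, diag  -- diag as tie-breaker is an assumption
           ) AS rn
    FROM diagnostics.unit_diag_history
) AS dt
WHERE rn = 1;
```

With rank(), tied rows would all receive rank 1 and all be returned; row_number() picks one of them according to the ORDER BY.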

#2


10  

You can join the grouped select with the original table:

SELECT d.time, d.diag, d.unit_id
FROM (
    SELECT unit_id, max(time) AS max_time
    FROM diagnostics.unit_diag_history
    GROUP BY unit_id
) s
JOIN diagnostics.unit_diag_history d
  ON s.unit_id = d.unit_id AND s.max_time = d.time
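
Not part of either answer, but the DISTINCT ON form from the question can probably be made to work as well. A minimal sketch, assuming the same table: drop the max() aggregate and the GROUP BY, and let ORDER BY decide which row DISTINCT ON keeps for each unit_id.

```sql
-- Sketch only: DISTINCT ON (unit_id) keeps the first row of each unit_id
-- group in ORDER BY order; sorting by time DESC makes that the latest row.
-- The DISTINCT ON expression must match the leftmost ORDER BY expression.
SELECT DISTINCT ON (unit_id) unit_id, time, diag
FROM diagnostics.unit_diag_history
ORDER BY unit_id, time DESC;
```

DISTINCT ON is a PostgreSQL extension, so this form is not portable to other databases, whereas the window-function and join approaches are standard SQL.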
