Window functions: ordering by created_at from a column in a joined table without grouping

Time: 2021-10-13 22:56:15

I am trying to select all thread subjects for a particular user, ordered so that the thread with the most recently sent message comes first. Here is my database schema:

create table thread (
    id bigserial primary key,
    subject text not null,
    created timestamp with time zone not null default current_timestamp
);

create table thread_account (
    account bigint not null references account(id) on delete cascade,
    thread bigint not null references thread(id) on delete cascade
);
create index thread_account_account on thread_account(account);
create index thread_account_thread on thread_account(thread);

create table message (
    id bigserial primary key,
    thread bigint not null references thread(id) on delete cascade,
    content text not null,
    account bigint not null references account(id) on delete cascade,
    created timestamp with time zone not null default current_timestamp
);
create index message_account on message(account);
create index message_thread on message(thread);

I then ran a query like this:

select * 
FROM thread_account 
JOIN thread on thread.id = thread_account.thread
JOIN message on message.thread = thread_account.thread 
WHERE thread_account.account = 299
ORDER BY message.created desc;

But this returns one row per message, so each thread subject is repeated for every message in that thread. The join (JOIN message on message.thread = thread_account.thread) seems to be the cause. I've been told I need a window function, but I can't figure out how to use one. This is for Postgres, by the way.

2 Answers

#1


I think you are looking for something like:

select * 
FROM thread_account 
JOIN thread on thread.id = thread_account.thread
JOIN message on message.thread = thread_account.thread 
WHERE thread_account.account = 299
ORDER BY MAX(message.Created) OVER (PARTITION BY thread.id) desc;

The small tweak is the window function in the ORDER BY. It partitions the result set by thread.id, giving one chunk of rows per thread, and computes max(message.created) within each chunk. That per-thread maximum is then used to sort the whole result set, so all rows of the thread with the newest message come first.

Window functions are a little tricky to wrap your head around at first, but think of them as chunking up your rows (partitioning) and then applying an aggregate or other function, such as Max(), to one of the fields within each chunk. Unlike GROUP BY, the rows are not collapsed: each row keeps its own columns and gains the computed value.
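
The chunking idea can be tried in isolation with a minimal sketch like the one below. The CTE name joined and its three sample rows are made up for illustration; they only stand in for the result of joining thread and message.

```sql
-- Hypothetical sample rows standing in for the joined thread/message result.
WITH joined(thread_id, message_id, created) AS (
    VALUES
        (1, 10, timestamptz '2021-01-01 09:00+00'),
        (1, 11, timestamptz '2021-01-03 09:00+00'),
        (2, 12, timestamptz '2021-01-02 09:00+00')
)
SELECT
    thread_id,
    message_id,
    created,
    -- Same value on every row of a partition: the newest message in that thread.
    MAX(created) OVER (PARTITION BY thread_id) AS last_message_date
FROM joined
-- Sort whole threads by their newest message, then messages within a thread.
ORDER BY last_message_date DESC, created DESC;
```

All three rows come back. Both rows of thread 1 carry the 2021-01-03 value and therefore sort ahead of the single row of thread 2.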


As mentioned in your comment, you don't want to see the message information, just the threads. To get that, list only the fields you want in the SELECT portion of the query, and use either GROUP BY or DISTINCT to get a single record back for each thread.
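
The GROUP BY route mentioned above could look roughly like this. It is a sketch against the schema from the question, not a tested query; the alias last_message_date is an illustrative name.

```sql
-- One row per thread for account 299, newest activity first.
SELECT
    thread.id,
    thread.subject,
    MAX(message.created) AS last_message_date  -- plain aggregate, no OVER clause
FROM thread_account
JOIN thread ON thread.id = thread_account.thread
JOIN message ON message.thread = thread_account.thread
WHERE thread_account.account = 299
GROUP BY thread.id, thread.subject
ORDER BY last_message_date DESC;
```

Because these are inner joins, a thread with no messages at all does not appear in the result.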

We can also show the last message date in the results by copying that window function up into the SELECT list:

SELECT DISTINCT 
    thread_account.*, 
    thread.*, 
    MAX(message.Created) OVER (PARTITION BY thread.id) as Last_Message_Date
FROM thread_account 
JOIN thread on thread.id = thread_account.thread
JOIN message on message.thread = thread_account.thread 
WHERE thread_account.account = 299
ORDER BY MAX(message.Created) OVER (PARTITION BY thread.id) desc;

If you only want certain fields from thread or thread_account, be more explicit in the SELECT portion of the query, for example SELECT DISTINCT thread.id, thread_account.account, and so on.
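
Spelled out, an explicit column list might look like the sketch below. The chosen columns and the alias last_message_date are only examples. With SELECT DISTINCT, PostgreSQL requires ORDER BY expressions to appear in the select list, so sorting by the alias keeps the query simple.

```sql
SELECT DISTINCT
    thread.id,
    thread.subject,
    thread_account.account,
    -- Newest message per thread, repeated on each row of that thread
    -- before DISTINCT collapses the duplicates.
    MAX(message.created) OVER (PARTITION BY thread.id) AS last_message_date
FROM thread_account
JOIN thread ON thread.id = thread_account.thread
JOIN message ON message.thread = thread_account.thread
WHERE thread_account.account = 299
ORDER BY last_message_date DESC;
```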

#2


The very handy DISTINCT ON makes it easy:

select distinct on (t.id) *
from
    thread_account ta
    inner join
    thread t on t.id = ta.thread
    inner join
    message m on m.thread = ta.thread 
where ta.account = 299
order by t.id, m.created desc

For just the thread info, use this select list instead:

select distinct on (t.id) t.*
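
One caveat: DISTINCT ON requires the ORDER BY to start with the DISTINCT ON expressions, so the query above returns its rows in t.id order, not newest thread first. A possible way to get the ordering the question asks for is to wrap it in a subquery and sort again outside. The names latest and last_message_date below are illustrative.

```sql
SELECT latest.*
FROM (
    -- Inner query: keep only the newest message row for each thread.
    SELECT DISTINCT ON (t.id)
        t.id,
        t.subject,
        m.created AS last_message_date
    FROM thread_account ta
    JOIN thread t ON t.id = ta.thread
    JOIN message m ON m.thread = ta.thread
    WHERE ta.account = 299
    ORDER BY t.id, m.created DESC
) AS latest
-- Outer query: order the threads by their newest message.
ORDER BY latest.last_message_date DESC;
```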
