Select the latest row for each group in Oracle

Time: 2022-05-22 12:27:32

I have a table with user comments in a guestbook. Columns are: id, user_id, title, comment, timestamp.

I need to select the latest row for each user. I have tried to do this with GROUP BY but haven't managed it, because I can't select any other columns in the same query when I group by user_id:

SELECT user_id, MAX(ts) FROM comments GROUP BY user_id

For example, in this query I can't also select the columns id, title and comment. How can this be done?

3 solutions

#1


4  

You can use analytic functions:

SELECT *
  FROM (SELECT c.*,
               rank() over (partition by user_id order by ts desc) rnk
          FROM comments c)
 WHERE rnk = 1

Depending on how you want to handle ties (if there can be two rows with the same user_id and ts), you may want to use the row_number or dense_rank function rather than rank. rank would allow multiple rows to be first if there was a tie. row_number would arbitrarily return one row if there was a tie. dense_rank would behave like rank for the rows that tied for first but would consider the next row to be second rather than third assuming two rows tie for first.
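To make the tie behaviour concrete, here is a minimal sketch that runs the three ranking functions side by side. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example is runnable anywhere; window functions need SQLite 3.25 or later), and the sample rows and integer ts values are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER, user_id INTEGER, title TEXT, ts INTEGER)"
)
# Hypothetical sample data: user 10 has two comments tied on the latest ts.
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        (1, 10, "old", 100),
        (2, 10, "tie a", 200),
        (3, 10, "tie b", 200),
        (4, 20, "only", 150),
    ],
)

rows = conn.execute(
    """
    SELECT id,
           rank()       OVER (PARTITION BY user_id ORDER BY ts DESC) AS rnk,
           dense_rank() OVER (PARTITION BY user_id ORDER BY ts DESC) AS drnk,
           row_number() OVER (PARTITION BY user_id ORDER BY ts DESC) AS rn
      FROM comments
     ORDER BY id
    """
).fetchall()

# Map each comment id to its (rank, dense_rank, row_number) triple.
ranks = {row[0]: row[1:] for row in rows}

for comment_id, (rnk, drnk, rn) in ranks.items():
    print(comment_id, rnk, drnk, rn)
```

For user 10, both tied rows get rank 1 and dense_rank 1, while the older row gets rank 3 but dense_rank 2. row_number gives the tied rows 1 and 2 in an order that is not guaranteed, which is why a filter on row_number = 1 returns exactly one of them.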

#2


6  

You can build on your query using a JOIN:

select c.*
from comments c join
     (select user_id, max(ts) as maxts
      from comments c2
      group by user_id
     ) cc
     on c.user_id = cc.user_id and c.ts = cc.maxts;

There are other ways. Typical advice is to use row_number():

select t.*
from (select c.*, row_number() over (partition by user_id order by ts desc) as seqnum
      from comments c
     ) t
where seqnum = 1;

These two queries are subtly different. The first will return more than one row for a user if two or more of that user's comments share exactly the same latest ts. The second always returns one row per user, picking arbitrarily among tied rows.
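The difference between the two approaches can be checked on a small data set. The sketch below again uses Python's sqlite3 module as a stand-in for Oracle (an assumption; SQLite 3.25 or later is needed for window functions) with made-up rows. The extra `id desc` tiebreaker in the row_number() ordering is not part of the answer's query; it is added here only so the chosen row is deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER, user_id INTEGER, ts INTEGER)")
# Hypothetical sample data: comments 2 and 3 of user 10 share the latest ts.
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 10, 100), (2, 10, 200), (3, 10, 200), (4, 20, 150)],
)

# Approach 1: join back to the per-user maximum ts.
join_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT c.id
          FROM comments c
          JOIN (SELECT user_id, MAX(ts) AS maxts
                  FROM comments
                 GROUP BY user_id) cc
            ON c.user_id = cc.user_id AND c.ts = cc.maxts
         ORDER BY c.id
        """
    )
]

# Approach 2: row_number(), with id DESC as an added (assumed) tiebreaker.
rn_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
          FROM (SELECT c.*,
                       row_number() OVER (PARTITION BY user_id
                                          ORDER BY ts DESC, id DESC) AS seqnum
                  FROM comments c) t
         WHERE seqnum = 1
         ORDER BY id
        """
    )
]

print(join_ids)  # [2, 3, 4]: both tied rows of user 10 are returned
print(rn_ids)    # [3, 4]: exactly one row per user
```

If one row per user is required, an explicit tiebreaker column in the ORDER BY makes the choice repeatable rather than arbitrary.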

#3


2  

This type of problem has a simple and efficient solution using the FIRST/LAST aggregate functions (KEEP with dense_rank), grouping by user_id:

select user_id,
       max(id)      keep (dense_rank last order by ts) as id,
       max(title)   keep (dense_rank last order by ts) as title,
       max(comment) keep (dense_rank last order by ts) as comment,
       max(ts)                                          as ts
from   comments
group  by user_id;

If several rows of a user tie on the latest ts, each max() is evaluated over the tied rows independently, so the returned columns may come from different rows.
