How do I select the row with the maximum value in a column for each group in HSQL?

Time: 2022-03-10 22:56:49

I have a table named PERSON in an HSQL database like this:

NAME(PK) | AGE | CITY   | ... many more here ... | 
--------------------------------------------------
aaa      |  12 |   nyc  | ...
bbb      |  12 |   nyc  | ...
ccc      |  10 |   nyc  | ...
ddd      |  34 |    la  | ...
eee      |  10 |    la  | ...

For each city I need to select the record with the maximum age. If for a given city there are multiple records with tied maximum age I still need to select exactly one record for this city (but it can be arbitrarily chosen).

So in the above example I need this result:

NAME(PK) | AGE | CITY | ... many more here ... | 
--------------------------------------------------
aaa      |  12 |  nyc | ...
ddd      |  34 |   la | ...

and it would be ok if I got bbb instead of aaa, but not ok to get aaa and bbb.

Simply using group by on the city column and max(age) as aggregate function does not work, because this does not allow me to select other columns than age and city as they are not in the aggregate. I tried doing the group by and then joining the result back to the table, but this way I do not manage to get rid of records with duplicate max age. This query:

SELECT NAME, CITY, AGE, [... many more here ...] 
FROM ( 
    SELECT max(age) AS maxAge, city 
    FROM PERSON
    GROUP BY CITY
) AS x
JOIN PERSON AS p 
ON p.city = x.city AND p.age = x.maxAge

yields:

NAME(PK) | AGE | CITY | ... many more here ... | 
--------------------------------------------------
aaa      |  12 |  nyc | ...
bbb      |  12 |  nyc | ...
ddd      |  34 |   la | ...

with two records for nyc where there should be only one.

2 solutions

#1


1  

A modern SQL alternative to the correlated subquery solution is the LATERAL keyword:

SELECT * FROM 
 (SELECT DISTINCT CITY FROM PERSON) CITIES, 
 LATERAL 
 (SELECT * FROM PERSON WHERE CITY = CITIES.CITY ORDER BY AGE DESC LIMIT 1)

#2


2  

If you don't care which of the tied rows is returned, you can use a correlated subquery:

select * 
from PERSON p
where name = (select name 
              from PERSON 
              where CITY = p.City 
              order by AGE desc, name asc -- drop name if any one of the tied rows is acceptable
              LIMIT 1);

This will select only one name for each city.
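To check that the correlated subquery really returns one row per city on the sample data from the question, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for HSQLDB (the two engines may differ in details), and it only creates the three columns shown in the question, not the elided ones.

```python
import sqlite3

# Assumption: SQLite (in-memory) is used only as a convenient stand-in for HSQLDB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERSON (NAME TEXT PRIMARY KEY, AGE INTEGER, CITY TEXT)")
conn.executemany(
    "INSERT INTO PERSON (NAME, AGE, CITY) VALUES (?, ?, ?)",
    [
        ("aaa", 12, "nyc"),
        ("bbb", 12, "nyc"),
        ("ccc", 10, "nyc"),
        ("ddd", 34, "la"),
        ("eee", 10, "la"),
    ],
)

# For each outer row, the subquery finds the single "winner" of that row's city:
# highest AGE first, then smallest NAME to break ties deterministically.
# The outer row is kept only if it is that winner.
QUERY = """
SELECT p.NAME, p.AGE, p.CITY
FROM PERSON p
WHERE p.NAME = (SELECT q.NAME
                FROM PERSON q
                WHERE q.CITY = p.CITY
                ORDER BY q.AGE DESC, q.NAME ASC
                LIMIT 1)
ORDER BY p.CITY
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# prints:
# ('ddd', 34, 'la')
# ('aaa', 12, 'nyc')
```

Because NAME is the primary key, the tie-break on NAME makes the choice between aaa and bbb deterministic. The final ORDER BY p.CITY is only there to make the printed order stable.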

