How do you get the rows that contain the max value for each grouped set?
I've seen some overly-complicated variations on this question, and none with a good answer. I've tried to put together the simplest possible example:
Given a table like that below, with person, group, and age columns, how would you get the oldest person in each group? (A tie within a group should give the first alphabetical result)
Person | Group | Age
--- | --- | ---
Bob | 1 | 32
Jill | 1 | 34
Shawn | 1 | 42
Jake | 2 | 29
Paul | 2 | 36
Laura | 2 | 39
Desired result set:
Shawn | 1 | 42
Laura | 2 | 39
17 solutions
#1
115
There's a super-simple way to do this in MySQL:
select *
from (select * from mytable order by `Group`, age desc, Person) x
group by `Group`
This works because MySQL allows you to leave non-group-by columns unaggregated, in which case it just returns the first row it finds for each group. The solution is to first order the data such that for each group the row you want is first, then group by the columns you want the value for.
You avoid complicated subqueries that try to find the max() etc., and also the problem of returning multiple rows when more than one row has the same maximum value (as the other answers would do).
Note: This is a MySQL-only solution. All other databases I know will throw an SQL syntax error with the message "non aggregated columns are not listed in the group by clause" or similar. Because this solution uses undocumented behavior, the more cautious may want to include a test to assert that it remains working should a future version of MySQL change this behavior.
Version 5.7 update:
Since version 5.7, the sql-mode setting includes ONLY_FULL_GROUP_BY by default, so to make this work you must not have this option (edit the option file for the server to remove this setting).
#2
205
The correct solution is:
SELECT o.*
FROM `Persons` o # 'o' from 'oldest person in group'
LEFT JOIN `Persons` b # 'b' from 'bigger age'
ON o.Group = b.Group AND o.Age < b.Age
WHERE b.Age is NULL # bigger age not found
How it works:
It matches each row from o with all the rows from b that have the same value in column Group and a bigger value in column Age. Any row from o that does not have the maximum value of its group in column Age will match one or more rows from b.
The LEFT JOIN makes it match the oldest person in the group (including persons who are alone in their group) with a row full of NULLs from b ('no bigger age in the group'). With an INNER JOIN these rows would not match and would be dropped.
The WHERE clause keeps only the rows having NULLs in the fields extracted from b. They are the oldest persons from each group.
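To see the two steps described above, here is a minimal sketch that runs the same join in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the variable names (pairs, oldest) are only for illustration; the table and rows are the ones from the question.

```python
import sqlite3

# Assumption: an in-memory SQLite copy of the question's table, not MySQL.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Persons (Person TEXT, "Group" INTEGER, Age INTEGER)')
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39)],
)

# Step 1: the bare LEFT JOIN. Each row of o is paired with every older
# member of its group; the oldest member is paired with NULL instead.
pairs = conn.execute(
    'SELECT o.Person, b.Person FROM Persons o '
    'LEFT JOIN Persons b ON o."Group" = b."Group" AND o.Age < b.Age '
    'ORDER BY o."Group", o.Age, b.Age'
).fetchall()

# Step 2: the WHERE clause keeps only the rows with no older match.
oldest = conn.execute(
    'SELECT o.* FROM Persons o '
    'LEFT JOIN Persons b ON o."Group" = b."Group" AND o.Age < b.Age '
    'WHERE b.Age IS NULL '
    'ORDER BY o."Group"'
).fetchall()

print(pairs)
print(oldest)
```

Note that if two people share the maximum age in a group, neither has an older match, so both rows are returned.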
Further readings
This solution and many others are explained in the book SQL Antipatterns: Avoiding the Pitfalls of Database Programming
#3
25
My simple solution for SQLite (and probably MySQL):
SELECT *, MAX(age) FROM mytable GROUP BY `Group`;
However, it doesn't work in PostgreSQL and maybe some other platforms.
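A quick way to check the SQLite behavior is the sketch below, run through Python's sqlite3 module with the question's data. It relies on SQLite's documented rule that, when MAX() is the aggregate in the query, the bare columns are taken from the row that holds the maximum; as far as I know MySQL gives no such guarantee, so treat this as SQLite-only.

```python
import sqlite3

# Assumption: in-memory SQLite table named mytable with the question's rows.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (Person TEXT, "Group" INTEGER, age INTEGER)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39)],
)

# The bare columns (Person, "Group", age) come from the row that supplied
# MAX(age) in each group; the fourth column is the aggregate itself.
rows = conn.execute(
    'SELECT *, MAX(age) FROM mytable GROUP BY "Group" ORDER BY "Group"'
).fetchall()

print(rows)
```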
In PostgreSQL you can use the DISTINCT ON clause:
SELECT DISTINCT ON ("group") * FROM "mytable" ORDER BY "group", "age" DESC;
#4
23
You can join against a subquery that pulls the Group and MAX(Age). This method is portable across most RDBMS.
SELECT
yourtable.*
FROM
yourtable
JOIN (
SELECT `Group`, MAX(Age) AS age
FROM yourtable
GROUP BY `Group`
) maxage
/* join subquery against both Group and Age values */
ON yourtable.`Group` = maxage.`Group`
AND yourtable.Age = maxage.age
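One thing worth knowing about this approach is how it treats ties. The sketch below runs the same join in SQLite through Python's sqlite3 module; the extra row for "Anna" is a made-up addition to the question's data, there only to create a tie in group 2.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in for yourtable.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourtable (Person TEXT, "Group" INTEGER, Age INTEGER)')
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
     ("Anna", 2, 39)],  # hypothetical row, ties with Laura
)

# Join every row to its group's maximum age; rows equal to the max survive.
rows = conn.execute(
    'SELECT t.* FROM yourtable t '
    'JOIN (SELECT "Group", MAX(Age) AS age FROM yourtable GROUP BY "Group") maxage '
    'ON t."Group" = maxage."Group" AND t.Age = maxage.age '
    'ORDER BY t."Group", t.Person'
).fetchall()

print(rows)
```

Both Anna and Laura come back for group 2, so if you need exactly one row per group you still have to add a tie-breaker on top of this query.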
#5
3
Using the ranking method:
SELECT @rn := CASE WHEN @prev_grp <> groupa THEN 1 ELSE @rn+1 END AS rn,
@prev_grp :=groupa,
person,age,groupa
FROM users,(SELECT @rn := 0) r
HAVING rn=1
ORDER BY groupa,age DESC,person
#6
1
Using CTEs - Common Table Expressions:
WITH MyCTE(MaxPKID, SomeColumn1)
AS(
SELECT MAX(a.MyTablePKID) AS MaxPKID, a.SomeColumn1
FROM MyTable1 a
GROUP BY a.SomeColumn1
)
SELECT b.MyTablePKID, b.SomeColumn1, b.SomeColumn2, MAX(b.NumEstado)
FROM MyTable1 b
INNER JOIN MyCTE c ON c.MaxPKID = b.MyTablePKID
GROUP BY b.MyTablePKID, b.SomeColumn1, b.SomeColumn2
-- Note: MyTablePKID is the primary key of MyTable1
#7
1
axiac's solution is what worked best for me in the end. I had an additional complexity however: a calculated "max value", derived from two columns.
Let's use the same example: I would like the oldest person in each group. If there are people that are equally old, take the tallest person.
I had to perform the left join two times to get this behavior:
SELECT o1.* FROM
(SELECT o.*
FROM `Persons` o
LEFT JOIN `Persons` b
ON o.Group = b.Group AND o.Age < b.Age
WHERE b.Age is NULL) o1
LEFT JOIN
(SELECT o.*
FROM `Persons` o
LEFT JOIN `Persons` b
ON o.Group = b.Group AND o.Age < b.Age
WHERE b.Age is NULL) o2
ON o1.Group = o2.Group AND o1.Height < o2.Height
WHERE o2.Height is NULL;
Hope this helps! I guess there should be a better way to do this though...
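One possible alternative, offered only as a sketch and not as what the answer above does, is to fold the tie-breaker into the join condition of a single LEFT JOIN: a row of b "beats" a row of o if it is older, or equally old and taller. The example runs in SQLite through Python's sqlite3 module; the Height column and all the rows are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite table with a hypothetical Height column.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Persons (Person TEXT, "Group" INTEGER, Age INTEGER, Height INTEGER)'
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [("Bob", 1, 32, 190), ("Shawn", 1, 42, 180), ("Sam", 1, 42, 185),
     ("Paul", 2, 36, 175), ("Laura", 2, 39, 170)],
)

# b beats o when b is older, or the same age and taller.
# Rows of o that nobody beats are the winners of their group.
rows = conn.execute(
    'SELECT o.* FROM Persons o '
    'LEFT JOIN Persons b ON o."Group" = b."Group" '
    'AND (o.Age < b.Age OR (o.Age = b.Age AND o.Height < b.Height)) '
    'WHERE b.Age IS NULL '
    'ORDER BY o."Group"'
).fetchall()

print(rows)
```

Sam and Shawn are both 42, and Sam wins group 1 because he is taller; Bob's height does not matter because he is younger.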
#8
1
My solution works only if you need to retrieve a single column; however, for my needs it was the best solution I found in terms of performance (it uses only a single query!):
SELECT SUBSTRING_INDEX(GROUP_CONCAT(column_x ORDER BY column_y),',',1) AS xyz,
column_z
FROM table_name
GROUP BY column_z;
It uses GROUP_CONCAT to build an ordered, comma-separated list, and then SUBSTRING_INDEX keeps only the first item.
#9
0
You can also try
SELECT * FROM mytable WHERE (`Group`, age) IN (SELECT `Group`, MAX(age) FROM mytable GROUP BY `Group`);
#10
0
This method has the benefit of allowing you to rank by a different column, and not trashing the other data. It's quite useful in a situation where you are trying to list orders with a column for items, listing the heaviest first.
Source: http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat
SELECT person, `group`,
GROUP_CONCAT(
DISTINCT age
ORDER BY age DESC SEPARATOR ', follow up: '
)
FROM sql_table
GROUP BY `group`;
#11
0
I'm not sure whether MySQL has a ROW_NUMBER function. If it does, you can use it to get the desired result. On SQL Server you can do something similar to:
CREATE TABLE p
(
person NVARCHAR(10),
gp INT,
age INT
);
GO
INSERT INTO p
VALUES ('Bob', 1, 32);
INSERT INTO p
VALUES ('Jill', 1, 34);
INSERT INTO p
VALUES ('Shawn', 1, 42);
INSERT INTO p
VALUES ('Jake', 2, 29);
INSERT INTO p
VALUES ('Paul', 2, 36);
INSERT INTO p
VALUES ('Laura', 2, 39);
GO
SELECT t.person, t.gp, t.age
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY gp ORDER BY age DESC) row
FROM p
) t
WHERE t.row = 1;
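The same ROW_NUMBER idea can be tried outside SQL Server. Below is a minimal sketch in SQLite through Python's sqlite3 module, assuming an SQLite build with window function support (3.25 or later). It also adds person to the window's ORDER BY so that a tie goes to the first name alphabetically, as the question asks; the row for "Aaron" is a made-up addition to create such a tie.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for window functions), not SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (person TEXT, gp INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
     ("Aaron", 2, 39)],  # hypothetical row, ties with Laura
)

# Number the rows of each group, oldest first, name as the tie-breaker,
# then keep only the first row of each group.
rows = conn.execute(
    "SELECT person, gp, age FROM ("
    "  SELECT p.*, ROW_NUMBER() OVER ("
    "    PARTITION BY gp ORDER BY age DESC, person) AS rn"
    "  FROM p"
    ") t WHERE rn = 1 ORDER BY gp"
).fetchall()

print(rows)
```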
#12
0
Let the table name be people:
select O.* -- > O for oldest table
from people O , people T
where O.grp = T.grp and
O.Age =
(select max(T.age) from people T where O.grp = T.grp
group by T.grp)
group by O.grp;
#13
0
If the ID (and all other columns) is needed from mytable:
SELECT
*
FROM
mytable
WHERE
id NOT IN (
SELECT
A.id
FROM
mytable AS A
JOIN mytable AS B ON A.`Group` = B.`Group`
AND A.age < B.age
)
#14
0
This is how I'm getting the N max rows per group in MySQL:
SELECT co.id, co.person, co.country
FROM person co
WHERE (
SELECT COUNT(*)
FROM person ci
WHERE co.country = ci.country AND co.id < ci.id
) < 1
;
How it works:
- self join to the table
- groups are done by co.country = ci.country
- N elements per group are controlled by ) < 1, so for 3 elements use ) < 3
- whether you get the max or the min depends on the comparison:
  - co.id < ci.id gives the max
  - co.id > ci.id gives the min
Full example here:
mysql select n max values per group
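To make the role of N concrete, here is a minimal sketch that runs the same correlated COUNT(*) query in SQLite through Python's sqlite3 module, with N passed as a parameter. The helper name top_n_per_country and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, person TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Ann", "US"), (2, "Ben", "US"), (3, "Cy", "US"),
     (4, "Dee", "UK"), (5, "Eli", "UK"), (6, "Flo", "UK"), (7, "Gus", "UK")],
)

def top_n_per_country(conn, n):
    """Return the n rows with the largest id in each country."""
    # A row is kept when fewer than n rows of its country have a larger id.
    return conn.execute(
        "SELECT co.id, co.person, co.country FROM person co "
        "WHERE (SELECT COUNT(*) FROM person ci "
        "       WHERE co.country = ci.country AND co.id < ci.id) < ? "
        "ORDER BY co.country, co.id DESC",
        (n,),
    ).fetchall()

top1 = top_n_per_country(conn, 1)
top2 = top_n_per_country(conn, 2)
print(top1)
print(top2)
```

Flipping co.id < ci.id to co.id > ci.id in the subquery would return the rows with the smallest ids instead.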
#15
0
I have a simple solution by using WHERE IN
SELECT a.* FROM `mytable` AS a
WHERE (a.`group`, a.age) IN( SELECT b.`group`, MAX(b.age) AS age FROM `mytable` AS b GROUP BY b.`group` )
ORDER BY a.`group` ASC, a.person ASC
#16
-1
with CTE as
(select Person,
[Group], Age, RN = Row_Number()
over(partition by [Group]
order by Age desc)
from yourtable)
select Person, Age from CTE where RN = 1
#17
-1
I would not use Group as a column name, since it is a reserved word. However, the following SQL would work.
SELECT a.Person, a.Group, a.Age FROM [TABLE_NAME] a
INNER JOIN
(
SELECT `Group`, MAX(Age) AS oldest FROM [TABLE_NAME]
GROUP BY `Group`
) b ON a.Group = b.Group AND a.Age = b.oldest