Say, I have a table in the database (SQL Server 2008) with data similar to this (but much, much bigger):
| ID | SCORE | GROUP |
|----|-------|-------|
| 10 | 1     | A     |
| 6  | 2     | A     |
| 3  | 3     | A     |
|----|-------|-------|
| 8  | 5     | B     |
|----|-------|-------|
| 4  | 1     | C     |
| 9  | 3     | C     |
| 2  | 4     | C     |
| 7  | 4     | C     |
|----|-------|-------|
| 12 | 3     | D     |
| 1  | 3     | D     |
| 11 | 4     | D     |
| 5  | 6     | D     |
I'd like to get the ID of the top and bottom records for each GROUP, where the records in each group are ordered by SCORE (and, as a tie-breaker, by ID), like this:
| GROUP | MIN_ID | MAX_ID |
|-------|--------|--------|
| A     | 10     | 3      |
| B     | 8      | 8      |
| C     | 4      | 7      |
| D     | 1      | 5      |
The question is: how can I achieve this?
So far, I have been attempting solutions based on the RANK() function, but haven't managed to find a query which both produces the correct output and is even vaguely efficient or maintainable.
Notes:
The example is simplified. My 'table' is actually the output of an already complex query, which I'm looking to add the final stages to. I would prefer to only select from the table once.
If possible, it would be good to have a general solution which would allow me to select the top and bottom n values per group.
The IDs are not in a convenient order.
4 Answers
#1
"If possible, it would be good to have a general solution which would allow me to select the top and bottom n values per group."
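WITH q AS
(
    SELECT  m.*,
            -- [Group] is a reserved word and must be bracketed; ID breaks ties on Score
            ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY Score, ID) AS rn_asc,
            ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY Score DESC, ID DESC) AS rn_desc
    FROM    mytable m
)
SELECT  *
FROM    q
-- 10 is n here: the first 10 and the last 10 rows of each group
WHERE   rn_asc BETWEEN 1 AND 10
        OR rn_desc BETWEEN 1 AND 10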
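To get from this per-row output to the one-row-per-group shape asked for in the question, the two row numbers can be collapsed with conditional aggregation. The following is only a sketch under assumptions: the `mytable` CTE is a stand-in built from the question's sample data (in practice it would be the existing complex query), and ID is used as the tie-breaker as the question describes.

```sql
-- Stand-in for the real source query, using the sample rows from the question.
WITH mytable AS
(
    SELECT *
    FROM (VALUES (10, 1, 'A'), (6, 2, 'A'), (3, 3, 'A'),
                 (8, 5, 'B'),
                 (4, 1, 'C'), (9, 3, 'C'), (2, 4, 'C'), (7, 4, 'C'),
                 (12, 3, 'D'), (1, 3, 'D'), (11, 4, 'D'), (5, 6, 'D')
         ) AS v (ID, Score, [Group])
),
q AS
(
    SELECT  m.[Group],
            m.ID,
            ROW_NUMBER() OVER (PARTITION BY m.[Group] ORDER BY m.Score ASC,  m.ID ASC)  AS rn_asc,
            ROW_NUMBER() OVER (PARTITION BY m.[Group] ORDER BY m.Score DESC, m.ID DESC) AS rn_desc
    FROM    mytable m
)
SELECT  [Group],
        -- Each CASE matches exactly one row per group, so MAX just picks that ID.
        MAX(CASE WHEN rn_asc  = 1 THEN ID END) AS MIN_ID,
        MAX(CASE WHEN rn_desc = 1 THEN ID END) AS MAX_ID
FROM    q
WHERE   rn_asc = 1 OR rn_desc = 1
GROUP BY [Group]
ORDER BY [Group];
-- Result for the sample data: A 10 3, B 8 8, C 4 7, D 1 5
```

This reads the source only once, which matches the preference stated in the question. For top and bottom n, the row-per-record form above is probably the more natural output.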
#2
DECLARE @YourTable TABLE (ID INTEGER, Score INTEGER, [Group] VARCHAR(1))
INSERT INTO @YourTable VALUES (10, 1, 'A')
INSERT INTO @YourTable VALUES (6 , 2, 'A')
INSERT INTO @YourTable VALUES (3 , 3, 'A')
INSERT INTO @YourTable VALUES (8 , 5, 'B')
INSERT INTO @YourTable VALUES (4 , 1, 'C')
INSERT INTO @YourTable VALUES (9 , 3, 'C')
INSERT INTO @YourTable VALUES (2 , 4, 'C')
INSERT INTO @YourTable VALUES (7 , 4, 'C')
INSERT INTO @YourTable VALUES (12, 3, 'D')
INSERT INTO @YourTable VALUES (1 , 3, 'D')
INSERT INTO @YourTable VALUES (11, 4, 'D')
INSERT INTO @YourTable VALUES (5 , 6, 'D')
-- MIN/MAX in the outer query resolve ties when several rows share the extreme score
SELECT [Group], MIN([Min_ID]) AS Min_ID, MAX([Max_ID]) AS Max_ID
FROM (
    SELECT [score].[Group], [Min_ID] = [min].ID, [Max_ID] = [max].ID
    FROM (
        SELECT [Group], [Min_Score] = MIN(Score), [Max_Score] = MAX(Score)
        FROM @YourTable
        GROUP BY [Group]) score
    INNER JOIN @YourTable [min] ON [min].[Group] = [score].[Group] AND [min].[Score] = [score].[Min_Score]
    INNER JOIN @YourTable [max] ON [max].[Group] = [score].[Group] AND [max].[Score] = [score].[Max_Score]
) yourtable
GROUP BY [yourtable].[Group]
#3
Any solution will also need a good index on Group and Score that includes ID.
SELECT
    foo.[Group],
    -- MIN/MAX pick one ID when several rows tie on the extreme score
    MIN(m1.ID) AS Min_ID,
    MAX(m2.ID) AS Max_ID
FROM
    (
    SELECT
        [Group], MIN(Score) AS MinScore, MAX(Score) AS MaxScore
    FROM
        mytable
    GROUP BY
        [Group]
    ) foo
    JOIN
    mytable m1 ON foo.[Group] = m1.[Group] AND foo.MinScore = m1.Score
    JOIN
    mytable m2 ON foo.[Group] = m2.[Group] AND foo.MaxScore = m2.Score
GROUP BY
    foo.[Group]
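If ID and Score were aligned in order, the simpler query below would also work. Note that the sample data in the question does not satisfy this (in group A the lowest score has the highest ID), so check before relying on it: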
SELECT
[Group],
MIN(ID) AS Min_ID,
MAX(ID) AS Max_ID
FROM
mytable
GROUP BY
[Group]
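The index recommended at the top of this answer might look something like the sketch below. The table definition and the index name `IX_mytable_Group_Score` are hypothetical, and since the question says the real source is the output of another query, this only applies if that output is materialised into an actual table first.

```sql
-- Hypothetical base table matching the columns used in the question.
CREATE TABLE mytable
(
    ID      INT        NOT NULL,
    Score   INT        NOT NULL,
    [Group] VARCHAR(1) NOT NULL
);

-- Key on ([Group], Score) so the per-group MIN/MAX lookups can seek,
-- with ID carried as an included column so the base table is not revisited.
CREATE NONCLUSTERED INDEX IX_mytable_Group_Score
    ON mytable ([Group], Score)
    INCLUDE (ID);
```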
#4
You could use a sub-query (SELECT TOP 1)...
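One way the sub-query idea could be spelled out is with two correlated `SELECT TOP 1` sub-queries per group. This is a sketch rather than necessarily what the answer had in mind; the `mytable` CTE is a stand-in built from the question's sample data, and ID is assumed to be the tie-breaker.

```sql
-- Stand-in for the real source query, using the sample rows from the question.
WITH mytable AS
(
    SELECT *
    FROM (VALUES (10, 1, 'A'), (6, 2, 'A'), (3, 3, 'A'),
                 (8, 5, 'B'),
                 (4, 1, 'C'), (9, 3, 'C'), (2, 4, 'C'), (7, 4, 'C'),
                 (12, 3, 'D'), (1, 3, 'D'), (11, 4, 'D'), (5, 6, 'D')
         ) AS v (ID, Score, [Group])
)
SELECT  g.[Group],
        -- First row of the group when ordered by Score, then ID
        (SELECT TOP 1 t.ID
         FROM   mytable t
         WHERE  t.[Group] = g.[Group]
         ORDER BY t.Score ASC, t.ID ASC)   AS MIN_ID,
        -- Last row of the group under the same ordering
        (SELECT TOP 1 t.ID
         FROM   mytable t
         WHERE  t.[Group] = g.[Group]
         ORDER BY t.Score DESC, t.ID DESC) AS MAX_ID
FROM    (SELECT DISTINCT [Group] FROM mytable) g
ORDER BY g.[Group];
-- Result for the sample data: A 10 3, B 8 8, C 4 7, D 1 5
```

Note that this references the source three times, which goes against the preference in the question to select from it only once, so it is probably best suited to a real, indexed table.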