05. 取SQL分组中的某几行数据

时间:2023-12-30 21:09:56

对表中数据分组,有时只需要某列的聚合值;有时却需要返回整行数据,常用的方法有:子查询、ROW_NUMBER、APPLY,总体感觉还是ROW_NUMBER比较直观。
测试数据:

if OBJECT_ID('testGroup') is not null
drop table testGroup
GO
create table testGroup
(
ID int identity primary key,
UserID int,
OrderID int
)
GO
insert testGroup
select 1,10 union all
select 1,20 union all
select 1,30 union all
select 2,100 union all
select 2,200 union all
select 3,1000 union all
select 3,2000 union all
select 3,3000 union all
select 3,4000

一. 取分组中第1行(最大/最小值)
1. 取出分组中某列最大/最小值,不要求显示其他列
最常见的分组聚合,用group by 分组时,只有参加分组/聚合的列才可以被显示。

select UserID, MAX(OrderID) as MaxOrderID
from testGroup
group by UserID

2. 取出分组中某列最大/最小值,要求显示其他列

要显示表中其他列,用group by 不好实现,可以借助子查询。

select * from testGroup a
where ID = (select MAX(ID) from testGroup b where a.UserID = b.UserID)
order by ID
--或者
select * from testGroup
where ID in (select MAX(ID) from testGroup group by UserID)
--或者
select * from testGroup as a
where a.ID in (select top 1 ID from testGroup b where a.UserID = b.UserID order by b.OrderID desc)
--或者
select * from testGroup a
where not exists(select 1 from testGroup b where a.UserID = b.UserID and a.OrderID < b.OrderID)
--或者
select * from testGroup a
where (select count(1) from testGroup b where a.UserID = b.UserID and a.id <= b.id) = 1

二. 取分组中前N行(排名前几名)
前N行为正向排序(ASC),后N行改为反向排序(DESC)即可,N=1时也就是取最大/最小值的行。下面以前2名(N=2)为例。
1. SQL Server 2000的写法
(1)子查询

select * from testGroup as a
where a.ID in (select top 2 ID from testGroup b where a.UserID = b.UserID order by b.OrderID)
--或者
select * from testGroup a
where not exists (select 1 from testGroup b where a.UserID = b.UserID and a.OrderID > b.OrderID
having count(1) >= 2)
--或者
select * from testGroup a
where (select count(1) from testGroup b where a.UserID = b.UserID and a.ID >= b.ID) <= 2
--没有唯一标识的表,可以用checksum来标识每行
select * from testGroup as a
where checksum(*) in (select top 2 checksum(*) from testGroup b where a.UserID = b.UserID order by b.OrderID)

2. SQL Server 2005新语法

(2) ROW_NUMBER()

select ID, UserID, OrderID
from
(select *, ROW_NUMBER() over(partition by UserID order by OrderID) num
from testGroup ) t
where t.num between 1 and 2

(3) APPLY(TOP)

select distinct t.* from testGroup a
cross apply (select top 2 ID, UserID, OrderID from testGroup b
where a.UserID = b.UserID order by b.OrderID) as t

三. 取分组中第N行(排名第N名)
把上面的查询中,范围值都改为固定值,就可以取具体某一行了,下面以第3名(N=3)为例。
(1) 子查询

select * from testGroup a
where (select count(1) from testGroup b where a.UserID = b.UserID and a.OrderID >= b.OrderID) = 3
--或者
select * from testGroup a
where exists (select 1 from testGroup b where a.UserID = b.UserID and a.OrderID >= b.OrderID
having count(1) = 3)

(2) ROW_NUMBER()

select ID, UserID, OrderID
from
(select *, ROW_NUMBER() over(partition by UserID order by OrderID) num
from testGroup ) t
where t.num = 3