从100万条记录中的到 成绩最高的记录
问题分析:
要从一张表中找到成绩最高的记录并不难,有很多种办法,最简单的就是利用TOP 1
select
top
1
*
from
student
order
by
score
desc
但是这种方法有一点小问题,就是如果成绩最高的有两个人的话只能查出其中的一个。
对上面的方法进行改进:
select
top
1
with
ties
*
from
student
order
by
score
desc
select
*
from
student
where
score
=
(
select max (score) from student)
或
select * from student where score = ( select top 1 score from student order by score desc )
select max (score) from student)
或
select * from student where score = ( select top 1 score from student order by score desc )
=================成功的分割线======================
但是这个题目成功与否不在于能否查询出成绩最高的记录,而在于用最快的速度查询出来。经过测试从100万条记录中找到成绩最高的记录如果使用第一个方法需要1秒多,
下面是测试的代码:
我这里创建的表是person,查询的是年龄最大的
代码
create
table
person
(
id int identity ( 1 , 1 ) not null ,
pid varchar ( 18 ) not null ,
md varchar ( 11 ) not null ,
age int
)
go
declare @pid varchar ( 15 )
declare @age int
declare @mb varchar ( 11 )
declare @count int
set @count = 0
-- 插入100万条随机的记录
while ( @count < 1000000 )
begin
-- 生成随机的PID
select @pid = substring ( cast ( rand () as varchar ( 20 )), 3 , 6 ) +
substring ( cast ( rand () as varchar ( 20 )), 3 , 6 ) + substring ( cast ( rand () as varchar ( 20 )), 3 , 6 )
-- 生成随机的MB
select @mb = substring ( cast ( rand () as varchar ( 20 )), 3 , 6 ) +
substring ( cast ( rand () as varchar ( 20 )), 3 , 5 )
-- 生成随机的AGE
select @age = cast ( rand () * 100 as int )
-- 将生成的随机数据插入表
insert into person
values ( @pid , @mb , @age )
set @count = @count + 1
end
(
id int identity ( 1 , 1 ) not null ,
pid varchar ( 18 ) not null ,
md varchar ( 11 ) not null ,
age int
)
go
declare @pid varchar ( 15 )
declare @age int
declare @mb varchar ( 11 )
declare @count int
set @count = 0
-- 插入100万条随机的记录
while ( @count < 1000000 )
begin
-- 生成随机的PID
select @pid = substring ( cast ( rand () as varchar ( 20 )), 3 , 6 ) +
substring ( cast ( rand () as varchar ( 20 )), 3 , 6 ) + substring ( cast ( rand () as varchar ( 20 )), 3 , 6 )
-- 生成随机的MB
select @mb = substring ( cast ( rand () as varchar ( 20 )), 3 , 6 ) +
substring ( cast ( rand () as varchar ( 20 )), 3 , 5 )
-- 生成随机的AGE
select @age = cast ( rand () * 100 as int )
-- 将生成的随机数据插入表
insert into person
values ( @pid , @mb , @age )
set @count = @count + 1
end
以上插入的语句需要执行20分钟以上,请耐心等待
完成后用下面的测试看执行的效率
代码
DECLARE @BD DATETIME
DECLARE @ED DATETIME
SET @BD = GETDATE () -- 执行之前记录时间
SELECT TOP 1 * FROM PERSON ORDER BY AGE DESC
SET @ED = GETDATE () -- 执行之后记录时间
SELECT datediff (millisecond, @BD , @ED ) -- 用毫秒的方法显示执行时间。
DECLARE @BD DATETIME
DECLARE @ED DATETIME
SET @BD = GETDATE () -- 执行之前记录时间
SELECT TOP 1 * FROM PERSON ORDER BY AGE DESC
SET @ED = GETDATE () -- 执行之后记录时间
SELECT datediff (millisecond, @BD , @ED ) -- 用毫秒的方法显示执行时间。
我的电脑执行的时候是在1100-1500之间
如果是用子查询的方式会更慢。大约在5000毫秒左右。
解决办法:
为person表的age这一列创建索引
create
nonclustered
index
ix_age
on person(age)
on person(age)
公平起见执行一下清空缓存的语句: (感谢 huyg的提醒)
DBCC
FREEPROCCACHE
--
清空SQL缓存
DBCC DROPCLEANBUFFERS
DBCC DROPCLEANBUFFERS
Oh Yeah!效率提高了无数倍。
使用子查询的方式再试试看:
DECLARE
@BD
DATETIME
DECLARE @ED DATETIME
SET @BD = GETDATE ()
select * from person where age = (
select max (age) from person )
SET @ED = GETDATE ()
SELECT datediff (millisecond, @BD , @ED )
DECLARE @ED DATETIME
SET @BD = GETDATE ()
select * from person where age = (
select max (age) from person )
SET @ED = GETDATE ()
SELECT datediff (millisecond, @BD , @ED )
我执行的时间是4186毫秒,效率略有提升。
但是这样查询出的结果是10192条记录。查询结果记录多的原因是结果集数量太大。