Query on 250k rows takes 53 seconds

Date: 2023-01-21 04:00:38

The box this query is running on is a dedicated server running in a datacenter.

AMD Opteron 1354 quad-core 2.20 GHz, 2 GB of RAM, Windows Server 2008 x64 (yes, I know I only have 2 GB of RAM; I'm upgrading to 8 GB when the project goes live).

So I went through and created 250,000 dummy rows in a table to really stress-test some of the queries that LINQ to SQL generates and make sure they're not too terrible, and I noticed one of them was taking an absurd amount of time.

I had this query down to 17 seconds with indexes, but I removed them for the sake of this post so it can go from start to finish. The only indexes are the primary keys.

Stories table --
[ID] [int] IDENTITY(1,1) NOT NULL,
[UserID] [int] NOT NULL,
[CategoryID] [int] NOT NULL,
[VoteCount] [int] NOT NULL,
[CommentCount] [int] NOT NULL,
[Title] [nvarchar](96) NOT NULL,
[Description] [nvarchar](1024) NOT NULL,
[CreatedAt] [datetime] NOT NULL,
[UniqueName] [nvarchar](96) NOT NULL,
[Url] [nvarchar](512) NOT NULL,
[LastActivityAt] [datetime] NOT NULL,

Categories table --
[ID] [int] IDENTITY(1,1) NOT NULL,
[ShortName] [nvarchar](8) NOT NULL,
[Name] [nvarchar](64) NOT NULL,

Users table --
[ID] [int] IDENTITY(1,1) NOT NULL,
[Username] [nvarchar](32) NOT NULL,
[Password] [nvarchar](64) NOT NULL,
[Email] [nvarchar](320) NOT NULL,
[CreatedAt] [datetime] NOT NULL,
[LastActivityAt] [datetime] NOT NULL,

Currently the database contains 1 user, 1 category, and 250,000 stories, and I tried to run this query:

SELECT TOP(10) *
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
INNER JOIN Users ON Users.ID = Stories.UserID
ORDER BY Stories.LastActivityAt

The query takes 52 seconds to run, CPU usage hovers at 2-3%, memory is at 1.1 GB with 900 MB free, but the disk usage seems out of control. It's at 100 MB/sec, with 2/3 of that being writes to tempdb.mdf and the rest reads from tempdb.mdf.

Now for the interesting part...

SELECT TOP(10) *
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
INNER JOIN Users ON Users.ID = Stories.UserID

SELECT TOP(10) *
FROM Stories
INNER JOIN Users ON Users.ID = Stories.UserID
ORDER BY Stories.LastActivityAt

SELECT TOP(10) *
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
ORDER BY Stories.LastActivityAt

All 3 of these queries are pretty much instant.

Exec plan for the first query.
http://i43.tinypic.com/xp6gi1.png

Exec plans for the other 3 queries (in order).
http://i43.tinypic.com/30124bp.png
http://i44.tinypic.com/13yjml1.png
http://i43.tinypic.com/33ue7fb.png

Any help would be much appreciated.

Exec plan after adding indexes (down to 17 seconds again).
http://i39.tinypic.com/2008ytx.png

I've gotten a lot of helpful feedback from everyone, thank you, so I tried a new angle on this. I query just the stories I need, then get the Categories and Users in separate queries, and with 3 queries it only took 250 ms... I don't understand the issue, but if it works, and at 250 ms no less, I'll stick with that for the time being. Here's the code I used to test this.

DBDataContext db = new DBDataContext();
Console.ReadLine(); // pause so the timing starts on demand

Stopwatch sw = Stopwatch.StartNew();

// Fetch the 10 stories first, then load the related rows by their
// actual foreign-key values (CategoryID/UserID, not the story IDs).
var stories = db.Stories.OrderBy(s => s.LastActivityAt).Take(10).ToList();
var categoryIDs = stories.Select(s => s.CategoryID).Distinct().ToList();
var userIDs = stories.Select(s => s.UserID).Distinct().ToList();
var categories = db.Categories.Where(c => categoryIDs.Contains(c.ID)).ToList();
var users = db.Users.Where(u => userIDs.Contains(u.ID)).ToList();

sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);

8 Answers

#1


13  

Try adding an index on Stories.LastActivityAt. I think the clustered index scan in the execution plan may be due to the sorting.
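
A minimal sketch of that index (the name is illustrative):

CREATE INDEX IX_Stories_LastActivityAt ON Stories (LastActivityAt)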

Edit: My query returned in an instant when the rows were just a few bytes long, but once I added a 2K varchar it had been running for 5 minutes and was still going, so I think Mitch has a point. It is the sheer volume of data being shuffled around for nothing, but that can be fixed in the query.

Try putting the join, sort and TOP(10) in a view or in a nested query, and then join back against the Stories table to get the rest of the data just for the 10 rows that you need.

Like this:

select * from 
(
    SELECT TOP(10) id, categoryID, userID
    FROM Stories
    ORDER BY Stories.LastActivityAt
) s
INNER JOIN Stories ON Stories.ID = s.id
INNER JOIN Categories ON Categories.ID = s.CategoryID
INNER JOIN Users ON Users.ID = s.UserID

If you have an index on LastActivityAt, this should run very fast.

#2


3  

So if I read the first part correctly, it responds in 17 seconds with an index, which is still a long time to chug out 10 records. I'm thinking the time is in the ORDER BY clause. I would want an index on (LastActivityAt, UserID, CategoryID). Just for fun, remove the ORDER BY and see if it returns the 10 records quickly. If it does, then you know the problem is not in the joins to the other tables. Also, it would help to replace the * with just the columns you need, since all the columns from all 3 tables end up in tempdb as you sort - as Neil mentioned.
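
As a sketch of that diagnostic, with the * already trimmed to a few columns:

-- If this comes back instantly without the ORDER BY,
-- the joins are not the bottleneck.
SELECT TOP(10) Stories.ID, Stories.Title, Categories.Name, Users.Username
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
INNER JOIN Users ON Users.ID = Stories.UserID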

Looking at the execution plans you'll notice the extra sort - I believe that is the ORDER BY, which is going to take some time. I'm assuming the 17 seconds was with an index on all 3 columns... so you may want one index for the join criteria (UserID, CategoryID) and another for LastActivityAt - see if that performs better. It would also be good to run the query through the Index Tuning Wizard.
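
The two-index alternative would look something like this (names illustrative):

CREATE INDEX IX_Stories_JoinKeys ON Stories (UserID, CategoryID)
CREATE INDEX IX_Stories_LastActivity ON Stories (LastActivityAt)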

#3


1  

My first suggestion is to remove the *, and replace it with the minimum columns you need.
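
For example, assuming the page only needs a handful of fields:

SELECT TOP(10) Stories.Title, Stories.Url, Stories.LastActivityAt,
    Categories.Name, Users.Username
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
INNER JOIN Users ON Users.ID = Stories.UserID
ORDER BY Stories.LastActivityAt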

Second, is there a trigger involved? Something that would update the LastActivityAt field?
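
A quick way to check (works on SQL Server 2005 and later):

SELECT name FROM sys.triggers WHERE parent_id = OBJECT_ID('Stories')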

#4


1  

Based on your problem query, try adding a combination index on table Stories (CategoryID, UserID, LastActivityAt).
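
In T-SQL that would be something like (index name illustrative):

CREATE INDEX IX_Stories_Cat_User_Activity ON Stories (CategoryID, UserID, LastActivityAt)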

#5


1  

You are maxing out the Disks in your hardware setup.

Given your comments about your Data/Log/tempDB File placement, I think any amount of tuning is going to be a bandaid.

250,000 Rows is small. Imagine how bad your problems are going to be with 10 million rows.

I suggest you move tempDB onto its own physical drive (preferably a RAID 0).
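
A sketch of the move, assuming tempdb still uses its default logical file name (tempdev) and the new drive is T: - the change takes effect after a restart:

ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'T:\tempdb.mdf')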

#6


1  

OK, so my test machine isn't fast. Actually, it's really slow: 1.6 GHz, 1 GB of RAM, no multiple disks, just a single (read: slow) disk for SQL Server, the OS, and extras.

I created your tables with primary and foreign keys defined, and inserted 2 categories, 500 random users, and 250,000 random stories.

Running the first query above takes 16 seconds (no plan cache either). If I index the LastActivityAt column I get results in under a second (no plan cache here either).

Here's the script I used to do all of this.

--Categories table --
Create table Categories (
[ID] [int] IDENTITY(1,1) primary key NOT NULL,
[ShortName] [nvarchar](8) NOT NULL,
[Name] [nvarchar](64) NOT NULL)

--Users table --
Create table Users(
[ID] [int] IDENTITY(1,1) primary key NOT NULL,
[Username] [nvarchar](32) NOT NULL,
[Password] [nvarchar](64) NOT NULL,
[Email] [nvarchar](320) NOT NULL,
[CreatedAt] [datetime] NOT NULL,
[LastActivityAt] [datetime] NOT NULL
)
go

-- Stories table --
Create table Stories(
[ID] [int] IDENTITY(1,1) primary key NOT NULL,
[UserID] [int] NOT NULL references Users ,
[CategoryID] [int] NOT NULL references Categories,
[VoteCount] [int] NOT NULL,
[CommentCount] [int] NOT NULL,
[Title] [nvarchar](96) NOT NULL,
[Description] [nvarchar](1024) NOT NULL,
[CreatedAt] [datetime] NOT NULL,
[UniqueName] [nvarchar](96) NOT NULL,
[Url] [nvarchar](512) NOT NULL,
[LastActivityAt] [datetime] NOT NULL)

Insert into Categories (ShortName, Name) 
Values ('cat1', 'Test Category One')

Insert into Categories (ShortName, Name) 
Values ('cat2', 'Test Category Two')

--Dummy Users
Insert into Users
Select top 500
UserName=left(SO.name+SC.name, 32)
, Password=left(reverse(SC.name+SO.name), 64)
, Email=Left(SO.name, 128)+'@'+left(SC.name, 123)+'.com'
, CreatedAt='1899-12-31'
, LastActivityAt=GETDATE()
from sysobjects SO 
Inner Join syscolumns SC on SO.id=SC.id
go

--dummy stories!
-- A Count is given every 10000 record inserts (could be faster)
-- RBAR method!
set nocount on
Declare @count as bigint
Set @count = 0
begin transaction
while @count<=250000
begin
Insert into Stories
Select
  USERID=floor(((500 + 1) - 1) * RAND() + 1)
, CategoryID=floor(((2 + 1) - 1) * RAND() + 1)
, votecount=floor(((10 + 1) - 1) * RAND() + 1)
, commentcount=floor(((8 + 1) - 1) * RAND() + 1)
, Title=Cast(NEWID() as VARCHAR(36))+Cast(NEWID() as VARCHAR(36))
, Description=Cast(NEWID() as VARCHAR(36))+Cast(NEWID() as VARCHAR(36))+Cast(NEWID() as VARCHAR(36))
, CreatedAt='1899-12-31'
, UniqueName=Cast(NEWID() as VARCHAR(36))+Cast(NEWID() as VARCHAR(36)) 
, Url=Cast(NEWID() as VARCHAR(36))+Cast(NEWID() as VARCHAR(36))
, LastActivityAt=Dateadd(day, -floor(((600 + 1) - 1) * RAND() + 1), GETDATE())
If @count % 10000=0
Begin
Print @count
Commit
begin transaction
End
Set @count=@count+1
end
commit  -- commit the final batch left open by the loop
set nocount off
go

--returns in 16 seconds
DBCC DROPCLEANBUFFERS
SELECT TOP(10) *
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
INNER JOIN Users ON Users.ID = Stories.UserID
ORDER BY Stories.LastActivityAt
go

--Now create an index
Create index IX_LastADate on Stories (LastActivityAt asc)
go
--With an index returns in less than a second
DBCC DROPCLEANBUFFERS
SELECT TOP(10) *
FROM Stories
INNER JOIN Categories ON Categories.ID = Stories.CategoryID
INNER JOIN Users ON Users.ID = Stories.UserID
ORDER BY Stories.LastActivityAt
go

The sort is definitely where your slowdown is occurring. Sorting mainly gets done in tempdb, and a large table causes LOTS of data to be written there. Having an index on this column will definitely improve the performance of an ORDER BY.

Also, defining your primary and foreign keys helps SQL Server immensely.

The method listed in your code is elegant, and basically the same response that cdonner wrote, except in C# rather than SQL. Tuning the DB will probably give even better results!

--Kris

#7


0  

Have you cleared the SQL Server cache before running each of the queries?

In SQL 2000, it's something like DBCC DROPCLEANBUFFERS. Google the command for more info.
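
The usual pattern is to issue a checkpoint first so dirty pages are flushed and the drop is complete:

CHECKPOINT
DBCC DROPCLEANBUFFERS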

Looking at the query, I would have an index for

Categories.ID
Stories.CategoryID
Users.ID
Stories.UserID

and possibly Stories.LastActivityAt

But yeah, sounds like the result could be bogus 'cos of caching.

#8


0  

When you have worked with SQL Server for some time, you will discover that even the smallest changes to a query can cause wildly different response times. From what I have read in the initial question, and looking at the query plan, I suspect that the optimizer has decided that the best approach is to form a partial result and then sort that as a separate step. The partial result is a composite of the Users and Stories tables. This is formed in tempdb. So the excessive disk access is due to the forming and then sorting of this temporary table.

I concur that the solution should be to create a compound index on Stories.LastActivityAt, Stories.UserId, Stories.CategoryId. The order is VERY important; the field LastActivityAt must be first.
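
That is, something like (index name illustrative):

CREATE INDEX IX_Stories_Activity_User_Cat ON Stories (LastActivityAt, UserID, CategoryID)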
