I have a table with 40 million rows.
I want to pick up about 2 million at a time and "process" them.
Why?
Because processing 10 million+ rows at once degrades performance, and often times out. (I need this to work independently of the data size, so I can't just keep increasing the timeout limit.)
Also, I'm using SQL Server.
1 Solution
#1
Is there an increasing key, such as an identity key? And is it the clustered index? If so, it should be fairly simple to track the last key you got to, and do things like:
SELECT TOP 1000000 *
FROM [MyTable]
WHERE [Id] > @LastId
ORDER BY [Id]
Also - be sure to read it with something like ExecuteReader, so that you aren't buffering too many rows.
Of course, beyond a few thousand rows, you might as well just accept the occasional round-trip, and make a number of requests for (say) 10000 rows at a time. I don't think this would be any less efficient in real terms (a few milliseconds here and there).
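The batching loop described above can be sketched as follows. This is a minimal illustration of the keyset-pagination pattern, not ADO.NET code: an in-memory list of rows stands in for `[MyTable]`, and the hypothetical `fetch_batch` helper plays the role of the `SELECT TOP ... WHERE [Id] > @LastId ORDER BY [Id]` query. In a real application each `fetch_batch` call would be a round-trip to SQL Server.

```python
# Sketch of the keyset-pagination loop. fetch_batch stands in for the
# SELECT TOP (@BatchSize) * FROM [MyTable]
# WHERE [Id] > @LastId ORDER BY [Id] query from the answer.

def fetch_batch(table, last_id, batch_size):
    """Return up to batch_size rows whose Id is greater than last_id."""
    return [row for row in table if row["Id"] > last_id][:batch_size]

def process_all(table, batch_size):
    """Walk the whole table one batch at a time, tracking the last Id seen."""
    table = sorted(table, key=lambda row: row["Id"])  # ORDER BY [Id]
    last_id = 0          # assumes Ids start above 0, as an identity key would
    processed = []
    while True:
        batch = fetch_batch(table, last_id, batch_size)
        if not batch:
            break        # no rows left beyond last_id: we're done
        for row in batch:
            processed.append(row["Id"])   # "process" the row here
        last_id = batch[-1]["Id"]         # remember where this batch ended
    return processed

# Tiny demo: 10 rows processed 3 at a time.
rows = [{"Id": i} for i in range(1, 11)]
print(process_all(rows, 3))
```

Because each batch starts strictly after the last Id of the previous one, every row is visited exactly once even across many round-trips, and the query stays cheap as long as `[Id]` is the clustered index.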