How to avoid fetching the data again when selecting and then updating

Currently, my table (table A) has around 10,000,000 records. Every day, about 100 new records arrive. They have not been processed yet, so their Process column is 0. I'm using SQL Server 2008.

In my business process, I need to do two steps:

1. Get the new data (Process = 0), do something with it, and insert it into table B.
2. Update Process = 1 in table A.

So, at step 1, I use a WHERE clause to get these 100 records. At step 2, I have to use the same WHERE clause one more time to find and update them.
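For reference, the two-step approach described above can be sketched as follows. This is only an illustration: column1 and column2 are placeholder column names, and the "do something" step is reduced to a plain copy.

```sql
-- Step 1: read the unprocessed rows from A and copy them into B.
INSERT INTO B (column1, column2)
SELECT column1, column2
FROM   A
WHERE  Process = 0;

-- Step 2: locate the same rows again with the same filter and flag them.
UPDATE A
SET    Process = 1
WHERE  Process = 0;
```

Besides filtering the table twice, this form has a subtler problem if the two statements are not run in one transaction with suitable locking: a row that arrives between them would be flagged as processed without ever having been copied to B.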

I think that fetching the data twice will hurt performance, right?

Can someone advise me what I should do in this case so that I need to query only once?

Thank you very much.

3 Solutions

#1

Consider the OUTPUT clause:

UPDATE A
SET    Process = 1
OUTPUT INSERTED.column1,
       INSERTED.column2,
       …
INTO   B (column1, column2, …)
WHERE  Process = 0
;

Note that, according to the manual, the B table cannot:

Have enabled triggers defined on it.

Participate on either side of a FOREIGN KEY constraint.

Have CHECK constraints or enabled rules.

If any of the above applies to table B, you could use a temporary table or a table variable as intermediate storage before finally inserting the data into B:

DECLARE @newdata TABLE (columns);
UPDATE A
SET    Process = 1
OUTPUT INSERTED.column1,
       INSERTED.column2,
       …
INTO   @newdata (column1, column2, …)
WHERE  Process = 0
;
INSERT INTO B (columns)
SELECT columns FROM @newdata
;
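To try the OUTPUT ... INTO pattern end to end, here is a small self-contained sketch using temporary tables. The names #A, #B, Id and Payload are invented for the demonstration and are not from the question, and UPPER() merely stands in for the "do something" step.

```sql
-- Hypothetical stand-ins for tables A and B.
CREATE TABLE #A (
    Id      INT IDENTITY(1,1) PRIMARY KEY,
    Payload VARCHAR(50) NOT NULL,
    Process BIT NOT NULL DEFAULT 0
);
CREATE TABLE #B (
    Id      INT PRIMARY KEY,
    Payload VARCHAR(50) NOT NULL
);

-- Two new, unprocessed rows (Process defaults to 0).
INSERT INTO #A (Payload) VALUES ('first'), ('second');

-- One statement flags the rows and copies them into #B.
UPDATE #A
SET    Process = 1
OUTPUT INSERTED.Id,
       UPPER(INSERTED.Payload)   -- placeholder transformation
INTO   #B (Id, Payload)
WHERE  Process = 0;

-- Inspect the result.
SELECT Id, Payload, Process FROM #A;
SELECT Id, Payload FROM #B;

DROP TABLE #A;
DROP TABLE #B;
```

Because the UPDATE and the copy happen in a single statement, the rows written to #B are exactly the rows that were flagged.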

#2

SQL Server keeps recently read data pages in its buffer cache (it caches pages, not query results as such). So if you have a useful primary key (say, a small, clustered surrogate key), the second query shouldn't be an issue.

If you want to create larger batches (e.g. 10,000 items at once), you could use a temp table to store the primary keys you are handling in a batch. This way, you don't need to pass too many keys in a query.
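A minimal sketch of the temp-table batching idea, assuming table A has an integer primary key named Id (a hypothetical name, as are column1 and column2):

```sql
-- Collect the keys of one batch of unprocessed rows.
CREATE TABLE #batch (Id INT PRIMARY KEY);

INSERT INTO #batch (Id)
SELECT TOP (10000) Id
FROM   A
WHERE  Process = 0
ORDER  BY Id;

-- Copy exactly those rows into B (any processing would go in this SELECT).
INSERT INTO B (column1, column2)
SELECT a.column1, a.column2
FROM   A AS a
JOIN   #batch AS k ON k.Id = a.Id;

-- Flag exactly those rows, located by primary key rather than by Process = 0.
UPDATE a
SET    Process = 1
FROM   A AS a
JOIN   #batch AS k ON k.Id = a.Id;

DROP TABLE #batch;
```

Both follow-up statements join on the stored keys, so rows that arrive while the batch is running are left untouched for the next batch.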

Avoid premature optimization. Identify the performance problem first, if there is one.

#3

You can use a trigger for your problem:

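CREATE TRIGGER TR_A ON  A
   AFTER INSERT
AS BEGIN
    -- INSERTED holds the rows just added to A.
    -- As written, only rows inserted with Process = 1 are copied to B.
    INSERT INTO B (Column1, Column2, ...)
    SELECT I.Column1, I.Column2, ...
    FROM INSERTED I
    WHERE I.Process = 1
END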
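Since the question says new rows arrive with Process = 0, one possible variant is to fire the trigger on UPDATE instead and copy rows at the moment their flag changes from 0 to 1. The sketch below is an assumption about what might fit, not the answerer's code; the trigger name TR_A_Processed, the key column Id, and the two-column list are hypothetical.

```sql
CREATE TRIGGER TR_A_Processed ON A
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- DELETED holds the old row versions, INSERTED the new ones.
    -- Copy only rows whose Process flag just changed from 0 to 1.
    INSERT INTO B (Column1, Column2)
    SELECT i.Column1, i.Column2
    FROM   INSERTED AS i
    JOIN   DELETED  AS d ON d.Id = i.Id
    WHERE  d.Process = 0
      AND  i.Process = 1;
END;
```

With such a trigger in place, a single UPDATE A SET Process = 1 WHERE Process = 0 would both flag the rows and populate B, at the cost of hiding the copy inside the trigger.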
