T-SQL: Is a sub query that restricts an UPDATE atomic with the update?

Time: 2021-09-10 01:27:49

I've got a simple queue implementation in MS SQL Server 2008 R2. Here's the essence of the queue:

CREATE TABLE ToBeProcessed 
(
    Id BIGINT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [Priority] INT DEFAULT(100) NOT NULL,
    IsBeingProcessed BIT default (0) NOT NULL,
    SomeData nvarchar(MAX) NOT null
)

I want to atomically select the top n rows, ordered by priority and then id, where IsBeingProcessed is false, and update those rows to mark them as being processed. I thought I'd use a combination of UPDATE, TOP, OUTPUT and ORDER BY, but unfortunately you can't combine TOP and ORDER BY in an UPDATE statement.

So I've used an IN clause to restrict the update, and its sub query does the ORDER BY (see below). My question is: is this whole statement atomic, or do I need to wrap it in a transaction?

DECLARE @numberToProcess INT = 2

CREATE TABLE #IdsToProcess
(
    Id BIGINT NOT null
)

UPDATE 
    ToBeProcessed
SET
    ToBeProcessed.IsBeingProcessed = 1
OUTPUT 
    INSERTED.Id 
INTO
    #IdsToProcess   
WHERE
    ToBeProcessed.Id IN 
    (
        SELECT TOP(@numberToProcess) 
            ToBeProcessed.Id 
        FROM 
            ToBeProcessed 
        WHERE
            ToBeProcessed.IsBeingProcessed = 0
        -- Priority first, Id as tie-breaker (Id is unique, so sorting by Id first would ignore Priority)
        ORDER BY 
            ToBeProcessed.Priority DESC,
            ToBeProcessed.Id)

SELECT 
    *
FROM 
    #IdsToProcess

DROP TABLE #IdsToProcess

Here's some SQL to insert some dummy rows:

INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');
INSERT INTO ToBeProcessed (SomeData) VALUES (N'');

2 Answers

#1


8  

If I understand the motivation for the question, you want to avoid the possibility that two concurrent transactions could both execute the sub query to get the top N rows to process and then proceed to update the same rows?

In that case I'd use this approach.

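;WITH cte AS
(
    -- UPDLOCK takes update locks while reading, ROWLOCK asks for row-level locks,
    -- and READPAST skips rows that another transaction has already locked.
    SELECT TOP(@numberToProcess) 
        *
    FROM 
        ToBeProcessed WITH(UPDLOCK,ROWLOCK,READPAST) 
    WHERE
        ToBeProcessed.IsBeingProcessed = 0
    -- Priority first, Id as tie-breaker (Id is unique, so sorting by Id first would ignore Priority)
    ORDER BY 
        ToBeProcessed.Priority DESC,
        ToBeProcessed.Id
)
UPDATE 
    cte
SET
    IsBeingProcessed = 1
OUTPUT 
    INSERTED.Id 
INTO
    #IdsToProcess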

I was a bit uncertain earlier whether SQL Server would take U locks when processing your version with the sub query thus blocking two concurrent transactions from reading the same TOP N rows. This does not appear to be the case.

Test Table

CREATE TABLE JobsToProcess
(
priority INT IDENTITY(1,1),
isprocessed BIT ,
number INT
)

INSERT INTO JobsToProcess
SELECT TOP (1000000) 0,0
FROM master..spt_values v1, master..spt_values v2

Test Script (Run in 2 concurrent SSMS sessions)

BEGIN TRY
    DECLARE @FinishedMessage VARBINARY (128) = CAST('TestFinished' AS  VARBINARY (128))
    DECLARE @SynchMessage VARBINARY (128) = CAST('TestSynchronising' AS  VARBINARY (128))
    -- Advertise that this session is ready to start
    SET CONTEXT_INFO @SynchMessage

    DECLARE @OtherSpid int

    -- Spin until the other session has advertised itself too
    WHILE(@OtherSpid IS NULL)
        SELECT @OtherSpid=spid 
        FROM sys.sysprocesses 
        WHERE context_info=@SynchMessage and spid<>@@SPID

    SELECT @OtherSpid

    DECLARE @increment INT = @@spid
    DECLARE @number INT = @increment

    -- Keep claiming the top unprocessed row, adding this session's spid to its number.
    -- The loop ends when the updated number is no longer exactly this spid
    -- (another session had already updated that row) or the other session has finished.
    WHILE (@number = @increment AND NOT EXISTS(SELECT * FROM sys.sysprocesses WHERE context_info=@FinishedMessage))
        UPDATE JobsToProcess 
        SET @number=number +=@increment,isprocessed=1
        WHERE priority = (SELECT TOP 1 priority 
                           FROM JobsToProcess 
                           WHERE isprocessed=0 
                           ORDER BY priority DESC)

    -- Show any rows that were updated more than once
    SELECT * 
    FROM JobsToProcess 
    WHERE number not in (0,@OtherSpid,@@spid)
    SET CONTEXT_INFO @FinishedMessage
END TRY
BEGIN CATCH
    SET CONTEXT_INFO @FinishedMessage
    SELECT ERROR_MESSAGE(), ERROR_NUMBER()
END CATCH

Execution stops almost immediately because both concurrent transactions update the same row. So the S locks taken whilst identifying the TOP 1 priority must get released before the U lock is acquired; the 2 transactions then proceed to get the row U and X locks in sequence.

If a clustered index is added (ALTER TABLE JobsToProcess ADD PRIMARY KEY CLUSTERED (priority)), then deadlock occurs almost immediately instead. In this case the row S lock doesn't get released: one transaction acquires a U lock on the row and waits to convert it to an X lock, while the other transaction is still waiting to convert its S lock to a U lock.

If the query above is changed to use MIN rather than TOP

WHERE priority = (SELECT MIN(priority)
                   FROM JobsToProcess 
                   WHERE isprocessed=0 
                   )

Then SQL Server manages to completely eliminate the sub query from the plan and takes U locks all the way.
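
One way to look at these lock sequences yourself, offered as a rough sketch rather than as part of the test above, is to query the sys.dm_tran_locks DMV from a third session while the two test sessions are running. The filter on the current database and the choice of columns are my own assumptions about what is useful to see here.

```sql
-- Run from a third session, in the same database, while the two test sessions are looping.
-- Lists the locks that other sessions currently hold or are waiting for.
SELECT
    request_session_id,   -- spid of the session holding or requesting the lock
    resource_type,        -- e.g. RID, KEY, PAGE, OBJECT
    request_mode,         -- e.g. S, U, X, IU, IX
    request_status        -- GRANT, WAIT or CONVERT
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
  AND request_session_id <> @@SPID   -- hide this monitoring session's own locks
ORDER BY request_session_id, resource_type;
```

The row locks involved are held only briefly, so a single sample may well miss them; running the query repeatedly is more likely to catch a request sitting in WAIT or CONVERT.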

#2


2  

Every individual T-SQL statement is, according to all my experience and all the documentation I've ever read, supposed to be atomic. What you have there is a single T-SQL statement; ergo, it should be atomic and will not require explicit transaction statements. I've used this precise kind of logic many times and never had a problem with it. I look forward to seeing if anyone has a supportable alternate opinion.

Incidentally, look into the ranking functions, specifically row_number(), for retrieving a set number of items. The syntax is perhaps a tad awkward, but overall they are flexible and powerful tools. (There are about a bazillion Stack Overflow questions and answers that discuss them.)
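
As a rough illustration of the row_number() idea, here is a minimal sketch against the ToBeProcessed table from the question. The CTE name ranked and the column alias rn are made up for this example, the ordering (Priority descending, then Id) is my reading of the question's description, and the locking hints are borrowed from answer #1 rather than being something row_number() itself requires.

```sql
-- Assumes the ToBeProcessed table from the question already exists.
DECLARE @numberToProcess INT = 2;

WITH ranked AS
(
    SELECT
        Id,
        IsBeingProcessed,
        -- Number the unprocessed rows: highest Priority first, Id as tie-breaker (assumed ordering)
        ROW_NUMBER() OVER (ORDER BY [Priority] DESC, Id) AS rn
    FROM
        ToBeProcessed WITH (UPDLOCK, ROWLOCK, READPAST)  -- hints taken from answer #1
    WHERE
        IsBeingProcessed = 0
)
UPDATE
    ranked
SET
    IsBeingProcessed = 1
OUTPUT
    INSERTED.Id          -- returns the claimed ids to the caller
WHERE
    rn <= @numberToProcess;
```

Here OUTPUT returns the ids straight to the client instead of going INTO a temp table, which is a simplification for the sketch; the INTO #IdsToProcess form from the question should work the same way.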
