是否应该选择…FOR UPDATE总是包含ORDER BY?

Let's say we execute...

假设我们执行…

SELECT * FROM MY_TABLE FOR UPDATE

...and there is more than one row in MY_TABLE.

…MY_TABLE中有不止一行。

Theoretically, if two concurrent transactions execute this statement, but it happens to traverse (and therefore lock) the rows in different order, a deadlock may occur. For example:

理论上，如果两个并发事务执行此语句，但它碰巧遍历(并因此锁定)不同顺序的行，可能会出现死锁。例如:

Transaction 1: Locks row A.
事务1:锁定行A。
Transaction 2: Locks row B.
事务2:锁定行B。
Transaction 1: Attempts to lock row B and blocks.
事务1:尝试锁定行B和块。
Transaction 2: Attempts to lock row A and deadlocks.
事务2:尝试锁定第A行和死锁。

The way to resolve this is to use ORDER BY to ensure rows are always locked in the same order.

解决这个问题的方法是使用ORDER，确保始终以相同的顺序锁定行。

So, my question is: will this theoretical deadlock ever occur in practice? I know there are ways to artificially induce it, but could it ever occur in the normal operation? Should we just always use ORDER BY, or it's actually safe to omit it?

所以，我的问题是:这种理论僵局会在实践中出现吗?我知道有一些方法可以人工诱导它，但是在正常的操作中它会发生吗?我们应该一直使用ORDER BY，还是忽略它实际上是安全的?

I'm primarily interested in behavior of Oracle and MySQL/InnoDB, but comments on other DBMSes would be helpful as well.

我主要对Oracle和MySQL/InnoDB的行为感兴趣，但是对其他dbms的评论也会有帮助。

--- EDIT ---

Here is how to reproduce a deadlock under Oracle when locking order is not the same:

当锁的顺序不同时，如何在Oracle中重新生成一个死锁:

Create the test table and fill it with some test data...

创建测试表并用一些测试数据填充它…

CREATE TABLE DEADLOCK_TEST (
    ID INT PRIMARY KEY,
    A INT 
);

INSERT INTO DEADLOCK_TEST SELECT LEVEL, 1 FROM DUAL CONNECT BY LEVEL <= 10000;

COMMIT;

...from one client session (I used SQL Developer), run the following block:

在一个客户端会话(我使用SQL Developer)中，运行以下代码块:

DECLARE
    CURSOR CUR IS 
        SELECT * FROM DEADLOCK_TEST
        WHERE ID BETWEEN 1000 AND 2000 
        ORDER BY ID 
        FOR UPDATE;
BEGIN
    WHILE TRUE LOOP
        FOR LOCKED_ROW IN CUR LOOP
            UPDATE DEADLOCK_TEST 
            SET A = -99999999999999999999 
            WHERE CURRENT OF CUR;
        END LOOP;
        ROLLBACK;
    END LOOP;
END;
/

From a different client session (I simply started one more instance of SQL Developer), run that same block, but with DESC in the ORDER BY. After few seconds, you'll get the:

从另一个客户端会话(我仅仅启动了SQL Developer的另一个实例)运行相同的块，但是DESC是按顺序运行的。几秒钟后，你会得到:

ORA-00060: deadlock detected while waiting for resource

BTW, you'll likely achieve the same result by completely removing the ORDER BY (so both blocks are identical), and adding the...

顺便说一句，通过完全删除ORDER by(因此两个块都是相同的)并添加…

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ = 1;

...in front of one block but...

…在一个街区前面，但是……

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ = 10000;

...in front of the other (so Oracle chooses different execution plans and likely fetches the rows in different order).

…在另一个之前(因此Oracle选择不同的执行计划，并可能以不同的顺序获取行)。

This illustrates that locking is indeed done as rows are fetched from the cursor (and not for the whole result-set at once when the cursor is opened).

这说明锁确实是在从游标获取行时完成的(而不是在打开游标时立即获取整个结果集)。

3 个解决方案

#1

Your example in your question shows that the order of locking depends upon the access method. This access path is not directly decided by the ORDER BY clause of the query, there are many factors that can influence this access path. Therefore, you can't prevent a deadlock just by adding an ORDER BY because you could still have two distinct access paths. In fact by running your test case with the order by and changing the session parameters I was able to cause two session to run into an ORA-60 with the same query.

您的问题中的示例表明，锁定的顺序取决于访问方法。此访问路径不是由查询的ORDER by子句直接决定的，有许多因素可以影响此访问路径。因此，您不能仅仅通过添加ORDER来防止死锁，因为您仍然可以有两个不同的访问路径。实际上，通过使用order by运行您的测试用例并更改会话参数，我能够使用相同的查询将两个会话运行到ORA-60。

If the sessions involved have no other lock pending, locking the rows in the same order in all sessions will prevent deadlocks but how can you reliably force this order? Note that this would only prevent this very special case of deadlock anyway. You could still get deadlocks with multiple queries in each session or different plans.

如果涉及的会话没有其他锁挂起，那么在所有会话中以相同的顺序锁定行将防止死锁，但是如何可靠地强制执行这个顺序呢?注意，这只会防止这种非常特殊的死锁情况。您仍然可以在每个会话或不同的计划中使用多个查询获得死锁。

In practice this case is really special and shouldn't happen often anyway: if you're worried about deadlocks, I still think there are easier methods to prevent them.

在实践中，这种情况是非常特殊的，也不应该经常发生:如果您担心死锁，我仍然认为有更简单的方法来防止死锁。

The easiest way to prevent a deadlock is to use either FOR UPDATE NOWAIT or FOR UPDATE WAIT X (although WAIT X can still trigger a deadlock with values of X superior to the deadlock detection mechanism, currently 3 seconds as of 11g I believe -- thanks @APC for the correction).

预防死锁的最简单方法是使用更新NOWAIT或等待更新X(尽管等待X X仍然可以引发僵局优于死锁检测机制,目前3秒11 g的我相信——感谢@APC修正)。

In other words, both transactions should ask: give me those rows and lock them but if another user already has a lock return an error instead of waiting indefinitely. It is the indefinite waiting that causes deadlocks.

换句话说，两个事务都应该问:给我这些行并锁定它们，但是如果另一个用户已经有一个锁，那么返回一个错误，而不是无限期地等待。正是不确定的等待导致了死锁。

In practice I would say that most applications with real person users would rather receive an error immediately than have a transaction wait indefinitely for another transaction to finish. I would consider FOR UPDATE without NOWAIT only for non-critical batch jobs.

在实践中，我想说，大多数拥有真人用户的应用程序宁愿立即收到错误，也不愿让一个事务无限期地等待另一个事务完成。我将考虑更新，现在不只是非关键批处理作业。

#2

Look at it another way. Barring bizarre implementations, omitting the ORDER BY clauses is almost certain to yield the same order every time, but including them provides an opportunity for 'maintenance' programmers and other well-wishers to fiddle with them, or some of them, so that they have some finite probability of ending up different over the life of the application.

换个角度看。除非奇怪的实现,省略ORDER BY子句几乎肯定会每次都产生相同的顺序,但包括他们提供了一个机会,“维护”程序员和其他民众摆弄它们,或者其中的一些,所以,他们有一些有限的概率最终应用程序的不同的生活。

#3

I think you have misunderstood how FOR UPDATE works. It acquires the locks when the cursor is activated ;that is, when the SELECT is issued.

我认为您误解了更新的工作方式。它在激活游标时获取锁，即在发出SELECT时获取锁。

So, running your query, Transaction 1 will lock the entire table (because you haven't specified a WHERE clause). Transaction 2 will either hang or fail (depending on what you've specified in the WAIT clause) regardless of whether Transaction 1 has issued any DML against the selected set of records. If fact, Transaction 1 doesn't even have to fetch any records; Transaction 2 will hurl ORA-00054 once Transaction 1 has opened the FOR UPDATE cursor.

因此，运行查询，事务1将锁定整个表(因为您没有指定WHERE子句)。事务2将挂起或失败(取决于您在WAIT子句中指定的内容)，而不管事务1是否针对所选的一组记录发出任何DML。事实上，事务1甚至不需要获取任何记录;事务2将在事务1打开FOR UPDATE游标后抛出ORA-00054。

The deadlock scenario you describe is the classic outcome of an application which uses optimistic locking (i.e. assumes it will be able to acquire a lock when it needs to). The whole point of FOR UPDATE is that it is a pessimistic locking strategy: grab all the locks potentially required now in order to guarantee successful processing in the future.

您所描述的死锁场景是使用乐观锁的应用程序的典型结果(即假定在需要锁时能够获得锁)。更新的全部意义在于它是一种悲观的锁定策略:获取当前可能需要的所有锁，以保证未来的成功处理。

The inestimable Mr Kyte provides the crucial insight in his blog:

难以估量的Kyte先生在他的博客中提供了关键的见解:

"deadlock detection trumps a waiting period"

“死锁检测胜过等待期”

In my code I was using NOWAIT in the FOR UPDATE clause of the cursor used in the second session:

在我的代码中，我在第二个会话中使用的光标的FOR UPDATE子句中使用了NOWAIT:

cursor c10000 is
     select * from order_lines
     where header_id = 1234
     for update;

cursor c1 is
     select * from order_lines
     where header_id = 1234
     and line_id = 9999
     for update nowait;

Consequently Session 2 fails immediately and hurls ORA-00054.

因此，第2节立即失败，并投掷了ORA-00054。

However the OP doesn't specify anything, in which case the second session will wait indefinitely for the row to be released. Except that it doesn't, because after a while deadlock detection kicks in and terminates the command with extreme prejudice i.e. ORA-00060. If they had specified a short wait period - say WAIT 1 - they would have seen ORA-30006: resource busy.

但是，OP没有指定任何内容，在这种情况下，第二个会话将无限期地等待行被释放。但它没有，因为过了一段时间，死锁检测启动并终止命令，带有极端的偏见，比如ORA-00060。如果他们指定了一个短暂的等待期——比如等待1——他们就会看到ORA-30006:资源繁忙。

Note that this happens regardless of whether we use the verbose syntax...

注意，不管我们是否使用详细语法，都会发生这种情况。

open c10000;
loop
    fetch c10000 into r;

or the snazzier....

或多达....

for r in c10000 loop

And it really doesn't matter whether Session 1 has fetched the row of interest when Session 2 starts.

当会话2开始时会话1是否获取了感兴趣的行真的不重要。

tl;dr

博士tl;

So the key thing is, ORDER BY doesn't solve anything. The first session to issue FOR UPDATE grabs all the records in the result set. Any subsequent session attempting to update any of those records will fail with either ORA-00054, ORA-30006 or ORA-00060, depending on whether they specified NOWAIT, WAIT n or nothing.... unless the first session releases the locks before the WAIT period times out or deadlock detection kicks in.

关键是，ORDER BY不能解决任何问题。第一次会议的问题更新抓取的所有记录的结果集。任何后续会话试图更新这些记录将失败与ora - 00054,ora - 30006和ora - 00060,取决于他们是否指定NOWAIT等n或无....除非第一个会话在等待期超时或死锁检测开始之前释放锁。

Here is a worked example. I am using an autonmous transaction to simulate a second session. The effect is the same but the output is easier to read.

这里有一个工作的例子。我正在使用一个自动事务来模拟第二个会话。效果是一样的，但是输出更容易阅读。

declare
    cursor c1 is
        select * from emp
        where deptno = 10
        for update;
    procedure s2 
    is
        cursor c2 is
            select * from emp
            where empno = 7934 -- one of the employees in dept 10
            for update
            -- for update nowait
            -- for update wait 1
            ;
        x_deadlock exception;
        pragma exception_init( x_deadlock, -60);
        x_row_is_locked exception;
        pragma exception_init( x_row_is_locked, -54);
        x_wait_timeout exception;
        pragma exception_init( x_wait_timeout, -30006);
        pragma autonomous_transaction;
    begin
        dbms_output.put_line('session 2 start');
        for r2 in c2 loop
            dbms_output.put_line('session 2 got '||r2.empno);
            update emp
            set sal = sal * 1.1
            where current of c2;
            dbms_output.put_line('session 2 update='||sql%rowcount);
        end loop;    
        rollback;
     exception
        when x_deadlock then
            dbms_output.put_line('session 2: deadlock exception');
        when x_row_is_locked then
           dbms_output.put_line('session 2: nowait exception');
        when x_wait_timeout then
            dbms_output.put_line('session 2: wait timeout exception');       
    end s2;
begin
    for r1 in c1 loop
        dbms_output.put_line('session 1 got '||r1.empno);
        s2;
    end loop;
end;
/

In this version I have specified a straightfor update in the second session. This is the configuration the OP uses and as can be seen from the output hurls because a deadlock has been detected:

在这个版本中，我在第二个会话中指定了一个简单的for更新。这是OP使用的配置，可以从输出抛出中看到，因为检测到死锁:

session 1 got 7782                                                              
session 2 start                                                                 
session 2: deadlock exception                                                   
session 1 got 7839                                                              
session 2 start                                                                 
session 2: deadlock exception                                                   
session 1 got 7934                                                              
session 2 start                                                                 
session 2: deadlock exception                                                   

PL/SQL procedure successfully completed.

What this clearly demonstrates is

这清楚地表明了什么

The first session has locked the whole result set from the go-get, because the second session never gets a lock on that one row, even when the first session has not yet retrieved it.
第一个会话锁住了go-get的整个结果集，因为第二个会话永远不会锁住这一行，即使第一个会话还没有检索到它。
The Deadlock detected exception is hurled even though the second session has not been able to update anything. 1. The Deadlock detected exception is hurled even though the first session does not update any of the fetched wows.
即使第二会话无法更新任何内容，仍然会抛出已检测到的死锁异常。1。即使第一个会话不更新任何获取的异常，仍然会抛出已检测到的死锁异常。

The code is easily modifiable to demonstrate the different behaviours of the FOR UPDATE variants.

代码很容易修改，以演示用于更新变体的不同行为。

#1