What are common reasons for deadlocks?

Time: 2020-12-22 15:52:26

Deadlocks are hard to find and very unpleasant to remove.

How can I find error sources for deadlocks in my code? Are there any "deadlock patterns"?


My particular case involves databases, but this question is open to every kind of deadlock.

12 answers

#1


Update: This recent MSDN article, Tools And Techniques to Identify Concurrency Issues, might also be of interest



In the MSDN article Deadlock monitor, Stephen Toub lists the following four conditions that are necessary for a deadlock to occur:

  • A limited number of a particular resource. In the case of a monitor in C# (what you use when you employ the lock keyword), this limited number is one, since a monitor is a mutual-exclusion lock (meaning only one thread can own a monitor at a time).


  • The ability to hold one resource and request another. In C#, this is akin to locking on one object and then locking on another before releasing the first lock, for example:



    lock(a)
    {
        ...
        lock(b)   // requested while the lock on a is still held
        {
            ...
        }
    }

  • No preemption capability. In C#, this means that one thread can't force another thread to release a lock.


  • A circular wait condition. This means that there is a cycle of threads, each of which is waiting for the next to release a resource before it can continue.


He goes on to explain that the way to avoid deadlocks is to avoid (or thwart) condition four.

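
Condition four can be pictured as a cycle in a "waits for" graph. The following is only a minimal Python sketch of that idea, not anything taken from the article: the function name `find_wait_cycle` and the thread names are hypothetical, and it assumes each blocked thread waits for exactly one other thread.

```python
def find_wait_cycle(waits_for):
    """Return the threads that form a circular wait, or None.

    waits_for maps a blocked thread's name to the name of the thread
    holding the resource it wants (illustrative names only).
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain of "is waiting for" edges from this thread.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The chain came back to a thread already on it: a cycle.
            return seen[seen.index(current):]
    return None


# A waits for B and B waits for A: condition four holds.
cycle = find_wait_cycle({"A": "B", "B": "A"})
```

If no chain ever returns to a thread it has already visited, the function returns None, which means condition four is not met and the blocked threads will eventually make progress.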

Joe Duffy discusses several techniques for avoiding and detecting deadlocks, including one known as lock leveling. In lock leveling, locks are assigned numerical values, and a thread may only acquire locks that have higher numbers than the locks it already holds. This prevents the possibility of a cycle. It is also frequently difficult to do well in a typical software application today, and a failure to follow lock leveling on every lock acquisition invites deadlock.
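
As a rough illustration of lock leveling, here is a small Python sketch. It is an assumption-laden toy rather than Joe Duffy's implementation: the class name `LeveledLock` and its behavior (raising `RuntimeError` on a violation) are invented for this example.

```python
import threading


class LeveledLock:
    """A lock with a numeric level; each thread must acquire upward only."""

    _held = threading.local()  # per-thread list of levels currently held

    def __init__(self, level):
        self.level = level
        self._lock = threading.Lock()

    def __enter__(self):
        stack = getattr(LeveledLock._held, "stack", None)
        if stack is None:
            stack = LeveledLock._held.stack = []
        # Refuse any acquisition that is not strictly above what is held.
        if stack and max(stack) >= self.level:
            raise RuntimeError(
                f"lock level {self.level} requested while holding {max(stack)}"
            )
        self._lock.acquire()
        stack.append(self.level)
        return self

    def __exit__(self, exc_type, exc, tb):
        LeveledLock._held.stack.remove(self.level)
        self._lock.release()
        return False  # never swallow exceptions from the with-body
```

Taking a level-1 lock and then a level-2 lock works; trying it the other way round fails immediately with an exception instead of deadlocking at some later, unlucky moment.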

#2


The classic deadlock scenario: A holds lock X and wants to acquire lock Y, while B holds lock Y and wants to acquire lock X. Since neither can complete what it is trying to do, both end up waiting forever (unless timeouts are used).

In this case a deadlock can be avoided if A and B acquire the locks in the same order.

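
One way to get "the same order" without relying on every caller's discipline is to sort the locks before taking them. The helper below is a Python sketch under that assumption; `acquire_in_order` is a made-up name, and ordering by `id()` is just one convenient global order. It must not be given the same lock twice.

```python
import threading
from contextlib import contextmanager


@contextmanager
def acquire_in_order(*locks):
    """Acquire all locks in one global order (by id), release in reverse."""
    ordered = sorted(locks, key=id)
    acquired = []
    try:
        for lock in ordered:
            lock.acquire()
            acquired.append(lock)
        yield
    finally:
        # Release only what was actually acquired, newest first.
        for lock in reversed(acquired):
            lock.release()
```

With this helper, A can write `with acquire_in_order(x, y):` and B can write `with acquire_in_order(y, x):`; both end up taking the two locks in the same underlying order, so the X/Y scenario above cannot form.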

#3


I know of no deadlock patterns (after 12 years of writing heavily multithreaded trading applications), but the TimedLock class has been a great help in finding deadlocks in existing code without massive rework.

http://www.randomtree.org/eric/techblog/archives/2004/10/multithreading_is_hard.html

Basically, in .NET/C# you search and replace all your "lock(xxx)" statements with "using (TimedLock.Lock(xxx))".

If a deadlock is ever detected (the lock cannot be obtained within the specified timeout, which defaults to 10 seconds), an exception is thrown. My local version also immediately logs the stack trace. Walk up the stack trace (preferably from a debug build with line numbers) and you'll immediately see which locks were held at the point of failure, and which one the thread was attempting to get.

In .NET 1.1, in a deadlock situation as described, as luck would have it, all the threads that were blocked would throw the exception at the same time. So you'd get 2+ stack traces, and all the information necessary to fix the problem. (2.0+ may have changed the threading model internally enough that you are not this lucky; I'm not sure.)
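
The same idea can be sketched outside .NET. The snippet below is a Python analogue, not the TimedLock class from the linked post: `timed_lock` and `LockTimeoutError` are hypothetical names, and the 10 second default simply mirrors the timeout mentioned above.

```python
import threading
import traceback
from contextlib import contextmanager


class LockTimeoutError(RuntimeError):
    """Raised when a lock could not be obtained within the timeout."""


@contextmanager
def timed_lock(lock, timeout=10.0):
    # Lock.acquire() returns False if the timeout expires first.
    if not lock.acquire(timeout=timeout):
        # Capture where this thread was stuck; the stack shows which
        # lock it wanted and, via the callers, which ones it held.
        stack = "".join(traceback.format_stack())
        raise LockTimeoutError(
            f"possible deadlock: waited {timeout}s for a lock\n{stack}"
        )
    try:
        yield
    finally:
        lock.release()
```

Usage mirrors the search-and-replace described above: `with lock:` becomes `with timed_lock(lock):`. A timeout does not prove a deadlock (the lock may just be held for a long time), so the exception is best treated as a strong hint.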

#4


Making sure all transactions affect tables in the same order is the key to avoiding the most common of deadlocks.


For example:

Transaction A

BEGIN TRANSACTION;
UPDATE TableA SET Foo = 'Bar';
UPDATE TableB SET Bar = 'Foo';
COMMIT;

Transaction B

BEGIN TRANSACTION;
UPDATE TableB SET Bar = 'Foo';
UPDATE TableA SET Foo = 'Bar';
COMMIT;

This is extremely likely to result in a deadlock: Transaction A gets a lock on Table A and Transaction B gets a lock on Table B, so neither can get the lock for its second command until the other has finished.

All other forms of deadlock are generally caused by high-intensity use, with SQL Server deadlocking internally while allocating resources.

#5


Yes - deadlocks occur when processes try to acquire resources in random order. If all your processes try to acquire the same resources in the same order, the possibilities for deadlocks are greatly reduced, if not eliminated.


Of course, this is not always easy to arrange...


#6


I recommend reading this article by Herb Sutter. It explains the reasons behind deadlocking issues and puts forward a framework for tackling them.

#7


The typical scenario is mismatched update plans (tables not always updated in the same order). However, it is not unusual to see deadlocks under high processing volume.

I tend to accept deadlocks as a fact of life: one will happen sooner or later, so I have my DAL prepared to handle and retry a deadlocked operation.
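
A retry layer of this kind might look roughly like the Python sketch below. Everything in it is an assumption for illustration: `DeadlockError` stands in for whatever "chosen as deadlock victim" error your database driver raises, and `run_with_retry`, the attempt count, and the delay are arbitrary choices.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock-victim error."""


def run_with_retry(operation, attempts=3, base_delay=0.05):
    """Call operation(); if it is rolled back as a deadlock victim, retry."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except DeadlockError:
            if attempt == attempts:
                raise  # give up and let the caller see the failure
            # Back off with jitter so the competing transactions do not
            # collide again in lockstep.
            time.sleep(base_delay * attempt * random.random())
```

The operation passed in should be the whole transaction, since the database rolls back the victim's work and a retry has to start from the beginning.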

#8


The most common (according to my unscientific observations) DB deadlock scenario is very simple:


  • Two processes read something (a DB record, for example), and both acquire a shared lock on the associated resource (usually a DB page).

  • Both then try to make an update and attempt to upgrade their locks to exclusive ones. Each upgrade has to wait for the other process to release its shared lock: voila, deadlock.

This can be avoided by specifying the "FOR UPDATE" clause (or similar, depending on your particular RDBMS) if the read is to be followed by an update. This way the process gets the exclusive lock from the start, making the above scenario impossible.


#9


A deadlock is a condition that occurs when two processes are each waiting for the other to complete before proceeding. The result is that both processes hang. It occurs most commonly in multitasking and client/server environments.

#10


Deadlock occurs mainly when multiple dependent locks exist: one thread locks the mutexes in one order, and another thread tries to lock them in the reverse order. Pay attention to how mutexes are used in order to avoid deadlocks.

Be sure to release the lock after completing the operation. If you have multiple locks and the access order is A, B, C, the release order should also be A, B, C.

#11


In my last project I faced a problem with deadlocks in a SQL Server database. The difficulty in finding the cause was that my software and a third-party product were using the same database and working on the same tables. It was very hard to find out what was causing the deadlocks. I ended up writing a SQL query to find out which processes and which SQL statements were causing the deadlocks. You can find that statement here: Deadlocks on SQL-Server

#12


To avoid deadlock there is an algorithm called the Banker's algorithm.

This one also provides helpful information on avoiding deadlock.
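
For readers who have not met the Banker's algorithm, its core is a safety check: a request is granted only if the resulting state still lets every process finish in some order. The Python sketch below shows only that check, with a hypothetical function name and the usual textbook inputs (free units, units held per process, units each process may still request).

```python
def is_safe_state(available, allocation, need):
    """Return True if some order lets every process run to completion.

    available:  free units per resource type
    allocation: allocation[i][j] = units of resource j held by process i
    need:       need[i][j] = units of resource j process i may still request
    """
    work = list(available)
    finished = [False] * len(allocation)
    progressed = True
    while progressed:
        progressed = False
        for i, done in enumerate(finished):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Process i can finish and hand back everything it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                progressed = True
    return all(finished)
```

The check needs each process's maximum demand up front, which is why the algorithm is more common in textbooks than in application code or databases.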
