Windows: Under what circumstances does SetEvent() not return immediately?

Time: 2021-02-24 22:12:55

I have a thread whose function, on exiting its loop (the exit is triggered by an event), does some cleanup and then sets a different event to let a master thread know that it is done.

However, under some circumstances, SetEvent() seems not to return after it sets the thread's 'I'm done' event.

This thread is part of a DLL and the problem seems to occur after the DLL has been loaded/attached, the thread started, the thread ended and the DLL detached/unloaded a number of times without the application shutting down in between. The number of times this sequence has to be repeated before this problem happens is variable.

In case you are skeptical that I know what I'm talking about, I have determined what's happening by bracketing the SetEvent() call with calls to OutputDebugString(). The output before SetEvent() appears. Then, the waiting thread produces output that indicates that the Event has been set.

However, the second call to OutputDebugString() in the exiting thread (the one AFTER SetEvent() ) never occurs, or at least its string never shows up. If this happens, the application crashes a few moments later.

(Note that the calls to OutputDebugString() were added after the problem started occurring, so it's unlikely to be hanging there, rather than in SetEvent().)

I'm not entirely sure what causes the crash, but it occurs in the same thread in which SetEvent() didn't return immediately (I've been tracking/outputting the thread IDs). I suppose it's possible that SetEvent() is finally returning, by which point the context to which it is returning is gone/invalid, but what could cause such a delay?

It turns out that I've been blinded by looking at this code for so long, and it didn't even occur to me to check the return code. I'm done looking at it for today, so I'll know what it's returning (if it's returning) on Monday and I'll edit this question with that info then.

Update: I changed the (master) code to wait for the thread to exit rather than for it to set the event, and removed the SetEvent() call from the slave thread. This changed the nature of the bug: now, instead of failing to return from SetEvent(), it doesn't exit the thread at all and the whole thing hangs.

This indicates that the problem is not with SetEvent(), but something deeper. No idea what, yet, but it's good not to be chasing down that blind alley.

Update (Feb 13/09):
It turned out that the problem was deeper than I thought when I asked this question. jdigital (and probably others) has pretty much nailed the underlying problem: we were trying to unload a thread as part of the process of detaching a DLL.

This, as I didn't realize at the time, but have since found out through research here and elsewhere (Raymond Chen's blog, for example), is a Very Bad Thing.

The problem was that, because of the way it was coded and the way it was behaving, it was not obvious that this was the underlying problem: it was camouflaged as all sorts of other Bad Behaviours that I had to wade through.

Some of the suggestions here helped me do that, so I'm grateful to everyone who contributed. Thank you!
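
The fix implied by the update above is to stop the worker thread before the DLL is detached, rather than as part of detaching it. A minimal, portable C++ sketch of that shape follows. It is not the original poster's code: the names `Worker`, `Start` and `Stop` are hypothetical, and `std::atomic` and `std::thread::join` stand in for the Win32 terminate event and the wait on the thread handle.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical wrapper around the worker thread; all names are illustrative.
class Worker {
public:
    // Starts the background loop. Intended to be called from an explicit
    // initialisation function exported by the library.
    void Start() {
        assert(!thread_.joinable());     // starting twice without Stop() is a bug
        stop_.store(false);
        thread_ = std::thread([this] { Run(); });
    }

    // Asks the loop to exit, then blocks until the thread has really finished.
    // Intended to be called by the host before it unloads the library.
    void Stop() {
        stop_.store(true);               // stands in for SetEvent(TerminateEvent)
        if (thread_.joinable())
            thread_.join();              // stands in for waiting on the thread handle
    }

    bool running() const { return thread_.joinable(); }
    int iterations() const { return iterations_.load(); }

private:
    void Run() {
        while (!stop_.load()) {
            ++iterations_;               // stand-in for the real work
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // Cleanup goes here. No separate "I'm done" event is needed,
        // because join() only returns once this function has returned.
    }

    std::atomic<bool> stop_{false};
    std::atomic<int> iterations_{0};
    std::thread thread_;
};
```

The host would call `Start()` after loading the library and `Stop()` before unloading it, so that no thread is still executing the library's code when it is detached. `Stop()` must also be called before a `Worker` is destroyed, since destroying a joinable `std::thread` terminates the program.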

4 Answers

#1


1  

Who is unloading the DLL and at what time is the unload done? I am wondering if there is a timing issue here where the DLL is unloaded before the thread has run to completion.

#2


2  

Are you dereferencing a HANDLE * to pass to SetEvent? It's more likely that the event handle reference is invalid and the crash is an access violation (i.e., accessing garbage).

#3


0  

You might want to use WinDbg to catch the crash and examine the stack.

#4


0  

Why do you need to set an event in the slave thread to signal to the master thread that the thread is done? Just exit the thread; the calling master thread should wait for the worker thread to exit. For example:

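#include <windows.h>
#include <process.h>   /* _beginthreadex */
#include <stdio.h>

/* Worker ("slave"): loops until the terminate event is signalled. */
static unsigned __stdcall Slave(void *threadParam)
{
    HANDLE terminateEvent = (HANDLE)threadParam;
    /* 100 ms stands in for SOME__SHORT_TIME_OUT (value is an assumption). */
    while (WaitForSingleObject(terminateEvent, 100) == WAIT_TIMEOUT)
    {
        /* ... do some work ... */
    }
    return 0;   /* returning ends the thread and signals its handle */
}

/* Master: starts the worker, asks it to stop, then waits for it to exit. */
int main(void)
{
    /* Manual-reset event, initially unsignalled. */
    HANDLE terminateEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
    if (terminateEvent == NULL)
        return 1;

    HANDLE threadHandle = (HANDLE)_beginthreadex(NULL, 0, Slave, terminateEvent, 0, NULL);
    if (threadHandle == NULL) { CloseHandle(terminateEvent); return 1; }

    /* ... do some work ... */

    if (!SetEvent(terminateEvent))
        fprintf(stderr, "SetEvent failed: %lu\n", GetLastError());

    /* 5000 ms stands in for SOME_TIME_OUT (value is an assumption). */
    if (WaitForSingleObject(threadHandle, 5000) != WAIT_OBJECT_0)
        fprintf(stderr, "worker did not exit in time\n");

    CloseHandle(threadHandle);
    CloseHandle(terminateEvent);
    return 0;
}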

There are lots of error conditions and states to check for but this is the essence of how I normally do it.

If you can get hold of it, get this book. It changed my life with respect to Windows development when I first read it many, many years ago.

Advanced Windows: The Developer's Guide to the Win32 API for Windows NT 3.5 and Windows 95, by Jeffrey Richter
