
Time: 2021-02-24 22:12:55

I have a thread that, when its function exits its loop (the exit is triggered by an event), does some cleanup and then sets a different event to let a master thread know that it is done.


However, under some circumstances, SetEvent() seems not to return after it sets the thread's 'I'm done' event.


This thread is part of a DLL and the problem seems to occur after the DLL has been loaded/attached, the thread started, the thread ended and the DLL detached/unloaded a number of times without the application shutting down in between. The number of times this sequence has to be repeated before this problem happens is variable.


In case you are skeptical that I know what I'm talking about, I have determined what's happening by bracketing the SetEvent() call with calls to OutputDebugString(). The output before SetEvent() appears. Then, the waiting thread produces output that indicates that the Event has been set.


However, the second call to OutputDebugString() in the exiting thread (the one AFTER SetEvent() ) never occurs, or at least its string never shows up. If this happens, the application crashes a few moments later.


(Note that the calls to OutputDebugString() were added after the problem started occurring, so it's unlikely to be hanging there, rather than in SetEvent().)


I'm not entirely sure what causes the crash, but it occurs in the same thread in which SetEvent() didn't return immediately (I've been tracking/outputting the thread IDs). I suppose it's possible that SetEvent() is finally returning, by which point the context to which it is returning is gone/invalid, but what could cause such a delay?


It turns out that I've been blinded by looking at this code for so long, and it didn't even occur to me to check the return code. I'm done looking at it for today, so I'll know what it's returning (if it's returning) on Monday and I'll edit this question with that info then.


Update: I changed the (master) code to wait for the thread to exit rather than for it to set the event, and removed the SetEvent() call from the slave thread. This changed the nature of the bug: now, instead of failing to return from SetEvent(), it doesn't exit the thread at all and the whole thing hangs.


This indicates that the problem is not with SetEvent(), but something deeper. No idea what, yet, but it's good not to be chasing down that blind alley.


Update (Feb 13/09):
It turned out that the problem was deeper than I thought when I asked this question. jdigital (and probably others) has pretty much nailed the underlying problem: we were trying to unload a thread as part of the process of detaching a DLL.


This, as I didn't realize at the time, but have since found out through research here and elsewhere (Raymond Chen's blog, for example), is a Very Bad Thing.


The problem was that, because of the way it was coded and the way it was behaving, it was not obvious that this was the underlying problem: it was camouflaged as all sorts of other Bad Behaviours that I had to wade through.


Some of the suggestions here helped me do that, so I'm grateful to everyone who contributed. Thank you!


4 answers



Who is unloading the DLL and at what time is the unload done? I am wondering if there is a timing issue here where the DLL is unloaded before the thread has run to completion.




Are you dereferencing a HANDLE * to pass to SetEvent? It's more likely that the event handle reference is invalid and the crash is an access violation (i.e., accessing garbage).




You might want to use WinDbg to catch the crash and examine the stack.




Why do you need to set an event in the slave thread to signal to the master thread that the thread is done? Just exit the thread; the calling master thread should wait for the worker thread to exit. Example pseudo code:


   // Master
   TerminateEvent = CreateEvent ( ... ) ;
   ThreadHandle = BeginThread ( Slave, (LPVOID) TerminateEvent ) ;
   Do some work
   SetEvent ( TerminateEvent ) ;                          // ask the slave to stop
   WaitForSingleObject ( ThreadHandle, SOME_TIME_OUT ) ;  // wait for the slave thread itself to exit
   CloseHandle ( TerminateEvent ) ;
   CloseHandle ( ThreadHandle ) ;

Slave ( LPVOID ThreadParam )
   TerminateEvent = (HANDLE) ThreadParam ;
   // Keep working until the master sets the event
   while ( WaitForSingleObject ( TerminateEvent, SOME_SHORT_TIME_OUT ) == WAIT_TIMEOUT )
      Do some work

There are lots of error conditions and states to check for but this is the essence of how I normally do it.


If you can get hold of it, get this book; it changed my life with respect to Windows development when I first read it many, many years ago.


Advanced Windows: The Developer's Guide to the Win32 API for Windows NT 3.5 and Windows 95, by Jeffrey Richter



