How can I find out why my threads are being stopped in ASP.NET?

Our logs are reporting ThreadAbortExceptions that are stopping our Quartz.NET jobs at seemingly random intervals. From what I understand, this wouldn't normally be caused by something the thread itself is doing (e.g. reading a file from an FTP server, or executing a LINQ to Entities query), but rather because some outside process is telling the thread to stop. Furthermore, the way the logs are being created leads me to believe that the entire web application is being restarted when we get these errors, so I'm guessing that the restart process is what's causing the thread to be aborted in the first place.

So my question is: how can I figure out why the server/application is being restarted? Are there logs somewhere that would give me details on each restart? Are there common causes for something like this that I should investigate?

Thanks in advance for your help.

Edit

I just had a discussion with some co-workers, and it sounds like IIS automatically shuts the application down after a certain period of inactivity, which might be part of the problem. With some research, I've found an "Idle Timeout" setting for the application pool's worker process in IIS. I think that when the application hasn't processed any requests for a certain amount of time, IIS issues a shutdown command. For some reason Quartz doesn't shut down immediately; instead it waits for the next job to fire, and then the system detects that job's thread and kills it while it's trying to run.

So I guess we need to come up with some way to gracefully finish any running jobs when the system wants to shut down, and to make Quartz actually shut down when it's told to if it's not running any jobs. Does anybody have any experience with this sort of issue?

3 Answers

#1

As liho1eye pointed out, the problem arose from the application pool shutting down our application. For some reason, Quartz apparently wasn't shutting down immediately. Instead, it waited until the next job ran and shut down then, which meant that the running job was terminated with a ThreadAbortException.

Our solution to this was two-fold. First, we updated Quartz to a more recent version, which seemed to make it behave a little better. Second, in our Application_End method in Global.asax.cs, we added a call to Scheduler.Shutdown(true). This tells the scheduler to stop firing additional triggers, and then it waits until all the currently-running triggers complete before allowing the application to end.
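
A minimal sketch of that second step, assuming the synchronous Quartz.NET 1.x/2.x API (in Quartz.NET 3.x, GetScheduler, Start and Shutdown return Tasks instead). The static scheduler field and the way it is created in Application_Start are illustrative assumptions, not our actual code:

```csharp
using System;
using System.Web;
using Quartz;
using Quartz.Impl;

public class Global : HttpApplication
{
    // Assumption: one scheduler shared by the whole application.
    private static IScheduler scheduler;

    protected void Application_Start(object sender, EventArgs e)
    {
        ISchedulerFactory factory = new StdSchedulerFactory();
        scheduler = factory.GetScheduler();
        scheduler.Start();
    }

    protected void Application_End(object sender, EventArgs e)
    {
        if (scheduler != null && !scheduler.IsShutdown)
        {
            // true = stop firing new triggers and block until
            // currently executing jobs have completed.
            scheduler.Shutdown(true);
        }
    }
}
```

IIS only waits a limited time for an application to shut down, so a job that runs longer than that window can presumably still be aborted even with this in place.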

#2

Naturally this means that something somewhere called Thread.Abort() on the instance of your job thread. I would look towards this Quartz thing for explanation.

Another possibility is that your job thread is a background thread and your app pool is being recycled, but I don't know enough about Quartz to say for sure.
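
One way to check the recycling theory is to log why ASP.NET is unloading the application. The sketch below reads HostingEnvironment.ShutdownReason in Application_End; writing through System.Diagnostics.Trace is an assumption here, so substitute whatever logger the application already uses:

```csharp
using System;
using System.Diagnostics;
using System.Web;
using System.Web.Hosting;

public class Global : HttpApplication
{
    protected void Application_End(object sender, EventArgs e)
    {
        // ApplicationShutdownReason includes values such as
        // IdleTimeout and ConfigurationChange.
        ApplicationShutdownReason reason = HostingEnvironment.ShutdownReason;

        // Assumption: trace output is routed somewhere persistent.
        Trace.TraceWarning("Application ending. Shutdown reason: {0}", reason);
    }
}
```

If the logged reason lines up with the timestamps of the ThreadAbortExceptions, the aborts are most likely a side effect of the application being unloaded rather than of anything the job itself does. Note that this handler will not run if the worker process crashes outright.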

#3

If you perform any redirects in your code without passing false for the endResponse parameter of Response.Redirect, the redirect will call Thread.Abort() while there is still code left to execute. That code is orphaned because the thread is gone, and you get the exception. Further reading:

http://www.c6software.com/articles/ThreadAbortException.aspx
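
A minimal sketch of a redirect that avoids the abort, for illustration only. The page class and the target URL are hypothetical placeholders:

```csharp
using System;
using System.Web.UI;

// Hypothetical page used only to illustrate the pattern.
public class ExamplePage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Passing false for endResponse skips Response.End(),
        // so no ThreadAbortException is raised.
        Response.Redirect("~/Target.aspx", false);

        // Ask ASP.NET to skip the remaining pipeline events
        // and go straight to EndRequest.
        Context.ApplicationInstance.CompleteRequest();

        // Code after this point in the method still runs,
        // so return early if nothing else should happen.
    }
}
```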

Edit:
Another possibility would be an unhandled server-level exception that causes the w3wp.exe process to crash or recycle itself. This would be the external cause you alluded to, one that aborts the thread while it is still trying to run code. To determine whether this is the case, check the System event log for errors. The entries are very generic, but they clearly list w3wp.exe, so you can use that as a filter. If this turns out to be the cause, you'll need to install IIS Debug Diagnostics (DebugDiag) and set up some crash monitors to capture what is going on at the moment of the crash. Since the crash happens outside the page lifecycle, normal exception handling is bypassed.
