在使用App Service的过程中,发现应用频繁出现503错误,通过Kudu站点获取到Logfiles。
在 Eventlog.xml 文件中,发现大量的 Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener 异常,并且引起 w3wp.exe 进程终止。
<Event>
<System>
<Provider Name=".NET Runtime"/>
<EventID>1026</EventID>
<Level>1</Level>
<Task>0</Task>
<Keywords>Keywords</Keywords>
<TimeCreated SystemTime="2022-09-22T09:52:50Z"/>
<EventRecordID>421882843</EventRecordID>
<Channel>Application</Channel>
<Computer>RD0003FF0C13B3</Computer>
<Security/>
</System>
<EventData>
<Data>Application: w3wp.exe Framework Version: v4.0.30319 Description: The process was terminated due to an unhandled exception. Exception Info: System.Configuration.ConfigurationErrorsException at System.Diagnostics.TraceUtils.GetRuntimeObject(System.String, System.Type, System.String) at System.Diagnostics.TypedElement.BaseGetRuntimeObject() at System.Diagnostics.ListenerElement.GetRuntimeObject() at System.Diagnostics.ListenerElementsCollection.GetRuntimeObject() at System.Diagnostics.TraceInternal.get_Listeners() at System.Diagnostics.TraceInternal.WriteLine(System.String) at System.Diagnostics.Debug.WriteLine(System.String) at Microsoft.Web.Compilation.Snapshots.SnapshotHelper.TakeSnapshotTimerCallback(System.Object) at System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object) at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at System.Threading.TimerQueueTimer.CallCallback() at System.Threading.TimerQueueTimer.Fire() at System.Threading.TimerQueue.FireNextTimers() at System.Threading.TimerQueue.AppDomainTimerCallback(Int32) </Data>
</EventData>
</Event>
<Event>
<System>
<Provider Name="ASP.NET 4.0.30319.0"/>
<EventID>1325</EventID>
<Level>1</Level>
<Task>0</Task>
<Keywords>Keywords</Keywords>
<TimeCreated SystemTime="2022-09-22T10:02:53Z"/>
<EventRecordID>422485625</EventRecordID>
<Channel>Application</Channel>
<Computer>RD0003FF0C13B3</Computer>
<Security/>
</System>
<EventData>
<Data>An unhandled exception occurred and the process was terminated. Application ID: /LM/W3SVC/152666399/ROOT Process ID: 6092 Exception: System.Configuration.ConfigurationErrorsException Message: Couldn't find type for class Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=2.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35. StackTrace: at System.Diagnostics.TraceUtils.GetRuntimeObject(String className, Type baseType, String initializeData) at System.Diagnostics.TypedElement.BaseGetRuntimeObject() at System.Diagnostics.ListenerElement.GetRuntimeObject() at System.Diagnostics.ListenerElementsCollection.GetRuntimeObject() at System.Diagnostics.TraceInternal.get_Listeners() at System.Diagnostics.TraceInternal.WriteLine(String message) at System.Diagnostics.Debug.WriteLine(String message) at Microsoft.Web.Compilation.Snapshots.SnapshotHelper.TakeSnapshotTimerCallback(Object stateInfo) at System.Threading.TimerQueueTimer.CallCallbackInContext(Object state) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.TimerQueueTimer.CallCallback() at System.Threading.TimerQueueTimer.Fire() at System.Threading.TimerQueue.FireNextTimers() at System.Threading.TimerQueue.AppDomainTimerCallback(Int32 id)</Data>
</EventData>
</Event>
在启用了 Proactive Crash Monitoring in Azure App Service 后,当应用再次应为没有处理的异常(24小时内发生3次)就会自动抓取Memory DUMP进行分析。完整的异常日志为:
========================================================
Dump Analysis for w3wp__xxxxxxxx__PID__9644__Date__09_22_2022__Time_02_59_27AM__239__Second_Chance_Exception_E0434352.dmp
========================================================
Thread 7984
ExitCode E0434352
ExitCodeString CLR EXCEPTION
DefaultHostName xxxxxxxxxx.chinacloudsites.cn
Managed Exception = System.Configuration.ConfigurationErrorsException:Couldn't find type for class Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=2.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35.
CallStack - Managed Exception
========================================================
System.Diagnostics.TraceUtils.GetRuntimeObject(System.String, System.Type, System.String)
System.Diagnostics.TypedElement.BaseGetRuntimeObject()
System.Diagnostics.ListenerElement.GetRuntimeObject()
System.Diagnostics.ListenerElementsCollection.GetRuntimeObject()
System.Diagnostics.TraceInternal.get_Listeners()
System.Diagnostics.TraceInternal.WriteLine(System.String)
System.Diagnostics.Debug.WriteLine(System.String)
Microsoft.Web.Compilation.Snapshots.SnapshotHelper.TakeSnapshotTimerCallback(System.Object)
System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
System.Threading.TimerQueueTimer.CallCallback()
System.Threading.TimerQueueTimer.Fire()
System.Threading.TimerQueue.FireNextTimers()
System.Threading.TimerQueue.AppDomainTimerCallback(Int32)
CallStack - Crashing Thread
========================================================
HelperMethodFrame
System.Diagnostics.TraceUtils.GetRuntimeObject(System.String, System.Type, System.String)
System.Diagnostics.TypedElement.BaseGetRuntimeObject()
System.Diagnostics.ListenerElement.GetRuntimeObject()
System.Diagnostics.ListenerElementsCollection.GetRuntimeObject()
System.Diagnostics.TraceInternal.get_Listeners()
System.Diagnostics.TraceInternal.WriteLine(System.String)
System.Diagnostics.Debug.WriteLine(System.String)
Microsoft.Web.Compilation.Snapshots.SnapshotHelper.TakeSnapshotTimerCallback(System.Object)
System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
System.Threading.TimerQueueTimer.CallCallback()
System.Threading.TimerQueueTimer.Fire()
System.Threading.TimerQueue.FireNextTimers()
System.Threading.TimerQueue.AppDomainTimerCallback(Int32)
DebuggerU2MCatchHandlerFrame
ContextTransitionFrame
DebuggerU2MCatchHandlerFrame
Native Call Stack
========================================================
KERNELBASE!RaiseException
clr!RaiseTheExceptionInternalOnly
clr!IL_Throw
mscorlib_ni
mscorlib_ni
mscorlib_ni
mscorlib_ni
mscorlib_ni
mscorlib_ni
clr!CallDescrWorkerInternal
clr!CallDescrWorkerWithHandler
clr!MethodDescCallSite::CallTargetWorker
clr!AppDomainTimerCallback_Worker
clr!ManagedThreadBase_DispatchInner
clr!ManagedThreadBase_DispatchMiddle
clr!ManagedThreadBase_DispatchOuter
clr!ManagedThreadBase_DispatchInCorrectAD
clr!Thread::DoADCallBack
clr!ManagedThreadBase_DispatchInner
clr!ManagedThreadBase_DispatchMiddle
clr!ManagedThreadBase_DispatchOuter
clr!ManagedThreadBase_FullTransitionWithAD
clr!AppDomainTimerCallback
clr!ThreadpoolMgr::AsyncTimerCallbackCompletion
clr!UnManagedPerAppDomainTPCount::DispatchWorkItem
clr!ThreadpoolMgr::ExecuteWorkRequest
clr!ThreadpoolMgr::WorkerThreadStart
clr!Thread::intermediateThreadProc
kernel32!BaseThreadInitThunk
ntdll!__RtlUserThreadStart
ntdll!_RtlUserThreadStart
这个问题是什么原因呢?
问题解答
在Eventlog.xml文件中,已经发现了此异常的根本原因是:Couldn’t find type for class Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Culture=neutral, PublicKeyToken=31bf3856ad364e35
这个问题发生的原因是在项目中使用了 ”Microsoft.WindowsAzure.Diagnostics.dll“, 这个 dll 可以在云服务(Azure Cloud Service)中使用,但是它与App Service不相容,不能在App Service中使用。否则,就不停的引发异常导致W3WP.EXE重启。
通过下面两方面来解决这个问题:
1)检查应用项目文件,移除全部 Microsoft.WindowsAzure.Diagnostics.dll 的引用
2)检查应用的 web.config 文件,如果存在 system.diagnostics 中含有 DiagnosticMonitorTraceListener 内容,则全部移除
<system.diagnostics>
<trace>
<listeners>
<add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Culture=neutral, PublicKeyToken=31bf3856ad364e35" name="AzureDiagnostics">
<filter type="" />
</add>
</listeners>
</trace>
</system.diagnostics>
操作以上两步骤后,希望能解决因为Couldn’t find type for class Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener 问题。
参考资料
https://azure.github.io/AppService/2021/06/09/Apps-on-App-Services-crash-due-to-DiagnosticMonitorTraceListener.html
https://azure.github.io/AppService/2021/03/01/Proactive-Crash-Monitoring-in-Azure-App-Service