Here's the situation:
Background
I have a mixed mode .NET/Native application developed in Visual Studio 2008.
What I mean by mixed mode is that the front end is written in C++ .NET, which calls into a native C++ library. The native code does the bulk of the work in the app, including kicking off new threads as it requires. The .NET code is just for UI purposes (Windows Forms).
I have a release build of the application running on a tester's computer.
The native libraries were compiled with full optimisations but also with debugging enabled (the "Debug Information Format" was set to "Program Database").
What this means is that I have the debugging symbols for the application in a PDB file.
The problem
So anyway, one of the testers is having a problem with the app where it occasionally crashes on XP. I've been able to get the minidump of the crash using Dr Watson for several runs.
When I debug into it (using the minidump - I'm not actually debugging the real app), all the debugging symbols are loaded correctly: I can see the full stack trace of all of the native threads correctly. Other threads (which are presumably the .NET threads) don't have a stack trace, but they all at least show me which dll the thread was started on (i.e. ntdll.dll).
It correctly reports the thread which fails ("Unhandled exception at 0x0563d652 in user(5).dmp: 0xC0000005: Access violation reading location 0x00000000").
However, when I go into the thread it shows nothing useful. In the stack trace there is a single entry which just has the memory address "0563d652()" (not even "ntdll.dll").
When I go into the disassembly it just shows a random section of about 30 instructions. Either side of the memory address is just "???". It almost looks like it is not part of my source code (isn't your binary loaded sequentially into memory? Is it normal to have a random set of assembly statements in the middle of nowhere?).
My questions
So basically my questions are threefold.
1) Can anyone explain the debugger's lack of information?
2) Bearing in mind that I can't show the error occurred in my code, can anyone suggest a reason for the failure?
3) Can I do anything else to help me diagnose this kind of problem in the future?
Help!
John
Update:
Here is the stack dump for the failing thread from WinDbg:
# ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 099bf414 02d0e7fc 0x563d652
01 00000000 00000000 0x2d0e7fc
Weird huh? Doesn't even show a DLL.
Is it possible that I've corrupted the stack/heap somehow which has caused a thread to just get corrupted...?
4 answers
#1
Are you using WinDbg? If so, are you using the Son of Strike (SOS) extension?
-or-
Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects?
#2
We had an issue similar to this where a code bug was silent in MSVC2K5 SP1, but if you had the MSVC2K5 SP2 runtime installed it caused an error which didn't point at valid code.
Part of the problem is, when you start executing data as code you could be doing anything and so the crash location becomes useless as you cannot even get back to a valid stack trace.
We had this happen to us when the new .Net runtime install installed a newer version of the MSVC C++ Runtime in the SxS directory.
In the end our method to resolve the issue was to make the crash happen frequently and add as much logging as necessary to localize it.
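As a rough illustration of the logging approach, here is a minimal sketch of a breadcrumb logger that flushes after every entry, so the last line written before a crash is handed to the OS and should still be in the file afterwards. The names `CrashTrace`, `mark` and `lastMark` are hypothetical and not from any library; this is one way to do it, not the method we actually used.

```cpp
#include <fstream>
#include <string>

// Hypothetical breadcrumb logger (names invented for illustration).
// Each entry is flushed immediately so it is not left sitting in a
// user-mode buffer if the process dies on the next instruction.
class CrashTrace {
public:
    explicit CrashTrace(const std::string& path)
        : out_(path.c_str(), std::ios::app) {}

    // Record that execution reached a given point.
    void mark(const std::string& where) {
        out_ << where << '\n';
        out_.flush();
    }

private:
    std::ofstream out_;
};

// Returns the last non-empty line of the trace file, i.e. the last
// point the program is known to have reached.
std::string lastMark(const std::string& path) {
    std::ifstream in(path.c_str());
    std::string line;
    std::string last;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            last = line;
        }
    }
    return last;
}
```

Sprinkle `mark()` calls around the suspect code, reproduce the crash, then read the last entry to narrow down where execution went off the rails. Flushing on every call is slow, so this is for hunting a bug rather than for production logging.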
#3
Could you post the stack of the faulting thread once you've grabbed and installed a copy of WinDbg and opened the dump file there? We could start from there.
#4
Your EIP was just corrupted.
Assuming the ESP is valid, you can view the callstack, just type:
dds esp [enter]
dds [enter]
You can also use the memory windows:
Set address to: esp
Set format to: Pointer&Symbol
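
Roughly speaking, `dds esp` dumps the pointer-sized values on the stack and shows a symbol next to any value that resolves into a loaded module. The values that do resolve are your candidate return addresses. Here is a minimal sketch of that idea, not how WinDbg actually implements it. The `Module` table, the base addresses and the function names are all hypothetical, made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical module table entry covering [base, base + size).
struct Module {
    std::uint32_t base;
    std::uint32_t size;
    std::string name;
};

// Returns the name of the module containing addr, or "" if the address
// is not in any known module (the situation the WinDbg warning describes).
std::string moduleFor(const std::vector<Module>& modules, std::uint32_t addr) {
    for (const Module& m : modules) {
        if (addr >= m.base && addr - m.base < m.size) {
            return m.name;
        }
    }
    return "";
}

// A stack slot whose value points into a known module.
struct Candidate {
    std::size_t slot;      // index of the word, counting up from ESP
    std::uint32_t value;   // the word itself
    std::string module;    // module it resolves to
};

// Scans raw stack words and keeps those that look like return addresses,
// i.e. values that land inside some loaded module.
std::vector<Candidate> scanStack(const std::vector<Module>& modules,
                                 const std::vector<std::uint32_t>& stackWords) {
    std::vector<Candidate> found;
    for (std::size_t i = 0; i < stackWords.size(); ++i) {
        const std::string name = moduleFor(modules, stackWords[i]);
        if (!name.empty()) {
            found.push_back(Candidate{i, stackWords[i], name});
        }
    }
    return found;
}
```

With a table like that, an address such as 0x0563d652 resolves to nothing, which is why the debugger cannot name a DLL for the top frame. Not every hit is a real return address (stale values linger on the stack), so treat the results as leads to check, not as a call stack.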