What is a good way to dump a Linux core file from inside a process?

We have a server (written in C and C++) that currently catches a SEGV and dumps some internal info to a file. I would like to generate a core file and write it to disk at the time we catch the SEGV, so our support reps and customers don't have to fuss with ulimit and then wait for the crash to happen again in order to get a core file. We have used the abort function in the past, but it is subject to the ulimit rules and doesn't help.

We have some legacy code that reads /proc/pid/maps and manually generates a core file, but it is out of date and doesn't seem very portable (for example, I'm guessing it would not work in our 64-bit builds). What is the best way to generate and dump a core file in a Linux process?

8 answers

#1

Google has a library called google-coredumper for generating core dumps from inside a running process. This should ignore ulimit and other mechanisms.

The documentation for the call that generates the core file is here. According to the documentation, it seems that it is feasible to generate a core file in a signal handler, though it is not guaranteed to always work.

#2

I saw pmbrett's post and thought "hey, that's cool" but couldn't find that utility anywhere on my system (Gentoo).

So I did a bit of prodding, and discovered GDB has this option in it.

gdb --pid=4049 --batch -ex gcore

Seemed to work fine for me.

It's not, however, very useful, because it traps the very lowest function that was in use at the time, but it still does a good job outside that (with no memory limitations, I dumped a 350M snapshot of a Firefox process with it).

#3

Try using the Linux command gcore

usage: gcore [-o filename] pid

You'll need to use system (or exec) and getpid() to build up the right command line to call it from within your process.
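
A minimal sketch of that idea, assuming gcore is installed and on the PATH. The helper names build_gcore_command and dump_core_with_gcore and the 256-byte buffer are made up for illustration. Note that system() is not async-signal-safe, so calling this straight from a SEGV handler is a gamble, and some systems restrict a child from attaching to its parent with ptrace, in which case gcore may be refused.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Format "gcore -o <prefix> <pid>" into buf.
 * Returns 0 on success, -1 if buf is too small (hypothetical helper). */
int build_gcore_command(char *buf, size_t len, const char *prefix, pid_t pid)
{
    int n = snprintf(buf, len, "gcore -o %s %ld", prefix, (long)pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Ask gcore to dump the calling process.
 * Returns whatever system() returns, or -1 if the command did not fit. */
int dump_core_with_gcore(const char *prefix)
{
    char cmd[256];
    if (build_gcore_command(cmd, sizeof cmd, prefix, getpid()) != 0)
        return -1;
    return system(cmd);
}
```

Something like dump_core_with_gcore("/tmp/myserver") would then run gcore against the current pid; gcore normally appends the pid to the prefix given with -o.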

#4

Some possible solutions^W ways of dealing with this situation:

Fix the ulimit!!!
Accept that you don't get a core file and run inside gdb, scripted to do a "thread apply all bt" on SIGSEGV.
Accept that you don't get a core file and acquire a stack trace from within the application. The "Stack Backtracing Inside Your Program" article is pretty old, but it should be possible these days too.

#5

You can also change the ulimit from within your program with setrlimit(2). Like the ulimit shell command, this can lower limits, or raise them as far as the hard limit allows. Call setrlimit() at startup to allow core dumping, and you're fine.
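
A minimal sketch of that startup call, assuming the hard limit is high enough to be useful (if the hard limit is 0, this succeeds but still produces no core). The function name enable_core_dumps is made up for illustration.

```c
#include <sys/resource.h>

/* Raise the soft core-size limit to the hard limit.
 * Returns 0 on success, -1 on error (hypothetical helper). */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;  /* an unprivileged process may go no higher */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling this once early in main() means a later abort() or unhandled SEGV is no longer blocked by a soft limit of 0.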

#6

I assume you have a signal handler that catches SEGV, for example, and does something like print a message and call _exit(). (Otherwise, you'd have a core file in the first place!) You could do something like the following.

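#include <signal.h>
#include <sys/resource.h>
#include <unistd.h>

void my_handler(int sig)
{
   ...
   if (wantCore_ && !fork()) {
      // child: raise the soft core limit as far as allowed (ulimit -Sc)
      struct rlimit rl;
      if (getrlimit(RLIMIT_CORE, &rl) == 0) {
         rl.rlim_cur = rl.rlim_max;
         setrlimit(RLIMIT_CORE, &rl);
      }
      sigset(sig, SIG_DFL);  // reset default handler; also unblocks sig
      raise(sig);  // doesn't return, generates a core file
   }
   _exit(1);
}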

#7

system ("kill -6 ")

I'd give it a try if you are still looking for something

#8

Use the glibc calls backtrace and backtrace_symbols to get the trace; just keep in mind that backtrace_symbols uses malloc internally, so it might fail in case of heap corruption.
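
One way around the malloc problem is glibc's backtrace_symbols_fd, which writes the symbol lines straight to a file descriptor instead of returning an allocated array. A minimal sketch, assuming glibc's <execinfo.h>; the name dump_backtrace and the 64-frame cap are made up for illustration.

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64  /* arbitrary cap chosen for this sketch */

/* Write the current call stack to fd, one frame per line.
 * Returns the number of frames captured (hypothetical helper). */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    backtrace_symbols_fd(frames, n, fd);  /* does not allocate the result */
    return n;
}
```

From a handler this could be called as dump_backtrace(STDERR_FILENO) or with the descriptor of an already-open log file. The first call to backtrace may itself load a support library, so it is probably worth calling it once at startup as well. Function names only show up if the binary exports its symbols (for example, linked with -rdynamic).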
