Core dump file is not generated

Time: 2020-11-30 09:50:39

Every time my application crashes, no core dump file is generated. I remember that a few days ago, on another server, it was generated. I'm running the app using screen in bash like this:

#!/bin/bash
ulimit -c unlimited
while true; do ./server; done

As you can see, I'm using ulimit -c unlimited, which is important if I want to generate a core dump, but it still isn't generated when I get a segmentation fault. How can I make it work?

12 solutions

#1


40  

Make sure your current directory (at the time of the crash -- the server may change directories) is writable. If the server calls setuid, the directory has to be writable by that user.

Also check /proc/sys/kernel/core_pattern. That may redirect core dumps to another directory, and that directory must be writable. More info here.
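The core_pattern check above can be scripted. A minimal sketch (the helper name is my own): it distinguishes the three cases that matter for finding your dump, namely piped to a handler program, written to an absolute path, or written relative to the crashing process's working directory.

```shell
# Classify a core_pattern value: where will the kernel put core files?
classify_core_pattern() {
  case "$1" in
    \|*) echo "piped" ;;     # handed to a handler (apport, systemd-coredump, abrt)
    /*)  echo "absolute" ;;  # written to a fixed directory, which must be writable
    *)   echo "relative" ;;  # written relative to the crashing process's cwd
  esac
}

# On a live system:
if [ -r /proc/sys/kernel/core_pattern ]; then
  classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
fi
```

If the pattern starts with a pipe, the "core file" never appears on disk directly; the named handler decides what to do with it.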

#2


45  

This link contains a good checklist of reasons why core dumps are not generated:

  • The core would have been larger than the current limit.
  • You don't have the necessary permissions to dump core (directory and file). Notice that core dumps are placed in the dumping process's current directory, which could be different from that of the parent process.
  • Verify that the file system is writable and has sufficient free space.
  • If a subdirectory named core exists in the working directory, no core will be dumped.
  • If a file named core already exists but has multiple hard links, the kernel will not dump core.
  • Verify the permissions on the executable: if the executable has the suid or sgid bit enabled, core dumps will be disabled by default. The same is the case if you have execute permission but no read permission on the file.
  • Verify that the process has not changed working directory, core size limit, or dumpable flag.
  • Some kernel versions cannot dump processes with shared address space (AKA threads). Newer kernel versions can dump such processes but will append the pid to the file name.
  • The executable could be in a non-standard format that does not support core dumps. Each executable format must implement a core dump routine.
  • The segmentation fault could actually be a kernel Oops; check the system logs for any Oops messages.
  • The application called exit() instead of using the core dump handler.
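Several items on this checklist can be verified mechanically. A minimal sketch (the function name is my own; it only covers the directory-writability, core-subdirectory, and hard-link items, and `stat -c` assumes GNU coreutils):

```shell
# check_core_obstacles DIR: report checklist items that would block a
# core dump being written in DIR.
check_core_obstacles() {
  dir=$1
  [ -w "$dir" ] || echo "directory not writable"
  [ -d "$dir/core" ] && echo "a subdirectory named core exists"
  if [ -f "$dir/core" ] && [ "$(stat -c %h "$dir/core" 2>/dev/null || echo 1)" -gt 1 ]; then
    echo "existing core file has multiple hard links"
  fi
  return 0
}

# Also confirm the limit in the shell that will start the process:
ulimit -c    # 0 means no core will be written
```

Run it against the crashing process's working directory, which (per the checklist) may not be the directory you started it from.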

#3


4  

Check:

$ sysctl kernel.core_pattern

to see how your dumps are created (%e will be the process name, and %t will be the system time).

If you're on Ubuntu, your dumps are created by apport in /var/crash, but in a different format (edit the file to see it).

You can test it by:

sleep 10 &
killall -SIGSEGV sleep

If core dumping is successful, you will see “(core dumped)” after the segmentation fault indication.

Read more:

How to generate core dump file in Ubuntu

Please read more at:

https://wiki.ubuntu.com/Apport

#4


2  

Remember, if you are starting the server from a service, it will start in a different bash session, so the ulimit won't be effective there. Try to put this in your script itself:

ulimit -c unlimited
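If the service is managed by systemd, even a ulimit in the script can be overridden; the limit belongs in the unit file instead, via the LimitCORE directive. A sketch (the unit name and binary path here are examples, not from the question):

```ini
# /etc/systemd/system/server.service (example path)
[Service]
ExecStart=/path/to/server
LimitCORE=infinity
```

After editing, run systemctl daemon-reload and restart the service so the new limit applies.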

#5


2  

For the record, on Debian 9 Stretch (systemd), I had to install the package systemd-coredump. Afterwards, core dumps were generated in the folder /var/lib/systemd/coredump.

Furthermore, these core dumps are compressed in the lz4 format. To decompress, you can use the package liblz4-tool like this: lz4 -d FILE.

To be able to debug the decompressed core dump using gdb, I also had to rename the utterly long filename to something shorter...
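With systemd-coredump installed, the manual lz4-and-rename dance can usually be avoided via coredumpctl (e.g. coredumpctl list, then coredumpctl gdb), which decompresses on the fly. If you do work with the raw files, a tiny helper saves retyping the long names; the function name is my own:

```shell
# newest_core DIR: print the most recently modified filename in DIR.
# Useful because systemd-coredump filenames are extremely long.
newest_core() {
  ls -t "$1" 2>/dev/null | head -n 1
}

# Typical manual workflow (paths as in the answer above):
# f=$(newest_core /var/lib/systemd/coredump)
# lz4 -d "/var/lib/systemd/coredump/$f" core.decompressed
# gdb ./server core.decompressed
```

ls -t sorts by modification time, newest first, so the head of the list is the dump from the latest crash.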

#6


1  

Also, check to make sure you have enough disk space on /var/core or wherever your core dumps get written. If the partition is almost full or at 100% disk usage, then that would be the problem. My core dumps average a few gigs, so you should be sure to have at least 5-10 GB available on the partition.

#7


1  

The answers given here cover pretty well most scenarios in which a core dump is not created. However, in my instance, none of these applied. I'm posting this answer as an addition to the others.

If your core file is not being created for whatever reason, I recommend looking at /var/log/messages. There might be a hint in there as to why the core file is not created. In my case there was a line stating the root cause:

Executable '/path/to/executable' doesn't belong to any package

To work around this issue, edit /etc/abrt/abrt-action-save-package-data.conf and change ProcessUnpackaged from 'no' to 'yes'.

ProcessUnpackaged = yes

This setting specifies whether to create a core for binaries not installed with the package manager.

#8


1  

If one is on a Linux distro (e.g. CentOS, Debian), then perhaps the most accessible way to find out about core files and related conditions is the man page. Just run the following command from a terminal:

man 5 core

#9


0  

Although this isn't going to be a problem for the person who asked the question, because they ran the program that was to produce the core file in a script with the ulimit command, I'd like to document that the ulimit command is specific to the shell in which you run it (like environment variables). I spent way too much time running ulimit and sysctl and such in one shell, running the command that I wanted to dump core in another shell, and wondering why the core file was not produced.

I will be adding it to my bashrc. The sysctl setting works for all processes once it is issued, but the ulimit only works for the shell in which it is issued (and maybe its descendants too) - but not for other shells that happen to be running.
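The per-shell scoping is easy to demonstrate: a limit changed in a subshell does not propagate back to the parent, and by the same token a limit set in one terminal never affects another. A quick sketch:

```shell
# A limit lowered inside a subshell is visible only there.
parent_limit=$(ulimit -c)
subshell_limit=$( ( ulimit -c 0; ulimit -c ) )
echo "parent: $parent_limit, subshell: $subshell_limit"
[ "$(ulimit -c)" = "$parent_limit" ] && echo "parent limit unchanged"
```

(Lowering a limit is always permitted, which is why the demonstration lowers to 0 rather than raising to unlimited, which can fail if the hard limit is lower.)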

#10


0  

Note: if you have written a crash handler yourself, then the core might not get generated. So search the code for something along the lines of:

signal(SIGSEGV, <handler> );

If so, SIGSEGV will be handled by handler and you will not get the core dump.
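A quick way to search a source tree for such handlers is to grep for signal or sigaction registrations on SIGSEGV. A sketch (the function name is my own, and it only catches the plain textual forms of the two calls):

```shell
# find_sigsegv_handlers DIR: list source lines that install a SIGSEGV
# handler, which replaces the kernel's default core-dumping action.
find_sigsegv_handlers() {
  grep -rn -e 'signal(SIGSEGV' -e 'sigaction(SIGSEGV' "$1" 2>/dev/null
}

# Example: find_sigsegv_handlers src/
```

Note that a handler can still produce a core by restoring the default action and re-raising the signal at the end (signal(SIGSEGV, SIG_DFL); raise(SIGSEGV);), so finding a handler doesn't always mean dumps are impossible.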

#11


0  

If you call daemon() to daemonize a process, by default the current working directory changes to /. So if your program is a daemon, you should be looking for a core in the / directory and not in the directory of the binary.

#12


0  

Just in case someone else stumbles on this: I was running someone else's code - make sure it is not handling the signal in order to exit gracefully. I commented out the handling and got the core dump.
