On Linux, what is the best way for a program to restart itself after a crash by catching the fault in a crash handler (for example, on a segfault)?
7 answers
#1
6
You can have a loop in which you essentially fork(), do the real work in the child, and just wait on the child and check its exit status in the parent. You can also use a system that monitors and restarts programs in a similar fashion, such as daemontools or runit.
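A minimal sketch of that fork-and-wait loop, in case it helps. The names supervise, demo_ok and demo_crash are made up for illustration, and the restart limit is my own assumption so that a program which always crashes does not spin forever:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical workloads, only here to exercise the loop. */
int demo_ok(void)    { return 0; }            /* finishes cleanly */
int demo_crash(void) { abort(); return 1; }   /* dies from SIGABRT */

/* Hypothetical helper: run work() in a child and restart it until it
 * exits with status 0. Returns the number of restarts performed, or -1
 * on fork/wait failure or when max_restarts restarts were not enough. */
int supervise(int (*work)(void), int max_restarts)
{
    int restarts = 0;

    for (;;) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(work());                    /* child: do the real work */

        int status;
        if (waitpid(pid, &status, 0) < 0)     /* parent: wait for the child */
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;                  /* clean exit, stop looping */
        if (WIFSIGNALED(status))
            fprintf(stderr, "child killed by signal %d\n", WTERMSIG(status));
        if (restarts >= max_restarts)
            return -1;                        /* restart budget exhausted */
        restarts++;
    }
}
```

From main() you would call something like supervise(real_work, 5), where real_work is your own function. Because the parent does nothing except fork and wait, a crash in the child cannot corrupt the supervisor's state.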
#2
9
The simplest approach is:
while [ 1 ]; do ./program && break; done
Basically, you run the program until it returns 0, then you break out of the loop.
#3
7
SIGSEGV can be caught (see man 3 signal or man 2 sigaction), and the program can call one of the exec family of functions on itself in order to restart. The same applies to most runtime crashes (SIGFPE, SIGILL, SIGBUS, SIGSYS, ...).
I'd think a bit before doing this, though. It is a rather unusual strategy for a Unix program, and you may surprise your users (not necessarily in a pleasant way, either).
In any case, be sure not to auto-restart on SIGTERM if there are any resources you want to clean up before dying; otherwise angry users will use SIGKILL and you'll leave a mess.
#4
3
As a complement to what was proposed here:
Another option is to do what is done for the getty daemon. See /etc/inittab and the inittab(5) man page. It seems to be the most system-wide means ;-).
It could look like the file fragment below. The obvious advantage is that this mechanism is pretty standard, and it lets you control your daemon through runlevels.
# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
#5
0
Processes can't restart themselves, but you could use a utility like crontab(1) to schedule a script that checks at regular intervals whether the process is still alive.
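If the periodic check is written in C rather than as a script, the liveness test itself can be sketched like this. The function name process_alive is hypothetical, and how the PID is obtained (a PID file, for instance) is left out:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a process with this PID exists, 0 otherwise. */
int process_alive(pid_t pid)
{
    if (pid <= 0)
        return 0;            /* 0 and negative values address process groups in kill() */
    if (kill(pid, 0) == 0)
        return 1;            /* signal 0: existence check only, nothing is delivered */
    return errno == EPERM;   /* the process exists but belongs to another user */
}
```

Keep in mind that this only says that some process has that PID. A zombie still counts as alive, and after a crash the PID may have been reused by an unrelated program, so a stale PID file can give a false positive.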
#6
0
The program itself obviously shouldn't check whether it is running or not :)
Most enterprise solutions are actually just fancy ways of grepping the output of ps for a given string and performing an action when certain criteria are met, i.e. if your process is not found, call the start script.
#7
0
Try the following code if it's specific to segfaults. It can be modified as required.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <poll.h>

sigjmp_buf buf;

/* On SIGSEGV, jump back to the sigsetjmp() point in main(). */
void handler(int sig) {
    (void)sig;
    siglongjmp(buf, 1);
}

int main(void) {
    //signal(SIGINT, handler);
    //register all signals
    struct sigaction new_action, old_action;
    new_action.sa_handler = handler;
    sigemptyset(&new_action.sa_mask);
    new_action.sa_flags = 0;
    /* Install the handler unless SIGSEGV is currently ignored. */
    sigaction(SIGSEGV, NULL, &old_action);
    if (old_action.sa_handler != SIG_IGN)
        sigaction(SIGSEGV, &new_action, NULL);
    /* sigsetjmp() returns 0 on the first pass and nonzero after siglongjmp(). */
    if (!sigsetjmp(buf, 1)) {
        printf("starting\n");
        //code or function/method here
    } else {
        printf("restarting\n");
        //code or function/method here
    }
    while (1) {
        poll(NULL, 0, 100); //ideally use usleep or nanosleep. for now using poll() as a timer
        printf("processing...\n");
    }
    return 0; //or exit(EXIT_SUCCESS)
}