I have a process that fails regularly and sometimes starts duplicate instances.
When I run:
ps x | grep -v grep | grep -c "processname"
I get 2. This is normal, because the process runs together with a recovery process.
If I get 0, I want to start the process. If I get 4, I want to stop and restart the process.
What I need is a way to take the result of ps x | grep -v grep | grep -c "processname" and then set up a simple three-option function:
ps x | grep -v grep | grep -c "processname"
if answer = 0 (start the process and write NOK and the time to the log /var/processlog/check)
if answer = 2 (do nothing and write OK and the time to the log /var/processlog/check)
if answer = 4 (stop and restart the process and write NOK and the time to the log /var/processlog/check)
The process is stopped with killall -9 process
The process is started with process -b -c /usr/local/etc
My main problem is finding a way to act on the result of ps x | grep -v grep | grep -c "processname".
Ideally, I would like to store the result of that grep in a variable within the script, with something like this:
process=$(ps x | grep -v grep | grep -c "processname")
If possible.
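The three-option check described above can be sketched as follows. This is only an illustration under stated assumptions: action_for_count is a hypothetical helper name, and treating every count other than 0 and 2 as "restart" is an assumption, since the question only mentions 4. The script prints the log line instead of starting or killing anything.

```shell
#!/bin/bash
# Sketch only: "processname" comes from the question;
# action_for_count is a hypothetical helper introduced here.

# Map an instance count to the action the question asks for.
action_for_count() {
    case "$1" in
        0) echo start ;;    # nothing running
        2) echo none ;;     # process plus its recovery process
        *) echo restart ;;  # 4 in the question; assumed for any other count
    esac
}

# Capture the count in a variable, as proposed in the question.
count=$(ps x | grep -v grep | grep -c "processname")
action=$(action_for_count "$count")

# Only the "do nothing" case counts as OK.
if [ "$action" = none ]; then status=OK; else status=NOK; fi

# Print the line that would be appended to /var/processlog/check.
echo "$status $(date) count=$count action=$action"
```

To act on the result, the start and stop commands quoted in the question would go into the matching branches, and the final echo would be redirected with >> to the log file.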
4 Solutions
#1
61
Here is a script I use to monitor whether a process on a system is running. The script is run from crontab once every minute.
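#!/bin/bash
# Count the PIDs that pidof reports for amadeus.x86 and act on the count.
case "$(pidof amadeus.x86 | wc -w)" in
    0)  # not running: log and start it in the background
        echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt
        /etc/amadeus/amadeus.x86 &
        ;;
    1)  # exactly one instance: all OK
        ;;
    *)  # more than one instance: log and kill the first PID listed
        echo "Removed double Amadeus: $(date)" >> /var/log/amadeus.txt
        kill $(pidof amadeus.x86 | awk '{print $1}')
        ;;
esac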
0) If the process is not found, restart it.
1) If exactly one instance is found, all is OK.
*) If two or more instances are running, kill one of them (the first PID that pidof lists).
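For the once-a-minute schedule mentioned above, a crontab entry might look like the following. The script path /usr/local/bin/check_amadeus.sh is a hypothetical location chosen for illustration; the answer does not say where the script is saved.

```text
# minute hour day-of-month month day-of-week  command
* * * * * /usr/local/bin/check_amadeus.sh
```

The entry would be added with crontab -e for the user that is allowed to start the process and write to the log file.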
A simpler version. This just tests whether the process is running and restarts it if it is not. It checks the exit status $? of the pidof program, which is 0 if the process is running and 1 if not.
#!/bin/bash
pidof amadeus.x86 >/dev/null
if [[ $? -ne 0 ]] ; then
    echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt
    /etc/amadeus/amadeus.x86 &
fi
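The same check can also be written without reading $? in a separate step, because if can branch on a command's exit status directly. A minimal sketch, assuming pidof is installed; ensure_running is a hypothetical helper name, not something from the answer.

```shell
#!/bin/bash
# ensure_running NAME COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background unless pidof
# reports a process called NAME.
ensure_running() {
    name=$1
    shift
    # "if ! cmd" branches on the exit status of cmd directly.
    if ! pidof "$name" >/dev/null 2>&1; then
        echo "Restarting $name: $(date)"
        "$@" &
    fi
}

# Possible use with the names from the script above:
# ensure_running amadeus.x86 /etc/amadeus/amadeus.x86
```

Note that a missing pidof binary also produces a non-zero exit status, so this sketch would then try to start the program on every run.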
#2
8
I adopted @Jotne's solution and it works perfectly! For example, for the mongodb server on my NAS:
#!/bin/bash
case "$(pidof mongod | wc -w)" in
    0)  echo "Restarting mongod:"
        mongod --config mongodb.conf
        ;;
    1)  echo "mongod already running"
        ;;
esac
#3
4
I have adapted your script for my situation, Jotne.
#!/bin/bash
logfile="/var/oscamlog/oscam1check.log"
case "$(pidof oscam1 | wc -w)" in
    0)  echo "oscam1 not running, restarting oscam1: $(date)" >> "$logfile"
        /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
        ;;
    2)  echo "oscam1 running, all OK: $(date)" >> "$logfile"
        ;;
    *)  echo "multiple instances of oscam1 running. Stopping & restarting oscam1: $(date)" >> "$logfile"
        kill $(pidof oscam1 | awk '{print $1}')
        ;;
esac
While I was testing, I ran into a problem. I started 3 extra oscam1 processes with this line:
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1
which left me with 8 oscam1 processes. The problem is this: when I run the script, it only kills 2 processes at a time, so I have to run it 3 times to get back down to 2 processes.
Other than running killall -9 oscam1 followed by /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 in the *) branch, is there a better way to kill everything apart from the original process, so that there would be zero downtime?
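One possible direction for the question above is to keep the first few PIDs and kill only the rest. The sketch below is untested against oscam1 itself and rests on assumptions: extra_pids is a hypothetical helper, and the idea that the two lowest PIDs belong to the original instance and its recovery process is a guess, since PIDs can wrap around and do not strictly reflect start order.

```shell
#!/bin/bash
# extra_pids KEEP PID...
# Hypothetical helper: prints every PID after the first KEEP arguments.
extra_pids() {
    keep=$1
    shift
    # Drop the PIDs that should survive, but never shift past the end.
    if [ "$keep" -gt "$#" ]; then keep=$#; fi
    shift "$keep"
    # Print the remaining PIDs, if any, on one line.
    if [ "$#" -gt 0 ]; then echo "$*"; fi
}

# Possible use in the *) branch (assumes the two lowest PIDs are the
# original instance and its recovery process):
# extras=$(extra_pids 2 $(pidof oscam1 | tr ' ' '\n' | sort -n))
# [ -n "$extras" ] && kill $extras
```

For example, extra_pids 2 10 20 30 40 prints "30 40", so only those two would be passed to kill while 10 and 20 keep running.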
#4
-1
Use systemctl on Linux. It is the newer way to manage and monitor systemd services and units. Learn more here: https://www.digitalocean.com/community/tutorials/how-to-use-systemctl-to-manage-systemd-services-and-units
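As a rough illustration of what that could look like, here is a minimal unit file sketch that asks systemd to restart a service whenever it exits. The unit name amadeus.service, its location, and the RestartSec value are assumptions made for this example; the ExecStart path is the one used in answer #1, and the sketch assumes the program stays in the foreground.

```ini
# /etc/systemd/system/amadeus.service (hypothetical unit name and path)
[Unit]
Description=Amadeus (example unit)

[Service]
# Assumes the program does not fork into the background on its own.
ExecStart=/etc/amadeus/amadeus.x86
# Restart the service whenever it exits, after a 5 second pause.
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

With such a file in place, systemctl enable --now amadeus.service would start the service and enable it at boot. A program that daemonizes itself would likely need Type=forking instead.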