Linux Process Management Tools: God in Detail (1), Getting Started

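God is a monitoring framework written in Ruby. It keeps your processes running and can restart them when certain conditions occur. As an extension, frigga can be used to manage god globally.

The best way to install it is through RubyGems: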

gem install god

Quick start

Note: the quick start requires version 0.12 or later. You can check your version with the following command:

god --version

A simple example: using god to keep a simple process alive.

Below is a simple script named hello.py:

#!/usr/bin/env python
#
import time

while True:
    # The parenthesized form works on both Python 2 and Python 3.
    print("hello")
    time.sleep(1)

Now write a god configuration file, simple.god, to manage the hello.py process above:

God.watch do |w|
  w.name = "hello"
  w.start = "python /root/god/hello.py"
  w.keepalive
end

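This is a simple god configuration. We first declare a God.watch block, which monitors and controls the process above. Every watch must have a unique name and a command that starts the process. keepalive tells god to keep the process alive: if the process dies, god starts it again using the start command defined above.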
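To make the keepalive idea concrete, here is a minimal Ruby sketch of a supervisor loop. It is not god's implementation: `keep_alive` and `max_starts` are hypothetical names invented for this example, and the cap on the number of starts exists only so that the example terminates (god itself keeps restarting the process for as long as it runs).

```ruby
require "rbconfig"

# Hypothetical helper, not part of god: start `command`, wait for it to
# exit, and start it again, up to `max_starts` times in total.
def keep_alive(command, max_starts)
  starts = 0
  while starts < max_starts
    pid = Process.spawn(*command)     # plays the role of w.start
    starts += 1
    _, status = Process.wait2(pid)    # block until the child exits
    puts "pid #{pid} exited (#{status.exitstatus.inspect}), starts so far: #{starts}"
  end
  starts
end

# A child that exits right away, so every loop iteration restarts it.
total = keep_alive([RbConfig.ruby, "-e", "exit 1"], 3)
puts "started #{total} times"
```

The blocking wait is the important part: the supervisor learns about the exit as soon as it happens and reacts immediately, which is the behaviour keepalive gives you without writing any of this yourself.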

In this simple example we run god in the foreground so that we can watch what it does.

To run god, we specify the configuration file (-c) and ask it to run in the foreground (-D):

god -c simple.god -D

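God can monitor a process in two ways.
1. Event-based (the better way): not every system supports it, but where it is supported god uses it automatically. When a relevant event occurs (such as the process exiting), god knows immediately.
2. Polling: on systems without event support, god falls back to a polling mechanism.
PS: my system uses the event mechanism, so I have no polling output to show here. If you want to see it, visit http://godrb.com/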

# Events
I [2014-08-11 11:10:10]  INFO: Loading simple.god
I [2014-08-11 11:10:10]  INFO: Syslog enabled.
I [2014-08-11 11:10:10]  INFO: Using pid file directory: /var/run/god
I [2014-08-11 11:10:10]  INFO: Socket already in use
I [2014-08-11 11:10:10]  INFO: Socket is stale, reopening
I [2014-08-11 11:10:10]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2014-08-11 11:10:10]  INFO: hello move 'unmonitored' to 'init'
I [2014-08-11 11:10:10]  INFO: hello moved 'unmonitored' to 'init'
I [2014-08-11 11:10:10]  INFO: hello [trigger] process is not running (ProcessRunning)
I [2014-08-11 11:10:10]  INFO: hello move 'init' to 'start'
I [2014-08-11 11:10:10]  INFO: hello start: python /root/god/hello.py
I [2014-08-11 11:10:10]  INFO: hello moved 'init' to 'start'
I [2014-08-11 11:10:10]  INFO: hello [trigger] process is running (ProcessRunning)
I [2014-08-11 11:10:10]  INFO: hello move 'start' to 'up'
I [2014-08-11 11:10:10]  INFO: hello registered 'proc_exit' event for pid 25779
I [2014-08-11 11:10:10]  INFO: hello moved 'start' to 'up'

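From the log output you can see that the hello process was not running at first and that god then started it. PS: in polling mode, if you watch closely, god checks the process every 5 seconds.

To demonstrate the event-based mechanism, I added one more step: killing hello.py from another terminal.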

[root@master ~]# ps -ef|grep hello.py
root     25779     1  0 11:10 ?        00:00:00 python /root/god/hello.py
root     25803 25782  0 11:10 pts/1    00:00:00 grep hello.py
[root@master ~]# kill -9 25779

# Event status:
I [2014-08-11 11:11:02]  INFO: hello [trigger] process 25779 exited {:thread_group_id=>25779, :pid=>25779, :exit_code=>9, :exit_signal=>17} (ProcessExits)
I [2014-08-11 11:11:02]  INFO: hello move 'up' to 'start'
I [2014-08-11 11:11:02]  INFO: hello deregistered 'proc_exit' event for pid 25779
I [2014-08-11 11:11:02]  INFO: hello start: python /root/god/hello.py
I [2014-08-11 11:11:02]  INFO: hello moved 'up' to 'start'
I [2014-08-11 11:11:02]  INFO: hello [trigger] process is running (ProcessRunning)
I [2014-08-11 11:11:02]  INFO: hello move 'start' to 'up'
I [2014-08-11 11:11:02]  INFO: hello registered 'proc_exit' event for pid 25807
I [2014-08-11 11:11:02]  INFO: hello moved 'start' to 'up'

PS: in polling mode the process is not restarted immediately; god restarts it only when the next check comes around.
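The delay that polling introduces can be illustrated with a small Ruby sketch. This is only an illustration of the polling idea, not god's code: `process_running?` and `poll_until_exit` are hypothetical helpers, and the 0.1 second interval is chosen so that the example finishes quickly.

```ruby
require "rbconfig"

# Hypothetical helper: true if a process with this pid exists.
# Signal 0 performs the existence check without sending a signal.
def process_running?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false   # no such process
rescue Errno::EPERM
  true    # the process exists but belongs to another user
end

# Hypothetical helper: look at the pid every `interval` seconds until it
# is gone, and return how many checks still found it running.
def poll_until_exit(pid, interval)
  checks = 0
  while process_running?(pid)
    checks += 1
    sleep interval
  end
  checks
end

pid = Process.spawn(RbConfig.ruby, "-e", "sleep 0.3")
# Reap the child in a background thread; otherwise the exited child would
# linger as a zombie and the signal 0 check would keep succeeding.
Process.detach(pid)
puts "exit noticed after #{poll_until_exit(pid, 0.1)} checks"
```

With polling, the exit is noticed at most one interval after it happens. With events, the kernel notifies god directly and no such wait is needed.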

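By now you know how god keeps a process alive. There are also more useful management options, such as restarting a process when its CPU or memory usage reaches a given threshold. Here is an example configuration: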

God.watch do |w|
  w.name = "hello"
  w.start = "python /root/god/hello.py"
  w.keepalive(:memory_max => 150.megabytes,
              :cpu_max => 50.percent)
end

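Details: :memory_max is an option of keepalive, and so is :cpu_max. With the configuration above, god restarts the process if its memory usage reaches 150 MB or its CPU usage reaches 50%. By default these values are checked every 30 seconds, and god restarts the process only when the limit is exceeded in 3 out of 5 checks, so that an occasional spike does not trigger a restart.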
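The "3 out of 5" rule is easy to misread, so here is a minimal Ruby sketch of one way such a tally could work. It is an illustration under assumptions, not god's actual implementation: the `OverageWindow` class and its parameter names are hypothetical, and the sample values are rounded from the CPU log quoted later in this article.

```ruby
# Hypothetical class, not part of god: remembers the last `window`
# samples and reports when at least `needed` of them exceed `limit`.
class OverageWindow
  def initialize(limit, needed = 3, window = 5)
    @limit = limit
    @needed = needed
    @window = window
    @samples = []
  end

  # Record one sample; return true if a restart would be triggered.
  def push(value)
    @samples << value
    @samples.shift if @samples.size > @window   # drop the oldest sample
    @samples.count { |v| v > @limit } >= @needed
  end
end

cpu = OverageWindow.new(50.0)
[5.4, 90.3, 94.7, 96.3].each do |reading|
  puts "#{reading}% -> restart? #{cpu.push(reading)}"
end
```

Only the fourth reading triggers a restart, because it is the third one above the limit. A single spike followed by normal readings never reaches three overages inside the window, which is how the rule filters out occasional overload.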

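I will not simulate a memory leak here. Below is the log of a restart triggered by CPU usage. For the memory case, see the official documentation at http://godrb.com/ and search for "memory leak".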

I [2014-08-11 13:35:46]  INFO: hello [trigger] cpu out of bounds [5.3571428566083%%, *90.3052064640262%%, *94.7069943292977%%, *96.3414634148933%%] (CpuUsage)
I [2014-08-11 13:35:46]  INFO: hello move 'up' to 'restart'
I [2014-08-11 13:35:46]  INFO: hello deregistered 'proc_exit' event for pid 26355
I [2014-08-11 13:35:46]  INFO: hello stop: default lambda killer
I [2014-08-11 13:35:46]  INFO: hello sent SIGTERM
I [2014-08-11 13:35:47]  INFO: hello process stopped
I [2014-08-11 13:35:47]  INFO: hello start: python /root/god/hello.py

You can also use god to control individual processes:

god start hello  # hello is the process name, matching w.name in simple.god
god stop hello
god restart hello
...

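So when you manage processes with god, you can write configuration files tailored to each of them. For example, when problems such as HTTP errors or heavy disk I/O occur, god can take action for you.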

This article comes from the "豆包的博客" blog. Please keep this source link: http://407711169.blog.51cto.com/6616996/1538541
