Installing the Monit Monitoring Software

Monit is a feature-rich tool for monitoring processes, files, directories, and devices on Linux/Unix platforms.

Steps to configure Monit on CentOS 6.4:

In this walkthrough Monit runs on the server at 10.153.126.189 and monitors two other servers, 10.153.110.12 and 10.153.75.78.

I. Install EPEL. On the command line, run:

# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm

II. Install Monit. On the command line, run:

# yum install monit -y

This step may fail with an error:

# yum install monit -y
Loaded plugins: fastestmirror, security
Determining fastest mirrors
Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again

Fix:

vi /etc/yum.repos.d/epel.repo

In the [epel] section, remove the # in front of baseurl and add a # in front of mirrorlist. The correct configuration looks like this:

[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
baseurl=http://download.fedoraproject.org/pub/epel/6/$basearch
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
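
The same edit can be scripted instead of made by hand in vi. The following is a minimal sketch, not part of the original steps: the function name use_epel_baseurl is made up for illustration, and it assumes the stock file has "#baseurl=" and "mirrorlist=" at the start of a line inside the [epel] section. It keeps a .bak copy of the file and touches only the [epel] section.

```shell
# Hypothetical helper: switch the [epel] section from mirrorlist to baseurl.
# $1 = path to the repo file (normally /etc/yum.repos.d/epel.repo)
use_epel_baseurl() {
    repo_file="$1"
    # Keep a backup, then rewrite only the lines between "[epel]" and the
    # next section header.
    cp "$repo_file" "$repo_file.bak" &&
    sed '/^\[epel\]$/,/^\[/{
        s/^#baseurl=/baseurl=/
        s/^mirrorlist=/#mirrorlist=/
    }' "$repo_file.bak" > "$repo_file"
}
```

Run as root, this would be called as use_epel_baseurl /etc/yum.repos.d/epel.repo, after which the yum install monit -y command can be retried.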

III. Monit is now installed. Next, configure the monit.conf file.

1. The file is located at /etc/monit.conf. Common settings to change:

1) Check interval and log location:

set daemon 120
   with start delay 240
set logfile syslog facility log_daemon

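Monit runs its checks every 120 seconds and waits 240 seconds after startup before the first check.

The set logfile line sets where the log goes, here syslog with the log_daemon facility.

2) ID file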

set idfile /var/monit/id
set eventqueue
     basedir /var/monit

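Two settings must be defined: "idfile", a file holding the unique ID of the Monit daemon, and "eventqueue", the place where alert mail is held temporarily when it cannot be sent because of an SMTP or network failure.
Also make sure the /var/monit path exists. With that in place, the configuration shown here can be used as is.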
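
Creating the directory before starting Monit could look like the sketch below. The function name prepare_monit_dir is hypothetical; the only path taken from the configuration above is /var/monit.

```shell
# Hypothetical helper: create the directory used for the id file and the
# event queue, and confirm the current user can write to it.
# $1 = base directory (the configuration above uses /var/monit)
prepare_monit_dir() {
    mkdir -p "$1" && [ -w "$1" ]
}
```

Run as root, the call would be prepare_monit_dir /var/monit; a non-zero exit status means the directory could not be created or is not writable.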

3) Configure the web interface:

set httpd port 1966 and
     SSL ENABLE
     PEMFILE  /var/certs/monit.pem
     allow monituser:romania
     allow localhost
     allow 192.168.0.0/16
     allow myhost.mydomain.ro

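2. Monitoring rules can be written directly in /etc/monit.conf, or kept in a separate file with a .cfg suffix that holds the check ... if statements and is pulled in by an include path added at the end of monit.conf.

set daemon 30         # poll server status every 30 seconds

set logfile /data/apps/monit/log/monit.log    # the default log is /var/log/monit

set idfile /var/.monit.id

set eventqueue
    basedir /data/apps/monit/data
    slots 10000

set httpd port 2812 and             # listen on port 2812
    use address 10.153.126.189      # address of this server; in this example Monit is installed on 10.153.126.189
    allow localhost
    allow 10.1.0.0/255.255.0.0
    allow admin : pin               # user name and password: admin is the user name, the part after the colon is the password

# Next, define the servers to monitor.

# "address" is followed by the server's IP address. The second line sets the port number. "exec" is followed by the script to run when a failure is detected. Multiple check ... if statements can be added to monitor many servers at once.

# Here the script /data/apps/monit/contrib/sms.py is responsible for sending alerts.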

check host read_kajuan_10.153.110.12 with address 10.153.110.12
      if failed port 80 with timeout 1 seconds for 2 cycles then exec "/data/apps/monit/contrib/sms.py"

check host read_kajuan_10.153.75.78 with address 10.153.75.78
      if failed port 80 with timeout 1 seconds for 2 cycles then exec "/data/apps/monit/contrib/sms.py"

#include /etc/monit.d/*
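
When many servers share the same rule, the check host stanzas can be generated into a separate .cfg file rather than typed by hand. This is a sketch under assumptions: the function names host_check and write_host_checks are invented for illustration, and the service-name prefix, port, and alert script path are copied from the two stanzas above.

```shell
# Print one "check host" stanza in the same shape as the stanzas above.
# $1 = service name prefix, $2 = IP address, $3 = port, $4 = alert script
host_check() {
    printf 'check host %s_%s with address %s\n' "$1" "$2" "$2"
    printf '      if failed port %s with timeout 1 seconds for 2 cycles then exec "%s"\n' "$3" "$4"
}

# Write stanzas for several servers into one file.
# $1 = output file; the remaining arguments are IP addresses.
write_host_checks() {
    out="$1"
    shift
    : > "$out"                       # start from an empty file
    for ip in "$@"; do
        host_check read_kajuan "$ip" 80 /data/apps/monit/contrib/sms.py >> "$out"
        printf '\n' >> "$out"
    done
}
```

A call such as write_host_checks /etc/monit.d/hosts.cfg 10.153.110.12 10.153.75.78 would produce the two stanzas shown above, provided the include line is uncommented. Running monit -t afterwards checks the control file syntax before Monit is reloaded.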

IV. Common checks:

1. Check that a web server port is alive, by IP address and port:

check host gamecenter_api_10.153.123.2 with address 10.153.123.2
    if failed port 8093 with timeout 1 seconds for 2 cycles then exec "/data/apps/monit/contrib/sms.py"

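Meaning: if the connection to the port fails, with a 1-second timeout, in 2 consecutive monitoring cycles, the alert script is run.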
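
The source does not show the contents of sms.py. As a stand-in, here is a shell sketch of what an alert script run by exec might do. It relies on the MONIT_DATE, MONIT_SERVICE, MONIT_HOST, and MONIT_DESCRIPTION environment variables that the Monit manual lists for exec'd programs; whether a given Monit version sets all of them should be verified. The function names, the ALERT_LOG variable, and the log file path are hypothetical, and the delivery step only appends to a file instead of calling a real SMS gateway.

```shell
# Build a one-line alert text from the variables Monit exports to the
# programs it runs. Defaults cover running the script by hand.
alert_text() {
    printf '[%s] %s on %s: %s\n' \
        "${MONIT_DATE:-unknown date}" \
        "${MONIT_SERVICE:-unknown service}" \
        "${MONIT_HOST:-unknown host}" \
        "${MONIT_DESCRIPTION:-no description}"
}

# Hypothetical delivery step: append the text to a log file.
# A real script would hand the text to an SMS or mail gateway here.
send_alert() {
    alert_text >> "${ALERT_LOG:-/tmp/monit-alerts.log}"
}
```

Saved as an executable script that ends with a call to send_alert, this could be referenced from a then exec "..." action by its full path.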

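2. Monitor a service process by its pid file:

check process tomcat with pidfile /var/run/catalina.pid     # pid file of the process
    start program = "/etc/init.d/tomcat start"              # start command
    stop program  = "/etc/init.d/tomcat stop"               # stop command
    if 9 restarts within 10 cycles then timeout             # after 9 restarts within 10 monitoring cycles, time out and stop monitoring this service; the reason is explained below
    if cpu usage > 90% for 5 cycles then alert              # alert if the service's CPU usage stays above 90% for 5 cycles
    if failed url http://127.0.0.1:4000/ timeout 120 seconds for 5 cycles then restart # if the URL fails to open for 5 consecutive cycles (120-second timeout; a timeout counts as a failure), restart the service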

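Monitoring stops after the timeout so that the service is not restarted forever. If several restarts in a row have failed, further restarts are very unlikely to succeed. Restarting tomcat also consumes a lot of system resources, so restarting it endlessly could keep other services from running properly.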

3. Monit can also monitor the server it runs on:

# System name; may be an IP address or a domain name
check system www.example.com
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert

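4. Examples:

#
# Monitor nginx
#
# The process pid file must be given
check process nginx with pidfile /var/run/nginx.pid
    # start command; note: must be the full path to the command
    start program = "/etc/init.d/nginx start"
    # stop command
    stop program  = "/etc/init.d/nginx stop"
    # nginx connectivity test: if nginx can no longer be reached, restart it automatically
    if failed host www.example.com port 80 protocol http then restart
    # after repeated failed restarts, stop trying; this indicates a serious system error
    if 3 restarts within 5 cycles then timeout
    # optional: group name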
    group server

#   optional check on the SSL port, if there is one
#    if failed port 443 type tcpssl protocol http
#       with timeout 15 seconds
#       then restart

#
# Monitor apache
#
check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program  = "/etc/init.d/apache2 stop"
    # apache is heavy on CPU and memory, so add some extra checks for those
    if cpu > 50% for 2 cycles then alert
    if cpu > 70% for 5 cycles then restart
    if totalmem > 1500 MB for 10 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 20 cycles then stop
    if failed host www.example.com port 8080 protocol http then restart
    if 3 restarts within 5 cycles then timeout
    group server
    # optional: depends on nginx
    depends on nginx

#
# Monitor the spawn-fcgi process (in effect the FastCGI process)
#
check process spawn-fcgi with pidfile /var/run/spawn-fcgi.pid
    # spawn-fcgi only writes a pid file when given the -P option; it does not by default
    start program = "/usr/bin/spawn-fcgi -a 127.0.0.1 -p 8081 -C 10 -u userxxx -g groupxxx -P /var/run/spawn-fcgi.pid -f /usr/bin/php-cgi"
    stop program = "/usr/bin/killall /usr/bin/php-cgi"
    # FastCGI does not speak HTTP, and Monit's protocol option has no CGI setting, so simply leave out protocol http.
    if failed host 127.0.0.1 port 8081 then restart
    if 3 restarts within 5 cycles then timeout
    group server
    depends on nginx
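
Every check process block above depends on a pid file that exists and names a running process. Before pointing Monit at a pid file, it can be checked by hand with a small helper like the sketch below. The function name pidfile_alive is made up for illustration, and the check relies on kill -0, which also fails when the caller lacks permission to signal the process, so it is best run as root.

```shell
# Return 0 if the pid file exists and names a live process, non-zero otherwise.
# $1 = path to the pid file
pidfile_alive() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1          # missing or unreadable file
    pid=$(cat "$pidfile") || return 1
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;           # empty, or not a plain number
    esac
    kill -0 "$pid" 2>/dev/null             # signal 0 only tests for existence
}
```

For example, pidfile_alive /var/run/spawn-fcgi.pid failing right after spawn-fcgi has been started suggests that the -P option was left out.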

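Notes:

The commands in the start and stop program options must use full paths, otherwise Monit will not start properly; for example, killall should be written as /usr/bin/killall.
Many people use spawn-fcgi to manage PHP FastCGI processes, but spawn-fcgi itself can also die, so it still needs to be monitored by Monit. spawn-fcgi must be given the -P option or there will be no pid file. FastCGI does not speak HTTP and Monit's protocol option has no CGI setting, so the protocol http setting has to be removed for the check to work.
After repeated failed restarts Monit stops trying to restart the process. A notification mail of this kind means the system has a serious problem that deserves attention and prompt manual intervention.
Besides managing processes, Monit can also monitor files, directories, devices, and more. That is not covered here; see the official Monit documentation for the configuration details.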

References:

http://www.cnblogs.com/ddr888/archive/2011/03/02/1969087.html

http://feilong.me/2011/02/monitor-core-processes-with-monit

http://www.vpser.net/manage/monit.html

http://itoedr.blog.163.com/blog/static/1202842972014529115715267/

https://www.rails365.net/articles/bu-shu-zhi-shi-yong-monit-lai-jian-kong-fu-wu-si

http://linuxjcq.blog.51cto.com/3042600/717843

https://segmentfault.com/a/1190000002867212
