Case Studies: Monitoring Scripts Based on Ping and Telnet/NC

Date: 2022-09-25 15:19:24

 

Case 1: Plain ping monitoring of a list of IPs

[root@test opt]# cat /opt/hosts_ip_list 
192.168.10.10 
192.168.10.11
192.168.10.12
192.168.10.13
192.168.10.14
192.168.10.15
192.168.10.16
192.168.10.17

[root@test opt]# cat /opt/hosts_ip_monit.sh
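#!/bin/bash
# Ping each ip 3 times; the ip is treated as unreachable only when all 3 pings fail.
for ip in $(cat /opt/hosts_ip_list)
  do
     ping -c 1 "$ip" &>/dev/null
     a=$?
     sleep 2
     ping -c 1 "$ip" &>/dev/null
     b=$?
     sleep 2
     ping -c 1 "$ip" &>/dev/null
     c=$?
     DATE=$(date +"%F %H:%M")
     ip_re=${ip//./\\.}                 # escape the dots so sed matches them literally
     if [ "$a" -ne 0 ] && [ "$b" -ne 0 ] && [ "$c" -ne 0 ];then
         echo -e "Date : $DATE\nHost : $ip\nProblem : Ping is failed."
         # comment out the binding; requiring whitespace after the ip keeps
         # 192.168.10.1 from also matching 192.168.10.10
         /bin/sed -i "s/^${ip_re}\([[:space:]]\)/#${ip}\1/" /etc/hosts
     else
         echo "$ip ping is successful."
         # restore the binding if it was commented out earlier
         /bin/sed -i "s/^#${ip_re}\([[:space:]]\)/${ip}\1/" /etc/hosts
     fi
done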

[root@test opt]# chmod 755 /opt/hosts_ip_monit.sh 

[root@test opt]# sh /opt/hosts_ip_monit.sh 
Date : 2018-04-24 15:49
Host : 192.168.10.10
Problem : Ping is failed.
Date : 2018-04-24 15:50
Host : 192.168.10.11
Problem : Ping is failed.
192.168.10.12 ping is successful.
192.168.10.13 ping is successful.
192.168.10.14 ping is successful.
192.168.10.15 ping is successful.
192.168.10.16 ping is successful.
Date : 2018-04-24 15:51
Host : 192.168.10.17
Problem : Ping is failed.
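
The three-attempt rule used in the script can be factored into a small helper. What follows is a minimal sketch, not the original author's code: the function name all_attempts_fail and its arguments are hypothetical, and the probe is passed in as a command so the logic can be tried without touching the network.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 (true) only if every one of N attempts of
# the given probe command fails; returns 1 as soon as one attempt succeeds.
# Usage: all_attempts_fail <attempts> <delay_seconds> <command> [args...]
all_attempts_fail() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 1            # one success is enough: host counts as reachable
        fi
        # pause between attempts, but not after the last one
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 0
}

# Example with a real probe (assumes the iputils ping options -c and -W):
#   if all_attempts_fail 3 2 ping -c 1 -W 2 192.168.10.10; then
#       echo "192.168.10.10: all 3 pings failed"
#   fi
```

Unlike the script above, this sketch stops at the first successful ping, which gives the same verdict while shortening the run for healthy hosts.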

Case 2: Ping monitoring with email alerts for the IP mappings in /etc/hosts

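The test system server needs to access the domain www.test.com. The domain resolves to many addresses, so it is bound in /etc/hosts on the test system server, with one entry per address.
When the name is resolved, the host bindings are matched from top to bottom. If the first active binding points to an unreachable IP, requests to the domain fail; the system does not move on to the next binding by itself unless the unreachable binding is commented out or deleted.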
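
To see which address a name would get under this top-to-bottom rule, the lookup can be mimicked on a hosts-format file. This is only an illustrative sketch: first_ip_for is a hypothetical helper, it reads the file given as its second argument, and it does not query the system resolver.

```shell
#!/bin/bash
# Hypothetical helper: print the IP of the first active (uncommented) line
# in a hosts-format file that lists the given name.
# Usage: first_ip_for <name> <hosts_file>
first_ip_for() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                          # skip commented-out bindings
        {
            for (i = 2; i <= NF; i++) {
                if (substr($i, 1, 1) == "#") break   # ignore a trailing comment
                if ($i == name) { print $1; exit }   # first match wins
            }
        }
    ' "$2"
}

# Example:
#   first_ip_for www.test.com /etc/hosts
```

The function prints nothing when no active line lists the name.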

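Requirement:
When an IP bound in /etc/hosts fails and can no longer be pinged, comment out its binding automatically and send an email alert; once the IP is reachable again, restore its binding automatically.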
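
The comment-out and restore steps can be sketched as two small functions. This is a minimal sketch under stated assumptions rather than the script used in this case: the names comment_ip and uncomment_ip are hypothetical, the file to edit is passed as an argument (so it can be tried on a copy instead of the real /etc/hosts), and the first field is compared as an exact string, so 192.168.10.17 never matches 192.168.10.176.

```shell
#!/bin/bash
# Hypothetical helpers that toggle a binding in a hosts-format file.

# Put a "#" in front of every active line whose first field is exactly <ip>.
# Usage: comment_ip <ip> <hosts_file>
comment_ip() {
    local ip=$1 file=$2 tmp rc
    tmp=$(mktemp) || return 1
    awk -v ip="$ip" '$1 == ip { print "#" $0; next } { print }' "$file" > "$tmp" &&
        cat "$tmp" > "$file"      # overwrite in place, keeping owner and mode
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Remove the "#" from every line whose first field is exactly "#<ip>".
# Usage: uncomment_ip <ip> <hosts_file>
uncomment_ip() {
    local ip=$1 file=$2 tmp rc
    tmp=$(mktemp) || return 1
    awk -v ip="$ip" '$1 == ("#" ip) { sub(/#/, ""); print; next } { print }' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example on a copy of the file:
#   cp /etc/hosts /tmp/hosts.copy
#   comment_ip 192.168.10.10 /tmp/hosts.copy
#   uncomment_ip 192.168.10.10 /tmp/hosts.copy
```

Comparing whole fields avoids regular expressions altogether, so neither the dots in the address nor a longer address sharing the same prefix can cause a false match.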
 
[root@cx-app01 ~]# cat /etc/hosts
#192.168.10.10 www.test.com
#192.168.10.11 www.test.com
192.168.10.12 www.test.com
192.168.10.13 www.test.com
192.168.10.14 www.test.com
192.168.10.15 www.test.com
192.168.10.16 www.test.com
#192.168.10.17 www.test.com
 
[root@cx-app01 ~]# ping www.test.com
PING www.test.com (192.168.10.12) 56(84) bytes of data.
64 bytes from www.test.com (192.168.10.12): icmp_seq=1 ttl=50 time=31.1 ms
64 bytes from www.test.com (192.168.10.12): icmp_seq=2 ttl=50 time=30.7 ms
64 bytes from www.test.com (192.168.10.12): icmp_seq=3 ttl=50 time=30.8 ms
.......
 
[root@cx-app01 ~]# cat /opt/hosts_ip_list
192.168.10.10
192.168.10.11
192.168.10.12
192.168.10.13
192.168.10.14
192.168.10.15
192.168.10.16
192.168.10.17
 
[root@cx-app01 ~]# cat /opt/hosts_ip_monit.sh
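#!/bin/bash
# Ping each ip 3 times; the ip is treated as unreachable only when all 3 pings fail.
for ip in $(cat /opt/hosts_ip_list)
  do
     ping -c 1 "$ip" &>/dev/null
     a=$?
     sleep 2
     ping -c 1 "$ip" &>/dev/null
     b=$?
     sleep 2
     ping -c 1 "$ip" &>/dev/null
     c=$?
     DATE=$(date +"%F %H:%M")
     ip_re=${ip//./\\.}                 # escape the dots so grep/sed match them literally
     if [ "$a" -ne 0 ] && [ "$b" -ne 0 ] && [ "$c" -ne 0 ];then
         echo -e "Date : $DATE\nHost : $ip\nProblem : Ping is failed."
         # is the binding already commented out? (exact ip followed by whitespace)
         grep -q "^#${ip_re}[[:space:]]" /etc/hosts
         d=$?
           if [ $d -ne 0 ];then
              /bin/bash /opt/sendemail.sh zhangsan@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME failed to connect to $ip; its mapping in /etc/hosts has been commented out"
              /bin/bash /opt/sendemail.sh lisi@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME failed to connect to $ip; its mapping in /etc/hosts has been commented out"
              /bin/bash /opt/sendemail.sh liuwu@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME failed to connect to $ip; its mapping in /etc/hosts has been commented out"
              /bin/sed -i "s/^${ip_re}\([[:space:]]\)/#${ip}\1/" /etc/hosts
           else
              echo "$ip is still unreachable; its binding is already commented out"
           fi
     else
         echo "$ip ping is successful."
         grep -q "^#${ip_re}[[:space:]]" /etc/hosts
         f=$?
           if [ $f -eq 0 ];then
              /bin/bash /opt/sendemail.sh zhangsan@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME connected to $ip successfully; its mapping in /etc/hosts has been restored"
              /bin/bash /opt/sendemail.sh lisi@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME connected to $ip successfully; its mapping in /etc/hosts has been restored"
              /bin/bash /opt/sendemail.sh liuwu@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME connected to $ip successfully; its mapping in /etc/hosts has been restored"
              /bin/sed -i "s/^#${ip_re}\([[:space:]]\)/${ip}\1/" /etc/hosts
           else
              echo "$ip binding is already active"
           fi
     fi
done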
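
The repeated sendemail.sh calls in the script differ only in the recipient, so they can be collapsed into a loop. A sketch, with assumptions: notify_all, SENDER and RECIPIENTS are hypothetical names; SENDER defaults to the /opt/sendemail.sh wrapper called by the script and can be pointed at a stub for a dry run.

```shell
#!/bin/bash
# Hypothetical helper: send one subject/body pair to a list of recipients.
# SENDER is an assumption of this sketch; it must accept <to> <subject> <body>.
SENDER=${SENDER:-"/bin/bash /opt/sendemail.sh"}
RECIPIENTS="zhangsan@test.com lisi@test.com liuwu@test.com"

# Usage: notify_all <subject> <body>
notify_all() {
    local subject=$1 body=$2 to
    for to in $RECIPIENTS; do
        # $SENDER is left unquoted on purpose so "bash script" splits into words
        $SENDER "$to" "$subject" "$body"
    done
}

# Dry run with a stub instead of real mail:
#   SENDER=echo
#   notify_all "test subject" "test body"
```

Adding or removing a recipient then means editing one variable instead of several near-identical lines.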
 
 
Alert emails are sent with sendEmail. For sendEmail deployment, see: http://www.cnblogs.com/kevingrace/p/5961861.html
[root@cx-app01 ~]# cat /opt/sendemail.sh
#!/bin/bash
# Filename: SendEmail.sh
# Notes: uses sendEmail
#
# Log file of this script
LOGFILE="/tmp/Email.log"
:>"$LOGFILE"
exec 1>"$LOGFILE"
exec 2>&1
SMTP_server='smtp.test.com'
username='monit@test.com'
password='monit@123'
from_email_address='monit@test.com'
to_email_address="$1"
message_subject_utf8="$2"
message_body_utf8="$3"
# Convert the subject to GB2312 so that a subject containing Chinese is not garbled in the received mail.
message_subject_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_subject_utf8
EOF`
[ $? -eq 0 ] && message_subject="$message_subject_gb2312" || message_subject="$message_subject_utf8"
# Convert the body to GB2312 so that the body of the received mail is not garbled.
message_body_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_body_utf8
EOF`
[ $? -eq 0 ] && message_body="$message_body_gb2312" || message_body="$message_body_utf8"
# Send the mail
sendEmail='/usr/local/bin/sendEmail'
set -x
$sendEmail -s "$SMTP_server" -xu "$username" -xp "$password" -f "$from_email_address" -t "$to_email_address" -u "$message_subject" -m "$message_body" -o message-content-type=text -o message-charset=gb2312
 
 
Run the monitoring script every 10 minutes:
[root@cx-app01 ~]# crontab -l
*/10 * * * *  /bin/bash -x /opt/hosts_ip_monit.sh > /dev/null 2>&1

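Case 3: Using nc to check whether port 443 on the IPs in /etc/hosts is reachable from the local host

Case 2 is a ping-based monitoring script. The following script uses nc to check whether a port is reachable:

Check whether the local host can reach port 443 on each IP in the /etc/hosts file below. If the connection fails, send an alert and comment out the binding of that IP in /etc/hosts.
If port 443 on a commented-out IP becomes reachable again, remove the comment from that IP in /etc/hosts.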
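
The rule above boils down to a small decision table over two facts: whether the binding is currently commented out, and whether the probe succeeded. The sketch below only illustrates that table; decide_action and its output strings are hypothetical names, not part of the monitoring script itself.

```shell
#!/bin/bash
# Hypothetical helper: map (binding state, probe result) to an action.
#   $1: "active" or "commented"  (state of the ip in /etc/hosts)
#   $2: exit status of the probe (0 = port reachable)
decide_action() {
    local state=$1 probe_rc=$2
    if [ "$state" = "active" ] && [ "$probe_rc" -ne 0 ]; then
        echo "comment_and_alert"      # newly failed: disable binding, send mail
    elif [ "$state" = "commented" ] && [ "$probe_rc" -eq 0 ]; then
        echo "restore_and_alert"      # recovered: re-enable binding, send mail
    else
        echo "none"                   # no state change, so no mail
    fi
}

# Example:
#   decide_action active 1
```

Mail is sent only on a state change, so a host that stays down does not trigger a new alert on every scheduled run.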

[root@cx-app01 ~]# cat /etc/hosts
192.168.10.201 www.test.com
192.168.10.205  www.test.com
192.168.10.17  www.test.com
192.168.10.85  www.test.com
192.168.10.176   www.test.com
192.168.10.245  www.test.com
192.168.10.25    www.test.com
192.168.10.47  www.test.com

[root@cx-app01 ~]# cat /opt/hosts_ip_list 
192.168.10.201
192.168.10.205
192.168.10.17
192.168.10.85
192.168.10.176
192.168.10.245
192.168.10.25
192.168.10.47

Use nc to check whether a port is reachable (yum install -y nc):
[root@cx-app01 ~]# /usr/bin/nc -z  -w 10 192.168.10.201 443
Connection to 192.168.10.201 443 port [tcp/https] succeeded!
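
Not every netcat build accepts -z (some older Ncat releases, for instance, do not). Where that is the case, bash can open the TCP connection itself through its /dev/tcp redirection. The following is a hedged sketch, not part of the original setup: port_open is a hypothetical helper, and it assumes a bash built with /dev/tcp support plus the coreutils timeout command.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a TCP connection to <host>:<port> can be
# opened within <timeout> seconds (default 10).
# Usage: port_open <host> <port> [timeout_seconds]
port_open() {
    local host=$1 port=$2 secs=${3:-10}
    # The child bash receives host and port as $0 and $1; opening fd 3 on
    # /dev/tcp/<host>/<port> makes bash attempt the TCP connection.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example:
#   port_open 192.168.10.201 443 10 && echo "port 443 is reachable"
```

The exit status is 0 when the connection opens and non-zero when it is refused or the timeout expires, so the function can stand in for the nc call in an if statement.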

Check port 443 on every address in the IP list above in one pass. The script is as follows:
[root@cx-app01 ~]# cat /opt/host_ip_nc_monit.sh 
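#!/bin/bash
for ip in $(cat /opt/hosts_ip_list)
do
    DATE=$(date +"%F %H:%M")           # timestamp for the report header
    echo -e "Date : $DATE\nHost : $ip\nProblem : Port 443 is connected."
    ip_re=${ip//./\\.}                 # escape the dots so grep/sed match them literally
    # is the binding commented out? Requiring whitespace after the ip keeps
    # 192.168.10.17 from also matching 192.168.10.176
    grep -q "^#${ip_re}[[:space:]]" /etc/hosts
    a=$?
    if [ $a -ne 0 ];then
       # binding is active: alert and comment it out if port 443 is unreachable
       /usr/bin/nc -z -w 10 "$ip" 443
       b=$?
       if [ $b -ne 0 ];then
          /bin/bash /opt/sendemail.sh wangshibo@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME failed to connect to port 443 on $ip; its mapping in /etc/hosts has been commented out"
          /bin/bash /opt/sendemail.sh linan@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME failed to connect to port 443 on $ip; its mapping in /etc/hosts has been commented out"
          /bin/sed -i "s/^${ip_re}\([[:space:]]\)/#${ip}\1/" /etc/hosts
       else
          echo "${HOSTNAME}: port 443 on $ip is reachable"
       fi
    else
       # binding is commented out: alert and restore it once port 443 is reachable again
       /usr/bin/nc -z -w 10 "$ip" 443
       c=$?
       if [ $c -eq 0 ];then
         /bin/bash /opt/sendemail.sh wangshibo@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME connected to port 443 on $ip successfully; its mapping in /etc/hosts has been restored"
         /bin/bash /opt/sendemail.sh linan@test.com "Connectivity between the test system and www.test.com" "$HOSTNAME connected to port 443 on $ip successfully; its mapping in /etc/hosts has been restored"
         /bin/sed -i "s/^#${ip_re}\([[:space:]]\)/${ip}\1/" /etc/hosts
       else
         echo "${HOSTNAME}: port 443 on $ip is still unreachable"
       fi
    fi
done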

Make the script executable:
[root@cx-app01 ~]# chmod 755 /opt/host_ip_nc_monit.sh

Run the script:
[root@cx-app01 ~]# sh /opt/host_ip_nc_monit.sh
Date : 
Host : 192.168.10.201
Problem : Port 443 is connected.
Connection to 192.168.10.201 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.201 is reachable
Date : 
Host : 192.168.10.205
Problem : Port 443 is connected.
Connection to 192.168.10.205 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.205 is reachable
Date : 
Host : 192.168.10.17
Problem : Port 443 is connected.
Connection to 192.168.10.17 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.17 is reachable
Date : 
Host : 192.168.10.85
Problem : Port 443 is connected.
Connection to 192.168.10.85 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.85 is reachable
Date : 
Host : 192.168.10.176
Problem : Port 443 is connected.
Connection to 192.168.10.176 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.176 is reachable
Date : 
Host : 192.168.10.245
Problem : Port 443 is connected.
Connection to 192.168.10.245 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.245 is reachable
Date : 
Host : 192.168.10.25
Problem : Port 443 is connected.
Connection to 192.168.10.25 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.25 is reachable
Date : 
Host : 192.168.10.47
Problem : Port 443 is connected.
Connection to 192.168.10.47 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn: port 443 on 192.168.10.47 is reachable

Schedule it with crontab:
[root@cx-app01 ~]# crontab -l
*/10 * * * *  /bin/bash -x /opt/host_ip_nc_monit.sh > /dev/null 2>&1