Case 1: Simple ping monitoring of specific IPs
[root@test opt]# cat /opt/hosts_ip_list
192.168.10.10
192.168.10.11
192.168.10.12
192.168.10.13
192.168.10.14
192.168.10.15
192.168.10.16
192.168.10.17

[root@test opt]# cat /opt/hosts_ip_monit.sh
#!/bin/bash
for ip in $(cat /opt/hosts_ip_list)
do
    # Ping three times; the IP is judged unreachable only when all three pings fail.
    ping -c 1 "$ip" &>/dev/null
    a=$?
    sleep 2
    ping -c 1 "$ip" &>/dev/null
    b=$?
    sleep 2
    ping -c 1 "$ip" &>/dev/null
    c=$?
    sleep 2

    DATE=$(date "+%F %H:%M")
    if [ $a -ne 0 ] && [ $b -ne 0 ] && [ $c -ne 0 ]; then
        echo -e "Date : $DATE\nHost : $ip\nProblem : Ping is failed."
        # Comment out the entry for this IP in /etc/hosts
        /bin/sed -i "s/^$ip/#$ip/g" /etc/hosts
    else
        echo "$ip ping is successful."
        # Restore the entry if it had been commented out earlier
        /bin/sed -i "s/^#$ip/$ip/g" /etc/hosts
    fi
done

[root@test opt]# chmod 755 /opt/hosts_ip_monit.sh

[root@test opt]# sh /opt/hosts_ip_monit.sh
Date : 2018-04-24 15:49
Host : 192.168.10.10
Problem : Ping is failed.
Date : 2018-04-24 15:50
Host : 192.168.10.11
Problem : Ping is failed.
192.168.10.12 ping is successful.
192.168.10.13 ping is successful.
192.168.10.14 ping is successful.
192.168.10.15 ping is successful.
192.168.10.16 ping is successful.
Date : 2018-04-24 15:51
Host : 192.168.10.17
Problem : Ping is failed.
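The three ping/sleep blocks in the script above repeat the same pattern, so the "all attempts failed" check can be factored into a small helper. The following is only a sketch: the function name all_attempts_fail and the RETRY_DELAY variable are made up for illustration and are not part of the original script. Unlike the original, it stops at the first successful attempt instead of always running all three.

```shell
#!/bin/bash
# all_attempts_fail N CMD [ARGS...]   (hypothetical helper, not in the original script)
# Returns 0 only when CMD fails N times in a row; returns 1 on the first success.
# RETRY_DELAY is an assumed knob: seconds to wait after each failed attempt (default 2).
all_attempts_fail() {
    _attempts=$1
    shift
    _i=0
    while [ "$_i" -lt "$_attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 1    # one success is enough to treat the host as reachable
        fi
        _i=$((_i + 1))
        sleep "${RETRY_DELAY:-2}"
    done
    return 0
}
```

Inside the loop over /opt/hosts_ip_list it could be called as: if all_attempts_fail 3 ping -c 1 "$ip"; then ... fi, with the failure branch doing the same echo and sed as before.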
Case 2: Ping monitoring and alerting for the IP mappings in /etc/hosts
The test system server needs to access the domain www.test.com. The domain resolves to many addresses, so host bindings have to be set up on the test system server, and /etc/hosts contains many bindings for www.test.com.
When the name is resolved, the bindings are matched from top to bottom. If the first bound IP is unreachable, access to the domain fails; the resolver does not move on to the next bound address unless the unreachable binding is commented out or deleted.

Requirement:
When an IP bound in /etc/hosts fails and can no longer be pinged, automatically comment out its binding and send an email alert. When the IP becomes reachable again, automatically re-enable its binding.

[root@cx-app01 ~]# cat /etc/hosts
#192.168.10.10 www.test.com
#192.168.10.11 www.test.com
192.168.10.12 www.test.com
192.168.10.13 www.test.com
192.168.10.14 www.test.com
192.168.10.15 www.test.com
192.168.10.16 www.test.com
#192.168.10.17 www.test.com

[root@cx-app01 ~]# ping www.test.com
PING www.test.com (192.168.10.12) 56(84) bytes of data.
64 bytes from www.test.com (192.168.10.12): icmp_seq=1 ttl=50 time=31.1 ms
64 bytes from www.test.com (192.168.10.12): icmp_seq=2 ttl=50 time=30.7 ms
64 bytes from www.test.com (192.168.10.12): icmp_seq=3 ttl=50 time=30.8 ms
.......

[root@cx-app01 ~]# cat /opt/hosts_ip_list
192.168.10.10
192.168.10.11
192.168.10.12
192.168.10.13
192.168.10.14
192.168.10.15
192.168.10.16
192.168.10.17

[root@cx-app01 ~]# cat /opt/hosts_ip_monit.sh
#!/bin/bash
for ip in $(cat /opt/hosts_ip_list)
do
    # Ping three times; the IP is judged unreachable only when all three pings fail.
    ping -c 1 "$ip" &>/dev/null
    a=$?
    sleep 2
    ping -c 1 "$ip" &>/dev/null
    b=$?
    sleep 2
    ping -c 1 "$ip" &>/dev/null
    c=$?
    sleep 2

    DATE=$(date "+%F %H:%M")
    if [ $a -ne 0 ] && [ $b -ne 0 ] && [ $c -ne 0 ]; then
        echo -e "Date : $DATE\nHost : $ip\nProblem : Ping is failed."
        # Is the entry already commented out?
        grep "^#$ip" /etc/hosts
        d=$?
        if [ $d -ne 0 ]; then
            # Still active: alert, then comment it out
            /bin/bash /opt/sendemail.sh zhangsan@test.com "Test system connectivity to www.test.com" "$HOSTNAME failed to connect to $ip; its mapping in /etc/hosts has been commented out"
            /bin/bash /opt/sendemail.sh lisi@test.com "Test system connectivity to www.test.com" "$HOSTNAME failed to connect to $ip; its mapping in /etc/hosts has been commented out"
            /bin/bash /opt/sendemail.sh liuwu@test.com "Test system connectivity to www.test.com" "$HOSTNAME failed to connect to $ip; its mapping in /etc/hosts has been commented out"
            /bin/sed -i "s/^$ip/#$ip/g" /etc/hosts
        else
            echo "$ip is not connected, and it has been done"
        fi
    else
        echo "$ip ping is successful."
        # Was the entry commented out by an earlier run?
        grep "^#$ip" /etc/hosts
        f=$?
        if [ $f -eq 0 ]; then
            # Recovered: alert, then restore the entry
            /bin/bash /opt/sendemail.sh zhangsan@test.com "Test system connectivity to www.test.com" "$HOSTNAME connected to $ip successfully; its mapping in /etc/hosts has been restored"
            /bin/bash /opt/sendemail.sh lisi@test.com "Test system connectivity to www.test.com" "$HOSTNAME connected to $ip successfully; its mapping in /etc/hosts has been restored"
            /bin/bash /opt/sendemail.sh liuwu@test.com "Test system connectivity to www.test.com" "$HOSTNAME connected to $ip successfully; its mapping in /etc/hosts has been restored"
            /bin/sed -i "s/^#$ip/$ip/g" /etc/hosts
        else
            echo "$ip connection has been restored"
        fi
    fi
done

Email alerts are sent with sendemail. For sendemail deployment, see: http://www.cnblogs.com/kevingrace/p/5961861.html

[root@cx-app01 ~]# cat /opt/sendemail.sh
#!/bin/bash
# Filename: SendEmail.sh
# Notes: uses sendEmail
#
# Log file for this script
LOGFILE="/tmp/Email.log"
:>"$LOGFILE"
exec 1>"$LOGFILE"
exec 2>&1

SMTP_server='smtp.test.com'
username='monit@test.com'
password='monit@123'
from_email_address='monit@test.com'
to_email_address="$1"
message_subject_utf8="$2"
message_body_utf8="$3"

# Convert the subject to GB2312 so that a subject containing Chinese is not garbled in the received mail.
message_subject_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_subject_utf8
EOF`
[ $? -eq 0 ] && message_subject="$message_subject_gb2312" || message_subject="$message_subject_utf8"

# Convert the body to GB2312 so that the received mail body is not garbled.
message_body_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_body_utf8
EOF`
[ $? -eq 0 ] && message_body="$message_body_gb2312" || message_body="$message_body_utf8"

# Send the mail
sendEmail='/usr/local/bin/sendEmail'
set -x
$sendEmail -s "$SMTP_server" -xu "$username" -xp "$password" -f "$from_email_address" -t "$to_email_address" -u "$message_subject" -m "$message_body" -o message-content-type=text -o message-charset=gb2312

Run the monitoring script every 10 minutes:
[root@cx-app01 ~]# crontab -l
*/10 * * * * /bin/bash -x /opt/hosts_ip_monit.sh > /dev/null 2>&1
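One caveat with the sed patterns in the monitoring script: the IP is used as an unescaped prefix, so an entry such as 192.168.10.1 would also match lines beginning with 192.168.10.10. A stricter variant might escape the dots and require whitespace after the address. The sketch below takes that approach; the function names (escape_ip, apply_sed, comment_host_entry, uncomment_host_entry) are hypothetical, and it writes through a temporary file instead of using sed -i, which is a design choice of this example rather than something the original does.

```shell
#!/bin/bash
# All function names below are hypothetical helpers, not part of the original script.

# escape_ip IP: print IP with each dot escaped for use in a sed regex.
escape_ip() {
    printf '%s\n' "$1" | sed 's/\./\\./g'
}

# apply_sed FILE EXPR: run a sed expression over FILE and write the result back.
apply_sed() {
    _tmp=$(mktemp) || return 1
    if sed "$2" "$1" >"$_tmp"; then
        cat "$_tmp" >"$1"
        _rc=$?
    else
        _rc=1
    fi
    rm -f "$_tmp"
    return "$_rc"
}

# comment_host_entry FILE IP: comment out active lines whose first field is exactly IP.
comment_host_entry() {
    _re=$(escape_ip "$2")
    apply_sed "$1" "s/^${_re}\([[:space:]]\)/#$2\1/"
}

# uncomment_host_entry FILE IP: re-enable lines previously commented out for exactly IP.
uncomment_host_entry() {
    _re=$(escape_ip "$2")
    apply_sed "$1" "s/^#${_re}\([[:space:]]\)/$2\1/"
}
```

In the monitoring script, the two sed -i calls would then become comment_host_entry /etc/hosts "$ip" and uncomment_host_entry /etc/hosts "$ip". Taking the file as a parameter also makes it possible to try the logic on a copy before pointing it at /etc/hosts.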
Case 3: Using nc to check whether port 443 on the IPs in /etc/hosts is reachable from this host
Case 2 is a monitoring script built on ping. The script below uses nc to check whether a port is reachable instead.
It probes port 443 on each IP address in the /etc/hosts file below. If the connection fails, it sends an alert and comments out that IP's binding in /etc/hosts.
If port 443 on a commented-out IP becomes reachable from this host again, the comment on that IP in /etc/hosts is removed.

[root@cx-app01 ~]# cat /etc/hosts
192.168.10.201 www.test.com
192.168.10.205 www.test.com
192.168.10.17 www.test.com
192.168.10.85 www.test.com
192.168.10.176 www.test.com
192.168.10.245 www.test.com
192.168.10.25 www.test.com
192.168.10.47 www.test.com

[root@cx-app01 ~]# cat /opt/hosts_ip_list
192.168.10.201
192.168.10.205
192.168.10.17
192.168.10.85
192.168.10.176
192.168.10.245
192.168.10.25
192.168.10.47

Use nc to probe whether the port is reachable (yum install -y nc):
[root@cx-app01 ~]# /usr/bin/nc -z -w 10 192.168.10.201 443
Connection to 192.168.10.201 443 port [tcp/https] succeeded!

The script below probes port 443 on every address in the IP list above:
[root@cx-app01 ~]# cat /opt/host_ip_nc_monit.sh
#!/bin/bash
for ip in $(cat /opt/hosts_ip_list)
do
    # Note: DATE is never assigned in this script, so the Date field is empty in the
    # output further down. This header is also printed before the port is probed.
    echo -e "Date : $DATE\nHost : $ip\nProblem : Port 443 is connected."
    # Is the entry currently commented out?
    grep "^#$ip" /etc/hosts
    a=$?
    if [ $a -ne 0 ]; then
        # Entry is active: probe it, and disable it on failure
        /usr/bin/nc -z -w 10 "$ip" 443
        b=$?
        if [ $b -ne 0 ]; then
            /bin/bash /opt/sendemail.sh wangshibo@test.com "Test system connectivity to www.test.com" "$HOSTNAME failed to connect to port 443 on $ip; its mapping in /etc/hosts has been commented out"
            /bin/bash /opt/sendemail.sh linan@test.com "Test system connectivity to www.test.com" "$HOSTNAME failed to connect to port 443 on $ip; its mapping in /etc/hosts has been commented out"
            /bin/sed -i "s/^$ip/#$ip/g" /etc/hosts
        else
            echo "$HOSTNAME can reach port 443 on $ip"
        fi
    else
        # Entry is commented out: probe it, and restore it on success
        /usr/bin/nc -z -w 10 "$ip" 443
        c=$?
        if [ $c -eq 0 ]; then
            /bin/bash /opt/sendemail.sh wangshibo@test.com "Test system connectivity to www.test.com" "$HOSTNAME connected to port 443 on $ip successfully; its mapping in /etc/hosts has been restored"
            /bin/bash /opt/sendemail.sh linan@test.com "Test system connectivity to www.test.com" "$HOSTNAME connected to port 443 on $ip successfully; its mapping in /etc/hosts has been restored"
            /bin/sed -i "s/^#$ip/$ip/g" /etc/hosts
        else
            echo "$HOSTNAME cannot reach port 443 on $ip"
        fi
    fi
done

Make the script executable:
[root@cx-app01 ~]# chmod 755 /opt/host_ip_nc_monit.sh

Run the script:
[root@cx-app01 ~]# sh /opt/host_ip_nc_monit.sh
Date : 
Host : 192.168.10.201
Problem : Port 443 is connected.
Connection to 192.168.10.201 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.201
Date : 
Host : 192.168.10.205
Problem : Port 443 is connected.
Connection to 192.168.10.205 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.205
Date : 
Host : 192.168.10.17
Problem : Port 443 is connected.
Connection to 192.168.10.17 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.17
Date : 
Host : 192.168.10.85
Problem : Port 443 is connected.
Connection to 192.168.10.85 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.85
Date : 
Host : 192.168.10.176
Problem : Port 443 is connected.
Connection to 192.168.10.176 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.176
Date : 
Host : 192.168.10.245
Problem : Port 443 is connected.
Connection to 192.168.10.245 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.245
Date : 
Host : 192.168.10.25
Problem : Port 443 is connected.
Connection to 192.168.10.25 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.25
Date : 
Host : 192.168.10.47
Problem : Port 443 is connected.
Connection to 192.168.10.47 443 port [tcp/https] succeeded!
cx-app01.veredholdings.cn can reach port 443 on 192.168.10.47

Schedule it with crontab:
[root@cx-app01 ~]# crontab -l
*/10 * * * * /bin/bash -x /opt/host_ip_nc_monit.sh > /dev/null 2>&1
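The nested if/else in the nc script comes down to a small decision table over two facts: whether the entry is currently commented out, and whether the port is reachable. As a rough sketch of that logic, the two helpers below separate the probe from the decision. The names port_open and decide_action are invented for this example and do not appear in the original script; port_open assumes an nc that accepts -z and -w, as the one used above does.

```shell
#!/bin/bash
# port_open HOST PORT [TIMEOUT]: hypothetical wrapper around the nc probe.
# Succeeds when a TCP connection can be opened within TIMEOUT seconds (default 10).
port_open() {
    nc -z -w "${3:-10}" "$1" "$2" >/dev/null 2>&1
}

# decide_action IS_COMMENTED IS_REACHABLE   (each argument is "yes" or "no")
# Prints what the monitor should do with the /etc/hosts entry.
decide_action() {
    case "$1:$2" in
        no:no)   echo disable ;;  # active entry went down: alert and comment it out
        yes:yes) echo enable ;;   # commented entry recovered: alert and restore it
        *)       echo none ;;     # /etc/hosts already matches the observed state
    esac
}
```

A loop body could then set one variable from grep "^#$ip" /etc/hosts and another from port_open "$ip" 443, and act on the output of decide_action. Alerts are sent only on a state change, which matches the behavior of the original script.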