Building a performance monitoring platform with collectd + InfluxDB + Grafana
Preface
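- InfluxDB is an open-source distributed time-series database written in Go. It is well suited to storing metrics, events, and analytics data, and performs reasonably well as a key-value time-series store.
- collectd is a system performance collection tool written in C.
- Grafana is a front-end tool written in JavaScript for querying InfluxDB and building custom reports and charts. Versions 3.0 and later also support the Zabbix database, so data collected by zabbix_agent can be used directly.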
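1 Environment
Node 172 monitors the whole cluster.
collectd collects the data, InfluxDB stores it, and Grafana displays it.
The three fit together as follows:
collect (collectd) -> store (influxdb) -> display (grafana)
For the tests, collectd must be installed on all six machines, 145/146/146/167/171/178, to gather data.
InfluxDB and Grafana are installed on 172.
All machines in the author's cluster are on the same network segment and run CentOS Linux release 7.4.1708 (Core).
2 Installing collectd
Install the dependency package and collectd
collectd must be installed on each of 145/146/146/167/171/178: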
yum -y install epel-release
yum -y install collectd
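Once installed, the collectd version is currently collectd-5.8.0-4.el7.x86_64.
collectd comes from the EPEL repository, which is why epel-release is installed first.
Install the rrdtool plugin
To communicate with InfluxDB, collectd acts as the client and connects to port 25826 on the InfluxDB host, so the network plugin must be enabled and its Server option configured.
The author also installs the rrdtool plugin, reporting that it is needed for InfluxDB to recognize collectd's data; without it, the /var/lib/collectd/rrd directory is not created.
Install the rrdtool plugin and its dependencies: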
yum install collectd-rrdtool rrdtool rrdtool-devel
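Configuration
With the default installation, collectd's configuration file is /etc/collectd.conf.
In collectd.conf:
- A line starting with two ## means the plugin has not been built and therefore cannot be used
- A line starting with a single # means the plugin has been built but is not enabled
- A line with no # means the plugin has been built and is enabled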
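The three-state convention above can be checked mechanically. The following is a minimal sketch, assuming every plugin line is written as `LoadPlugin name`, `#LoadPlugin name`, or `##LoadPlugin name` with no space after the `#`. The function name `plugin_states` and the state labels are made up for this example and are not part of collectd.

```shell
#!/bin/sh
# plugin_states FILE
# Prints "<plugin> <state>" for every LoadPlugin line in a collectd.conf-style
# file. The function name and state labels are illustrative only.
plugin_states() {
    awk '
        { gsub(/"/, "") }    # tolerate quoted plugin names
        $1 == "LoadPlugin"   { print $2, "enabled" }
        $1 == "#LoadPlugin"  { print $2, "built-but-disabled" }
        $1 == "##LoadPlugin" { print $2, "not-built" }
    ' "$1"
}

# Example: list only the plugins that are currently enabled.
#   plugin_states /etc/collectd.conf | grep ' enabled$'
```

Running it before and after editing the file is a quick way to confirm that the intended plugins, network in particular, are really uncommented.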
vi /etc/collectd.conf
Hostname "node171"
FQDNLookup true
BaseDir "/var/lib/collectd"
PIDFile "/var/run/collectd.pid"
PluginDir "/usr/lib64/collectd"
TypesDB "/usr/share/collectd/types.db"
LoadPlugin syslog
LoadPlugin cpu
LoadPlugin disk
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin network
LoadPlugin rrdtool
LoadPlugin swap
<Plugin cpu>
ReportByCpu true
ReportByState true
ValuesPercentage true
</Plugin>
<Plugin interface>
Interface "eth0"
IgnoreSelected false
</Plugin>
<Plugin load>
ReportRelative true
</Plugin>
<Plugin network>
Server "*.*.*.*" "25826"
</Plugin>
<Plugin rrdtool>
DataDir "/var/lib/collectd/rrd"
</Plugin>
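The most important part of the configuration is the network plugin; the IP configured there is the address of the host running InfluxDB.
At a minimum, collectd needs the network (output) and rrdtool plugins enabled.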
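Because the same network block has to be written on every monitored machine, it can help to generate it rather than type it. This is a small sketch under assumptions: the function name `network_block` is hypothetical, and 192.0.2.172 is a documentation placeholder address, not the author's real InfluxDB IP (which is masked in the configuration above).

```shell
#!/bin/sh
# network_block INFLUX_HOST [PORT]
# Prints the collectd.conf lines that load the network plugin and point it at
# the InfluxDB host. The function name is hypothetical; 25826 is the port used
# throughout this article.
network_block() {
    influx_host="$1"
    influx_port="${2:-25826}"
    cat <<EOF
LoadPlugin network
<Plugin network>
Server "$influx_host" "$influx_port"
</Plugin>
EOF
}

# Example (placeholder address):
#   network_block 192.0.2.172
```

The output can be compared against /etc/collectd.conf on each node before restarting collectd.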
Starting and stopping collectd
systemctl stop collectd.service
systemctl start collectd.service
systemctl enable collectd.service   # start at boot
systemctl status collectd.service   # check that the plugins loaded
Logging
Enable logging and set the log level, file path, and related options:
LoadPlugin logfile
<Plugin logfile>
LogLevel info
File "/var/log/collectd.log"
</Plugin>
Restart collectd; the log then appears in /var/log/collectd.log.
The rrd directory
After collectd starts, an rrd directory appears under /var/lib/collectd:
[root@node171 ~]# cd /var/lib/collectd/rrd/node171
[root@node171 node171]# ll
total 0
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-0
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-1
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-2
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-3
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-4
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-5
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-6
drwxr-xr-x 2 root root 209 Jun 21 09:14 cpu-7
drwxr-xr-x 2 root root 124 Jun 21 09:48 disk-dm-0
drwxr-xr-x 2 root root 94 Jun 21 09:14 disk-dm-1
drwxr-xr-x 2 root root 94 Jun 21 09:14 disk-dm-2
drwxr-xr-x 2 root root 94 Jun 21 09:14 disk-sr0
drwxr-xr-x 2 root root 147 Jun 21 09:48 disk-xvda
drwxr-xr-x 2 root root 94 Jun 21 09:14 disk-xvda1
drwxr-xr-x 2 root root 147 Jun 21 09:48 disk-xvda2
drwxr-xr-x 2 root root 92 Jun 21 09:14 interface-eth0
drwxr-xr-x 2 root root 31 Jun 21 11:05 load
drwxr-xr-x 2 root root 162 Jun 21 09:14 memory
drwxr-xr-x 2 root root 195 Jun 21 09:14 processes
drwxr-xr-x 2 root root 116 Jun 21 09:14 swap
drwxr-xr-x 2 root root 23 Jun 21 09:14 users
3 Installing InfluxDB
InfluxDB releases after 1.1.0 no longer provide the web admin page, so the author installed the influxdb-1.1.0-1.x86_64 version.
Installing InfluxDB
yum -y install https://dl.influxdata.com/influxdb/releases/influxdb-1.1.0-1.x86_64.rpm
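Configuration
With the default installation, InfluxDB's configuration file is /etc/influxdb/influxdb.conf.
The configuration enables the collectd input in InfluxDB and opens the front-end port 8083.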
NOTE: This interface is deprecated as of 1.1.0 and will be removed in a future release.
As the configuration file itself states, the front-end page will be removed in releases after 1.1.0.
~$ vim /etc/influxdb/influxdb.conf
[admin]
# Determines whether the admin service is enabled.
enabled = true
# The default bind address used by the admin service.
bind-address = ":8083"
# Whether the admin service should use HTTPS.
# https-enabled = false
# The SSL certificate used when HTTPS is enabled.
# https-certificate = "/etc/ssl/influxdb.pem"
###
### [http]
###
### Controls how the HTTP endpoints are configured. These are the primary
### mechanism for getting data into and out of InfluxDB.
###
[http]
# Determines whether HTTP endpoint is enabled.
enabled = true
# The bind address used by the HTTP service.
bind-address = ":8086"
[collectd]
enabled = true
# Listen on all interfaces so that collectd on the other nodes can reach this port.
bind-address = ":25826"
database = "collectd"
typesdb = "/usr/share/collectd/types.db"
Starting and stopping InfluxDB
systemctl start influxdb.service
systemctl enable influxdb.service
systemctl stop influxdb.service
service influxdb status
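Testing InfluxDB
The database does not need to be created in InfluxDB beforehand; it is created automatically when collectd data is sent to InfluxDB. After restarting the influxdb service, you will find that it opens UDP port 25826 to receive data.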
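Whether the UDP listener actually came up can be confirmed from netstat output. The sketch below assumes the usual `netstat -lnu` column layout, with the local address in the fourth column; the function name `has_udp_listener` is hypothetical.

```shell
#!/bin/sh
# has_udp_listener PORT
# Reads `netstat -lnu` style output on stdin and succeeds if some UDP socket
# is bound to PORT. The function name is hypothetical.
has_udp_listener() {
    awk -v port="$1" '
        $1 ~ /^udp/ && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example on the InfluxDB host:
#   netstat -lnu | has_udp_listener 25826 && echo "collectd listener is up"
```

If the check fails after a restart, the [collectd] section of influxdb.conf is the first place to look.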
Check that the metrics collected by collectd are being received by InfluxDB:
[root@node172 ~]# influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 1.1.0
InfluxDB shell version: 1.1.0
> show databases
name: databases
name
----
_internal
collectd
> use collectd
Using database collectd
> show measurements
name: measurements
name
----
cpu_value
disk_io_time
disk_read
disk_value
disk_weighted_io_time
disk_write
interface_rx
interface_tx
load_longterm
load_midterm
load_shortterm
memory_value
swap_value
>
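The same check can be scripted so that a missing measurement is noticed without reading the list by eye. This is a rough sketch: the list of expected names is a subset copied from the session above, the function name `missing_measurements` is hypothetical, and it assumes each measurement name appears alone on its own line, as in the default output of the influx shell.

```shell
#!/bin/sh
# Measurements this setup is expected to produce, copied from the
# `show measurements` output above (a subset, chosen for illustration).
EXPECTED_MEASUREMENTS="cpu_value interface_rx interface_tx load_shortterm memory_value"

# missing_measurements
# Reads `SHOW MEASUREMENTS` output on stdin (one name per line) and prints
# every expected measurement that does not appear. The name is hypothetical.
missing_measurements() {
    actual="$(cat)"
    for name in $EXPECTED_MEASUREMENTS; do
        printf '%s\n' "$actual" | grep -qxF "$name" || printf '%s\n' "$name"
    done
}

# Example on the InfluxDB host (needs the influx CLI and a running server):
#   influx -database collectd -execute 'SHOW MEASUREMENTS' | missing_measurements
```

Empty output means every expected measurement is present; any name printed points at a collectd plugin that is not loaded or not reporting.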
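Ports used by InfluxDB
- 8083: port of the web admin service, http://yourIP:8083
- 8086: port of the HTTP API
Check the ports InfluxDB is listening on, including the UDP port:
netstat -tlnpu |grep influxd
Open port 8083 in a browser to reach the InfluxDB front end, where data can be queried with SQL-like commands.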
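4 Installing Grafana
Installing Grafana
Installing an old Grafana release is not recommended: if the version is too old, it cannot later be used to monitor MySQL and similar services.
Installing the latest Grafana release is recommended.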
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.1.4-1.x86_64.rpm
sudo yum localinstall grafana-5.1.4-1.x86_64.rpm
Starting and stopping Grafana
systemctl stop grafana-server.service
systemctl start grafana-server.service
systemctl enable grafana-server.service
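Grafana web page
Web access page:
The preset user is admin with password admin; users can be configured.
The figure below shows a Grafana dashboard that has already been set up.
Connecting Grafana to the database
Click Data Sources - Add new, fill in the following parameters, and save.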
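The data source can also be created through Grafana's HTTP API instead of the web form. The sketch below only builds the JSON body; all values in the example (the data source name collectd, the URL http://localhost:8086, Grafana listening on its default port 3000) are assumptions for illustration rather than the author's recorded settings, and the function name `datasource_payload` is hypothetical. It does no JSON escaping, so it suits simple values only.

```shell
#!/bin/sh
# datasource_payload NAME INFLUX_URL DATABASE
# Prints a JSON body describing an InfluxDB data source for Grafana.
# The function name and the example values below are illustrative.
datasource_payload() {
    printf '{"name":"%s","type":"influxdb","access":"proxy","url":"%s","database":"%s"}\n' \
        "$1" "$2" "$3"
}

# Example, assuming Grafana on port 3000 with the preset admin/admin login:
#   curl -s -u admin:admin -H 'Content-Type: application/json' \
#        -X POST http://localhost:3000/api/datasources \
#        -d "$(datasource_payload collectd http://localhost:8086 collectd)"
```

With access set to proxy, the Grafana server itself contacts InfluxDB, which fits this setup because both run on node 172.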
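Setting up Grafana panels
Load settings
For network I/O, the query can only be edited after clicking "Switch editor mode"; also change the unit of the Left Y axis to bytes/s, as shown below:
Query for inbound traffic:
SELECT derivative("value") AS "value" FROM "interface_rx" WHERE "host" = 'client174' AND "type" = 'if_octets' AND "instance" = 'eth0'
For outbound traffic, change interface_rx to interface_tx.
CPU, memory, swap, and other panels can be configured in the same way.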
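Since the inbound and outbound queries differ only in the measurement name, they can be produced from one template. A minimal sketch, assuming the same tags as the query above; the function name `traffic_query` is hypothetical.

```shell
#!/bin/sh
# traffic_query DIRECTION HOST INTERFACE
# DIRECTION is rx (inbound) or tx (outbound). Prints the InfluxQL shown above
# with the measurement switched accordingly. The function name is hypothetical.
traffic_query() {
    case "$1" in
        rx|tx) ;;
        *) echo "direction must be rx or tx" >&2; return 1 ;;
    esac
    printf "SELECT derivative(\"value\") AS \"value\" FROM \"interface_%s\" WHERE \"host\" = '%s' AND \"type\" = 'if_octets' AND \"instance\" = '%s'\n" \
        "$1" "$2" "$3"
}

# Example: preview the outbound query from the InfluxDB host before pasting it
# into the Grafana editor.
#   influx -database collectd -execute "$(traffic_query tx client174 eth0)"
```

Previewing the query in the influx shell first makes it easier to tell a wrong tag value from a Grafana panel setting.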