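Supervisor is a server management tool written in Python for Linux. It monitors the processes running on a server and, when a problem occurs, can raise an alert and restart the affected process automatically.
Supervisor is similar to monit.
One significant difference between monit and supervisor is that the processes supervisor manages must be started by supervisor itself,
whereas monit can manage programs that are already running.
supervisor also requires the programs it manages to be non-daemon programs; supervisord takes care of running them in the background for you.
So if you use supervisor to manage nginx, you must add the line "daemon off;" to the nginx configuration file so that nginx starts in non-daemon mode.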
I. Components of Supervisor
1. supervisord
The server piece of supervisor is named supervisord.
It is responsible for starting child programs at its own invocation,
responding to commands from clients, restarting crashed or exited subprocesses,
logging its subprocess stdout and stderr output, and generating and handling “events”
corresponding to points in subprocess lifetimes.
The server process uses a configuration file. This is typically located in /etc/supervisord.conf.
This configuration file is a “Windows-INI” style config file.
It is important to keep this file secure via proper filesystem permissions
because it may contain unencrypted usernames and passwords.
2. supervisorctl
The command-line client piece of the supervisor is named supervisorctl.
It provides a shell-like interface to the features provided by supervisord. From supervisorctl,
a user can connect to different supervisord processes, get the status of the subprocesses a supervisord controls,
stop and start those subprocesses, and list the processes it is running.
The command-line client talks to the server across a UNIX domain socket or an internet (TCP) socket.
The server can require that the user of a client present authentication credentials before it allows them to run commands.
The client process typically uses the same configuration file as the server but any configuration file with a [supervisorctl]
section in it will work.
3. Web Server
A (sparse) web user interface with functionality comparable to supervisorctl may be accessed via a browser
if you start supervisord against an internet socket.
Visit the server URL (e.g. http://localhost:9001/) to view and control process status through
the web interface after activating the configuration file’s [inet_http_server] section.
4. XML-RPC Interface
The same HTTP server which serves the web UI serves up an XML-RPC interface that can be used to interrogate
and control supervisor and the programs it runs. See XML-RPC API Documentation.
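As a rough illustration of the XML-RPC interface, the sketch below queries a supervisord instance for the state of its processes. It assumes Python 3, that [inet_http_server] is enabled on port 9001 without authentication, and that the endpoint is reachable at /RPC2; the helper names fetch_process_info and summarize are made up for this example.

```python
import xmlrpc.client


def fetch_process_info(url="http://localhost:9001/RPC2"):
    """Ask a running supervisord for information on every managed process.

    The URL is an assumption: it presumes [inet_http_server] listens on
    port 9001 with no username/password configured.
    """
    server = xmlrpc.client.ServerProxy(url)
    return server.supervisor.getAllProcessInfo()


def summarize(process_info):
    """Reduce the list of per-process dicts to {process name: state name}."""
    return {entry["name"]: entry["statename"] for entry in process_info}


# Usage against a live server (not executed here, it needs a running supervisord):
#   print(summarize(fetch_process_info()))
```

Keeping the network call separate from the formatting step makes it possible to try summarize on sample data without a server.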
II. Installing Supervisor
A Python environment must be installed first. Linux ships with Python, but version 2.7.0 or later is recommended.
Supervisor can be installed with
$ sudo easy_install supervisor
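After a successful installation the output shows "finished". You can then open the Python interpreter
and type "import supervisor"; if no error is reported, the installation succeeded.
Alternatively, you can download Supervisor from the official website and install it with setup.py install.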
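The import check can also be scripted rather than typed into the interpreter. This is a minimal sketch assuming Python 3.4 or later (for importlib.util.find_spec); the helper name is_installed is invented here.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Prints True or False depending on whether supervisor is installed
# for the interpreter running this script.
print("supervisor importable:", is_installed("supervisor"))
```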
During installation, the following error appeared:
Installed /usr/local/python2.7.3/lib/python2.7/site-packages/supervisor-4.0.0_dev-py2.7.egg
Processing dependencies for supervisor==4.0.0-dev
Searching for meld3>=1.0.0
Reading https://pypi.python.org/simple/meld3/
Download error on https://pypi.python.org/simple/meld3/: [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed -- Some packages may not be found!
Couldn't find index page for 'meld3' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.python.org/simple/
Download error on https://pypi.python.org/simple/: [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed -- Some packages may not be found!
No local packages or download links found for meld3>=1.0.0
error: Could not find suitable distribution for Requirement.parse('meld3>=1.0.0')
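Solution:
Searching online suggested the cause: curl's certificates were too old, so the latest ones needed to be downloaded.
Download the latest certificate file: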
$ wget http://curl.haxx.se/ca/cacert.pem
Rename it to ca-bundle.crt and move it to the default directory:
$ mv cacert.pem ca-bundle.crt
$ mv ca-bundle.crt /etc/pki/tls/certs
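After downloading and installing the certificate, the same problem still occurred.
Since certificate expiry depends on the time, I ran the date command to check the system clock. It turned out the clock was set far too early.
After correcting the time with date -s, easy_install worked normally.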
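To see why a wrong clock produces "certificate verify failed", recall that a certificate is only valid between its notBefore and notAfter timestamps. The sketch below checks whether a given time falls inside such a window. The dates are hypothetical, not taken from the actual PyPI certificate, and clock_within_validity is a name invented for this example.

```python
import ssl
from datetime import datetime, timezone


def clock_within_validity(now, not_before, not_after):
    """Check that `now` (a timezone-aware datetime) lies inside a certificate's
    validity window, given in the "Mon DD HH:MM:SS YYYY GMT" format used by
    Python's ssl module."""
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now.timestamp() <= end


# Hypothetical validity window for illustration only.
NOT_BEFORE = "Jan 15 00:00:00 2012 GMT"
NOT_AFTER = "Jan 15 00:00:00 2022 GMT"

# A clock that was reset to an early date falls before notBefore,
# so verification would fail even with a fresh CA bundle.
print(clock_within_validity(datetime(2001, 1, 1, tzinfo=timezone.utc), NOT_BEFORE, NOT_AFTER))
print(clock_within_validity(datetime.now(timezone.utc), NOT_BEFORE, NOT_AFTER))
```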
III. Configuring Supervisor
Next, configure supervisor. First generate a configuration file by entering the following in a shell:
$ echo_supervisord_conf > /etc/supervisord.conf
You can then modify this file with a text editor:
$ vim /etc/supervisord.conf
Below is a sample configuration file:
;/etc/supervisord.conf
[unix_http_server]
file = /var/run/supervisor.sock
chmod = 0777
chown= root:root
[inet_http_server]
; Web management interface settings
port=9001
;username = admin
;password = yourpassword
[supervisorctl]
; Must match the setting in 'unix_http_server'
serverurl = unix:///var/run/supervisor.sock
[supervisord]
logfile=/var/log/supervisord/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=true ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
user=root ; (default is current user, required if root)
childlogdir=/var/log/supervisord/ ; ('AUTO' child log dir, default $TEMP)
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
; Configuration for a single managed process; multiple program sections can be added
[program:tail]
command=tail -f /etc/supervisord.conf
autostart = true
startsecs = 5
user = felinx
redirect_stderr = true
; Log settings for this program; logfile_maxbytes above configures supervisord's own log
stdout_logfile_maxbytes = 20MB
stdout_logfile_backups = 20
stdout_logfile = /var/log/supervisord/chatdemo.log
[program:app]
command=python app.py --port=61000
directory=/opt/test/supervisor ; change to this directory before running command; useful for programs that must run from their own directory
autostart=true ; start at supervisord start (default: true)
autorestart=unexpected ; whether/when to restart (default: unexpected)
startsecs=1 ; number of secs prog must stay running (def. 1)
user=root
; Configure a group of processes; similar programs can be added this way instead of one by one by hand
[program:groupworker]
command=python /home/felinx/demos/groupworker/worker.py
numprocs=24
process_name=%(program_name)s_%(process_num)02d
autostart = true
startsecs = 5
user = felinx
redirect_stderr = true
stdout_logfile = /var/log/supervisord/groupworker.log
; (For more configuration options, see http://supervisord.org/configuration.html)
Save and exit when you have finished editing.
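Because supervisorctl can only reach supervisord if its serverurl points at the socket that [unix_http_server] creates, it may be worth checking the two settings before starting anything. The following is a sketch using Python 3's standard configparser rather than Supervisor's own parser, so it only approximates how supervisord reads the file; socket_paths_match is a name invented here.

```python
import configparser


def socket_paths_match(config_text):
    """Return True if [supervisorctl] serverurl refers to the same UNIX
    socket file that [unix_http_server] is configured to create."""
    parser = configparser.ConfigParser(
        interpolation=None,              # leave %(program_name)s style values alone
        inline_comment_prefixes=(";",),  # allow "value ; comment" lines
    )
    parser.read_string(config_text)
    server_file = parser.get("unix_http_server", "file")
    client_url = parser.get("supervisorctl", "serverurl")
    return client_url == "unix://" + server_file


SAMPLE = """
[unix_http_server]
file = /var/run/supervisor.sock ; socket created by supervisord

[supervisorctl]
serverurl = unix:///var/run/supervisor.sock
"""

print(socket_paths_match(SAMPLE))
```

To check a real file, read its contents first, for example with open("/etc/supervisord.conf").read(), and pass the text to the function.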
Start supervisor with:
$ supervisord
$ supervisorctl
Check with the ps command; the applications should now be running automatically.
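The process_name expression in the groupworker entry uses Python's %-style string formatting, so the resulting names can be previewed with a few lines of Python. This sketch assumes process numbering starts at 0, which is the default when numprocs_start is not set; expand_process_names is a helper invented for illustration.

```python
def expand_process_names(program_name, numprocs, pattern):
    """Expand a process_name pattern once per process, numbering from 0."""
    return [
        pattern % {"program_name": program_name, "process_num": n}
        for n in range(numprocs)
    ]


names = expand_process_names("groupworker", 24, "%(program_name)s_%(process_num)02d")
print(names[0], names[-1])  # groupworker_00 groupworker_23
```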
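IV. Managing Supervisord
Once Supervisord is installed, two command-line programs are available, supervisord and supervisorctl.
They are used as follows:
- "supervisord": starts Supervisord, which then starts and manages the processes defined in the configuration.
- "supervisorctl stop tail": stops a single process. The name is the value configured in the [program:tail] section, which is tail in this example.
- "supervisorctl start tail": starts a single process.
- "supervisorctl restart tail": restarts a single process.
- "supervisorctl stop groupworker:" (note the trailing colon): stops all processes belonging to the group named groupworker (start and restart work the same way).
- "supervisorctl stop all": stops all processes. Note: start, restart and stop do not load the latest configuration file.
- "supervisorctl reload": loads the latest configuration file, stops the existing processes, then starts and manages all processes under the new configuration.
- "supervisorctl update": starts processes that are new or changed in the latest configuration file; processes whose configuration is unchanged are not affected or restarted.
Note: processes that were explicitly stopped with stop are not restarted automatically by reload or update.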
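When these commands are driven from a script, a small wrapper can keep the process and group naming straight. The sketch below only builds and runs supervisorctl command lines; the function names are invented for this example, and it assumes Python 3.7 or later with supervisorctl on the PATH.

```python
import subprocess


def supervisorctl_args(action, target="all", group=False):
    """Build the argument list for one supervisorctl call.

    With group=True a trailing colon is appended, which addresses every
    process in the named group (e.g. "groupworker:").
    """
    name = target + ":" if group else target
    return ["supervisorctl", action, name]


def run_supervisorctl(action, target="all", group=False):
    """Run the command and return the CompletedProcess with captured text output."""
    return subprocess.run(
        supervisorctl_args(action, target, group),
        capture_output=True,
        text=True,
    )


print(supervisorctl_args("restart", "tail"))
print(supervisorctl_args("stop", "groupworker", group=True))
```

Calling run_supervisorctl("stop", "groupworker", group=True) would then stop the whole group, provided a supervisord instance is running.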
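V. Web Management
supervisor ships with a built-in web server, so processes can be managed from a web page,
provided the [inet_http_server] section of the configuration file is enabled.
If the server has a single network interface, the section can be changed as follows: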
[inet_http_server] ; inet (TCP) server disabled by default
port=127.0.0.1:51000 ; (ip_address:port specifier, *:port for all iface)
;username=user ; (default is no username (open server))
;password=123 ; (default is no password (open server))
If the server has multiple network interfaces, specify the address of one of them:
[inet_http_server] ; inet (TCP) server disabled by default
port=192.168.2.13:51000 ; (ip_address:port specifier, *:port for all iface)
;username=user ; (default is no username (open server))
;password=123 ; (default is no password (open server))
Enter the following in the browser's address bar:
http://192.168.2.13:51000
and you can then manage the processes from the web page.