I run into a lot of weird behavior when using Celery. For example, I update tasks.py and run supervisorctl reload (restart), but the tasks are still wrong: some tasks seem to disappear, and so on.
Today I found out why: supervisorctl stop all
does not actually stop all the Celery workers, and only kill -9 `pgrep python` kills them all.
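By the way, kill -9 `pgrep python` also matches supervisord itself (it is a Python process too, see the ps output below), so a safer sketch is to target only the worker command line, something like:

# kill only the Celery worker processes (pattern assumed from the worker command lines shown below)
pkill -9 -f 'celery worker'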
situation:
root@ubuntu12:/data/www/article_fetcher# supervisorctl
celery_beat RUNNING pid 29597, uptime 0:52:18
celery_worker1 RUNNING pid 29556, uptime 0:52:20
celery_worker2 RUNNING pid 29570, uptime 0:52:19
celery_worker3 RUNNING pid 29557, uptime 0:52:20
celery_worker4 RUNNING pid 29586, uptime 0:52:18
uwsgi RUNNING pid 29604, uptime 0:52:18
supervisor> stop all
celery_beat: stopped
celery_worker2: stopped
celery_worker4: stopped
celery_worker3: stopped
uwsgi: stopped
celery_worker1: stopped
supervisor> status
celery_beat STOPPED Aug 04 11:05 AM
celery_worker1 STOPPED Aug 04 11:05 AM
celery_worker2 STOPPED Aug 04 11:05 AM
celery_worker3 STOPPED Aug 04 11:05 AM
celery_worker4 STOPPED Aug 04 11:05 AM
uwsgi STOPPED Aug 04 11:05 AM
processes:
root@ubuntu12:~# ps -aux|grep 'python'
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root 8683 0.0 0.1 61420 11768 ? Ss Aug03 0:27 /usr/bin/python /usr/bin/supervisord
root 29310 0.1 0.1 57120 11344 pts/2 S+ 11:05 0:00 /usr/bin/python /usr/bin/supervisorctl
nobody 29556 2.2 0.5 132484 45988 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody 29557 2.2 0.5 132480 45996 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29570 2.4 0.5 132740 45996 ? S 11:06 0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
nobody 29571 26.9 1.4 217688 115804 ? R 11:06 0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29572 33.7 0.7 158396 59808 ? R 11:06 0:12 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody 29573 29.6 1.4 215176 115928 ? R 11:06 0:10 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody 29574 27.2 1.4 218244 118180 ? R 11:06 0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
......
......
......
I found this question: Stopping Supervisor doesn't stop Celery workers, but it is asking a different thing, and its accepted answer (supervisorctl stop all)
does not actually work. So I decided to find the right way.
1 Answer
I looked into the Supervisor docs and found this:
killasgroup
If true, when resorting to send SIGKILL to the program to terminate it send it to its whole process group instead, taking care of its children as well, useful e.g with Python programs using multiprocessing.
Default: false
Required: No.
Introduced: 3.0a11
Then I realized that each worker creates 4 child processes (one per CPU core), and together they form a process group; that's why supervisorctl stop all
does not work: Supervisor's stop signal only goes to the parent process, so the pool children keep running.
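You can verify the grouping by printing the process-group IDs; something like this (just a sketch, the bracket trick only keeps grep from matching itself) should show each pool child with the same pgid as its worker:

ps -eo pid,ppid,pgid,cmd | grep '[c]elery worker'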
So I added killasgroup
to supervisord.conf:
[program:celery_worker1]
; Set full path to celery program if using virtualenv
directory=/data/www/article_fetcher
command=/data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
user=nobody
numprocs=1
stdout_logfile=/data/www/article_fetcher/logs/celery.log
stderr_logfile=/data/www/article_fetcher/logs/celery.log
autostart=true
autorestart=true
startsecs=5
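; send SIGKILL to the whole process group so the worker's pool children die too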
killasgroup=true
.....
.....
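To make Supervisor pick up the changed program sections, rereading and updating the config should do it (a restart of supervisord also works):

supervisorctl reread
supervisorctl update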
After that, supervisorctl stop all
really stops the Celery workers! very well~
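To double-check, after stop all a pgrep against the worker command line (same assumed pattern as above) should print nothing:

pgrep -f 'celery worker'   # no output means the workers and their pool children are gone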