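Symptom: after the nginx domain configurations were merged, consul-template could no longer reload nginx, and the new configuration only took effect after nginx was restarted.
Note: the next time a service reports errors, read all of the log output produced during the restart, in every scenario, and do not skip details. Often the problem has already been seen but was not investigated further.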
Check the maximum number of open files for the process:
# cat /proc/31146/limits | grep "Max open files"
Max open files 1024 4096 files
# cat /usr/lib/systemd/system/openresty.service
[Unit]
[Service]
LimitNOFILE=655350
[Install]
#
consul-template could not reload nginx because the nginx process itself could not reload: the process could not open any more files.
Check the error log: it reports too many open files.
[root@vm-nginx003.mm.machangwei.com mcw]# ls /data/logs/nginx/nginx_error.log
/data/logs/nginx/nginx_error.log
[root@vm-nginx003.mm.machangwei.com mcw]# vim /data/logs/nginx/nginx_error.log
2024/05/18 00:59:57 [notice] 15124#15124: exit
2024/05/18 00:59:57 [notice] 1586#1586: signal 17 (SIGCHLD) received from 15123
2024/05/18 00:59:57 [notice] 1586#1586: signal 14 (SIGALRM) received
2024/05/18 00:59:57 [notice] 1586#1586: worker process 15123 exited with code 0
2024/05/18 00:59:57 [notice] 1586#1586: worker process 15124 exited with code 0
2024/05/18 00:59:57 [notice] 1586#1586: exit
2024/05/18 01:00:15 [notice] 15493#15493: using the "epoll" event method
2024/05/18 01:00:15 [notice] 15493#15493: openresty/1.19.3.1
2024/05/18 01:00:15 [notice] 15493#15493: built by gcc 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)
2024/05/18 01:00:15 [notice] 15493#15493: OS: Linux 5.4.65-200.el7.x86_64
2024/05/18 01:00:15 [notice] 15493#15493: getrlimit(RLIMIT_NOFILE): 1024:4096
2024/05/18 01:00:15 [notice] 15496#15496: start worker processes
2024/05/18 01:00:15 [notice] 15496#15496: start worker process 15497
2024/05/18 01:00:15 [notice] 15496#15496: start worker process 15498
2024/05/18 01:00:29 [notice] 15496#15496: signal 1 (SIGHUP) received from 15545, reconfiguring
2024/05/18 01:00:29 [notice] 15496#15496: reconfiguring
2024/05/18 01:00:29 [warn] 15496#15496: conflicting server name "test-content-review.mcwcn.com" on 0.0.0.0:80, ignored
2024/05/18 01:00:29 [warn] 15496#15496: could not build optimal server_names_hash, you should increase either server_names_hash_max_size: 512 or server_names_hash_bucket_size: 128; ignoring server_names_hash_bucket_size
2024/05/18 01:00:29 [emerg] 15496#15496: open() "/data/logs/nginx/dev-account-mcwapps_error.log" failed (24: Too many open files)
2024/05/18 01:01:03 [notice] 15496#15496: signal 1 (SIGHUP) received from 15723, reconfiguring
2024/05/18 01:01:03 [notice] 15496#15496: reconfiguring
2024/05/18 01:01:03 [warn] 15496#15496: conflicting server name "test-content-review.mcwcn.com" on 0.0.0.0:80, ignored
2024/05/18 01:01:03 [warn] 15496#15496: could not build optimal server_names_hash, you should increase either server_names_hash_max_size: 512 or server_names_hash_bucket_size: 128; ignoring server_names_hash_bucket_size
2024/05/18 01:01:03 [emerg] 15496#15496: open() "/data/logs/nginx/dev-account-mcwapps_error.log" failed (24: Too many open files)
2024/05/18 01:01:14 [notice] 15496#15496: signal 1 (SIGHUP) received from 15750, reconfiguring
2024/05/18 01:01:14 [notice] 15496#15496: reconfiguring
2024/05/18 01:01:14 [warn] 15496#15496: conflicting server name "test-content-review.mcwcn.com" on 0.0.0.0:80, ignored
2024/05/18 01:01:14 [warn] 15496#15496: could not build optimal server_names_hash, you should increase either server_names_hash_max_size: 512 or server_names_hash_bucket_size: 128; ignoring server_names_hash_bucket_size
2024/05/18 01:01:14 [emerg] 15496#15496: open() "/data/logs/nginx/dev-account-mcwapps_error.log" failed (24: Too many open files)
2024/05/18 01:02:46 [notice] 15496#15496: signal 1 (SIGHUP) received from 16163, reconfiguring
2024/05/18 01:02:46 [notice] 15496#15496: reconfiguring
2024/05/18 01:02:46 [warn] 15496#15496: conflicting server name "test-content-review.mcwcn.com" on 0.0.0.0:80, ignored
2024/05/18 01:02:46 [warn] 15496#15496: could not build optimal server_names_hash, you should increase either server_names_hash_max_size: 512 or server_names_hash_bucket_size: 128; ignoring server_names_hash_bucket_size
2024/05/18 01:02:46 [emerg] 15496#15496: open() "/data/logs/nginx/dev-account-mcwapps_error.log" failed (24: Too many open files)
2024/05/18 01:03:54 [notice] 15496#15496: signal 3 (SIGQUIT) received from 16526, shutting down
2024/05/18 01:03:54 [notice] 15498#15498: gracefully shutting down
2024/05/18 01:03:54 [notice] 15498#15498: signal 15 (SIGTERM) received from 1, exiting
2024/05/18 01:03:54 [notice] 15496#15496: signal 15 (SIGTERM) received from 1, exiting
2024/05/18 01:03:54 [notice] 15497#15497: signal 15 (SIGTERM) received from 1, exiting
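Every failed reload in this log ends with the same [emerg] line, so counting those lines is a quick way to see how many reloads were rejected. The following is a minimal sketch; the function name count_emfile is hypothetical, and the pattern assumes the message format shown in the log above.

```shell
# Count [emerg] lines caused by file-descriptor exhaustion (errno 24)
# in an nginx error log. count_emfile is an illustrative name.
count_emfile() {
    local log_file="$1"
    # grep -c prints the number of matching lines. It exits 1 when nothing
    # matches, so "|| true" keeps the function usable under "set -e".
    grep -c '\[emerg\].*(24: Too many open files)' "$log_file" || true
}

# Example (path taken from the post):
#   count_emfile /data/logs/nginx/nginx_error.log
```

A non-zero count after a reload means the master process kept the old configuration, even though the reload command itself returned without an error.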
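The values checked here are not the maximum number of files that the nginx process can open. /etc/security/limits.conf is applied by pam_limits to login sessions, not to services started by systemd, and fs.file-max is the system-wide total: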
[root@vm-nginx003.mm.machangwei.com mcw]# cat /etc/security/limits.conf
# End of file
* soft nproc 655350
* hard nproc 655350
* soft nofile 655350
* hard nofile 655350
[root@vm-nginx003.mm.machangwei.com mcw]#
[root@vm-nginx003.mm.machangwei.com mcw]# sysctl -a|grep fs.file-max
fs.file-max = 7930900
[root@vm-nginx003.mm.machangwei.com mcw]#
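Find the process ID and check the limit that actually applies to the process. nginx.conf allows each worker to open many files, but nothing sets the open-file limit of the master process: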
[root@vm-nginx003.mm.machangwei.com mcw]# ps -ef|grep nginx
root 31146 1 0 02:01 ? 00:00:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx
nobody 31147 31146 0 02:01 ? 00:00:00 nginx: worker process
nobody 31148 31146 0 02:01 ? 00:00:00 nginx: worker process
root 32633 9036 0 02:07 pts/1 00:00:00 grep --color=auto nginx
[root@vm-nginx003.mm.machangwei.com mcw]#
[root@vm-nginx003.mm.machangwei.com mcw]# cat /proc/31146/limits|grep "Max open files"
Max open files 1024 4096 files
[root@vm-nginx003.mm.machangwei.com mcw]#
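The same check can be wrapped in a small helper so it is easy to repeat for the master and each worker. This is only a sketch: the function name nofile_limits is hypothetical, and it assumes the "Max open files" line has the column layout shown above (soft limit, then hard limit).

```shell
# Print the soft and hard "Max open files" limits from a
# /proc/<pid>/limits style file. nofile_limits is an illustrative name.
nofile_limits() {
    local limits_file="$1"
    # The line reads "Max open files <soft> <hard> files", so the
    # soft and hard limits are whitespace-separated fields 4 and 5.
    awk '/^Max open files/ { print $4, $5 }' "$limits_file"
}

# Example (PID taken from the transcript above):
#   nofile_limits /proc/31146/limits
```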
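The master process is started by systemd, and a process started by systemd needs its open-file limit set in the unit file. Add LimitNOFILE=655350 to the [Service] section to raise the limit: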
[root@vm-nginx003.mm.machangwei.com mcw]# cat /usr/lib/systemd/system/openresty.service
[Unit]
Description=The OpenResty Application Platform
After=syslog.target network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target
[Service]
LimitNOFILE=655350
Type=forking
PIDFile=/usr/local/openresty/nginx/logs/nginx.pid
ExecStartPre=/usr/local/openresty/nginx/sbin/nginx -t
ExecStart=/usr/local/openresty/nginx/sbin/nginx
#ExecStart=/bin/openresty
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true
[Install]
WantedBy=multi-user.target
[root@vm-nginx003.mm.machangwei.com mcw]#
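The post edits the unit file under /usr/lib/systemd/system directly. An alternative, shown here only as a sketch, is a systemd drop-in, which a package upgrade does not overwrite. The function name write_nofile_dropin, the drop-in directory /etc/systemd/system/openresty.service.d and the file name limits.conf are assumptions for illustration, not something the post uses.

```shell
# Write a systemd drop-in that raises LimitNOFILE for one service.
# write_nofile_dropin, the directory and the file name are illustrative.
write_nofile_dropin() {
    local dropin_dir="$1" limit="$2"
    mkdir -p "$dropin_dir"
    # A drop-in only needs the section and the keys it overrides.
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dropin_dir/limits.conf"
}

# Typical use as root (commented out so sourcing this file changes nothing):
#   write_nofile_dropin /etc/systemd/system/openresty.service.d 655350
#   systemctl daemon-reload        # make systemd re-read unit files
#   systemctl restart openresty    # limits are applied when the process starts
```

Whichever file carries the setting, systemctl daemon-reload is needed after the change, and the service has to be restarted rather than reloaded, because a reload keeps the existing master process and its old limit.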
After the restart, the maximum number of open files for the process has changed:
[root@vm-nginx003.mm.machangwei.com mcw]# ps -ef|grep nginx
root 3114 1 0 02:19 ? 00:00:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx
nobody 3223 3114 0 02:20 ? 00:00:00 nginx: worker process
nobody 3224 3114 0 02:20 ? 00:00:00 nginx: worker process
root 3618 9036 0 02:21 pts/1 00:00:00 grep --color=auto nginx
[root@vm-nginx003.mm.machangwei.com mcw]#
[root@vm-nginx003.mm.machangwei.com mcw]# cat /proc/3114/limits|grep "Max open files"
Max open files 655350 655350 files
[root@vm-nginx003.mm.machangwei.com mcw]#
Reloading nginx several times now shows the old worker processes shutting down and new workers starting to replace them:
[root@vm-nginx003.mm.machangwei.com mcw]# ps -ef|grep nginx
root 3114 1 0 02:19 ? 00:00:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx
nobody 3115 3114 0 02:19 ? 00:00:00 nginx: worker process
nobody 3116 3114 0 02:19 ? 00:00:00 nginx: worker process
root 3132 9036 0 02:19 pts/1 00:00:00 grep --color=auto nginx
[root@vm-nginx003.mm.machangwei.com mcw]#
[root@vm-nginx003.mm.machangwei.com mcw]# systemctl reload openresty
[root@vm-nginx003.mm.machangwei.com mcw]# ps -ef|grep nginx
root 3114 1 2 02:19 ? 00:00:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx
nobody 3115 3114 0 02:19 ? 00:00:00 nginx: worker process is shutting down
nobody 3116 3114 0 02:19 ? 00:00:00 nginx: worker process is shutting down
nobody 3171 3114 4 02:20 ? 00:00:00 nginx: worker process
nobody 3172 3114 4 02:20 ? 00:00:00 nginx: worker process
root 3175 9036 0 02:20 pts/1 00:00:00 grep --color=auto nginx
[root@vm-nginx003.mm.machangwei.com mcw]#
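The systemd unit openresty.service sets the open-file limit of the master process, while nginx.conf (the worker_rlimit_nofile directive) sets the open-file limit of the worker processes: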
# ps -ef|grep nginx
root 3114 1 0 02:19 ? 00:00:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx
nobody 3223 3114 0 02:20 ? 00:00:01 nginx: worker process
nobody 3224 3114 0 02:20 ? 00:00:01 nginx: worker process
root 14607 9036 0 03:07 pts/1 00:00:00 grep --color=auto nginx
#
# cat /proc/3114/limits |grep "Max open files"
Max open files 655350 655350 files
# cat /proc/3223/limits |grep "Max open files"
Max open files 51200 51200 files
#
=====
Reference information follows:
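nginx reload: /bin/kill -s HUP has no effect
In Nginx, the /bin/kill -s HUP command is normally used to reload the Nginx configuration file, not to suspend the Nginx process. If the command appears to have no effect, the likely causes are:
- Wrong process ID: make sure the PID that receives the HUP signal is correct. Use ps to find the Nginx master process, for example ps aux | grep nginx. The line containing "nginx: master process" belongs to the master, and its second column is the master PID.
- Permissions: you may not have permission to send a signal to the Nginx process. Make sure you do, or run the command as root.
- Nginx is not running: make sure the Nginx service is running. Check its status with systemctl status nginx (on systemd systems) or service nginx status (on SysVinit systems).
- Errors in the Nginx configuration: if the configuration contains a syntax error, nginx -s reload or /bin/kill -s HUP [nginx_pid] will not take effect, because Nginx keeps running with the old configuration rather than load an invalid one. Use nginx -t to test the configuration syntax.
- How Nginx was started: the master process handles HUP whether or not it was started by an init system such as systemd or SysVinit. What differs is the environment it inherits: a master started by systemd takes its resource limits from the unit file.
- SELinux or AppArmor policy: if a security module such as SELinux or AppArmor is enabled, it may block signals between processes. Check the relevant policy and log files for such restrictions.
- Other system-level limits: resource limits do not block the signal itself, but they can make the reload fail after the signal has been received, for example when the master process cannot open any more files.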
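To resolve the problem, try the following steps:
- Send the HUP signal again using the correct Nginx process ID.
- Make sure you have the permissions needed to send the signal.
- Check the Nginx status and log files to confirm that it is running and reporting no errors.
- Use nginx -t to test the configuration syntax.
- If possible, reload the configuration through the init system (for example systemctl reload nginx).
- Check the SELinux or AppArmor policy to make sure it does not block the signal.
- If none of the steps above helps, consider restarting the Nginx service (note that this causes a brief service interruption).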
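The test-then-signal steps in the list above can be sketched as one function. The function name reload_nginx is hypothetical; the binary and pid file are passed in as parameters, and the paths in the usage comment are the ones from the openresty.service unit shown earlier in this post.

```shell
# Validate the configuration, then send HUP to the master PID read from
# the pid file. reload_nginx is an illustrative name.
reload_nginx() {
    local nginx_bin="$1" pid_file="$2"
    # nginx -t checks the configuration without touching the running master.
    "$nginx_bin" -t || { echo "config test failed, not reloading" >&2; return 1; }
    kill -s HUP "$(cat "$pid_file")"
}

# Example:
#   reload_nginx /usr/local/openresty/nginx/sbin/nginx \
#                /usr/local/openresty/nginx/logs/nginx.pid
```

A passing nginx -t does not guarantee that the reload succeeds. The test runs with the limits of the calling shell, while the master runs with its own, which is likely why the reloads in this incident failed only inside the master. It is worth checking the error log for [emerg] lines after sending the signal.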
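nginx error: open() "_error.log" failed (24: Too many open files)
The Nginx error open() "_error.log" failed (24: Too many open files) means that the Nginx process tried to open more files than its open-file limit allows. On Linux this limit is enforced per process (RLIMIT_NOFILE), and the system as a whole has a separate total.
To resolve the problem, take the following steps:
- Check the current limit: ulimit -n shows the open-file limit of the current shell session. The Nginx process does not necessarily share that limit, because it inherits its limit from whatever started it. To see the limit that actually applies, read the /proc/[nginx_pid]/limits file, where [nginx_pid] is the PID of the Nginx master process, for example with cat /proc/[nginx_pid]/limits | grep 'Max open files'.
- Raise the user-level limit: if the user-level limit is the problem, edit /etc/security/limits.conf and add or change lines like these:
  * soft nofile 4096
  * hard nofile 65535
  Or set the limit only for the user that runs Nginx (such as www-data):
  www-data soft nofile 4096
  www-data hard nofile 65535
  Note: limits.conf is applied by pam_limits at login, so the change needs a new login session to take effect. A service started by systemd does not read this file and needs LimitNOFILE in its unit file instead.
- Raise the system-level limit: if the system-level limit is also low, edit /etc/sysctl.conf (or the corresponding sysctl configuration location) and increase fs.file-max, for example:
  fs.file-max = 100000
  Then run sysctl -p or reboot to apply the new setting.
- Optimize the Nginx configuration: make sure nothing in the configuration opens more files than necessary. For example, avoid unnecessary log files and tune the cache settings.
- Check for file descriptor leaks: if the Nginx process keeps opening new file descriptors without closing them, that may indicate a resource leak. Tools such as lsof or strace can help diagnose it.
- Restart Nginx: after changing the limits or fixing the underlying problem, restart Nginx so that the changes take effect.
- Monitoring and logs: monitor the file descriptor usage of the Nginx processes and check the log files regularly for clues.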
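For the monitoring step, one simple approach is to compare the number of descriptors a process currently holds with its soft limit. The following is a sketch only: the function name fd_usage is hypothetical, and the optional second argument exists just so the function can be tried against a copy of the /proc layout.

```shell
# Report how many file descriptors a process holds against its soft limit.
# fd_usage is an illustrative name; proc_root defaults to /proc.
fd_usage() {
    local pid="$1" proc_root="${2:-/proc}"
    local used soft
    # Each entry under <proc_root>/<pid>/fd is one open descriptor
    # (listing it requires root or the owner of the process).
    used=$(ls "$proc_root/$pid/fd" | wc -l | tr -d ' ')
    soft=$(awk '/^Max open files/ { print $4 }' "$proc_root/$pid/limits")
    echo "pid=$pid open=$used soft_limit=$soft"
}

# Example (master PID taken from the post):
#   fd_usage 3114
```

Running it for the master and for a worker shows which of the two limits is close to exhaustion, since the two are configured in different places.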