My Java program is failing with:
Caused by: java.io.IOException: Too many open files
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:883)...
Here are the key lines from /etc/security/limits.conf. They set the maximum number of open files per user to 500k:
root soft nofile 500000
root hard nofile 500000
* soft nofile 500000
* hard nofile 500000
I ran lsof to count the number of open files, both globally and for the JVM process, and I examined the counters in /proc/sys/fs. All seems OK. My process only has 4301 files open and the limit is 500k:
:~# lsof | wc -l
5526
:~# lsof -uusername | wc -l
4301
:~# cat /proc/sys/fs/file-max
744363
:~# cat /proc/sys/fs/file-nr
4736 0 744363
This is an Ubuntu 11.04 server. I have even rebooted so I am positive these parameters are being used.
I don't know if it's relevant, but the process is started by an upstart script, which starts the process using setuidgid, like this:
exec setuidgid username java $JAVA_OPTS -jar myprogram.jar
What am I missing?
2 Answers
#1
16
It turns out the problem was that my program was running as an upstart init script, and that the exec stanza does not invoke a shell. ulimit and the settings in limits.conf apply only to processes started from a login session or shell, not to processes that init starts directly.
I verified this by changing the exec stanza to:
exec sudo -u username java $JAVA_OPTS -jar program.jar
which runs java through sudo. As far as I can tell, it is the session sudo opens (PAM, via pam_limits), rather than a shell as such, that applies the limits.conf values. That allowed the program to use as many open files as it needs.
I have seen it mentioned that you can also call ulimit -n prior to invoking the command; for an upstart script I think you would use a script stanza instead.
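The ulimit -n approach can be sketched in shell as follows. This is only a minimal sketch, assuming a POSIX-style shell whose ulimit builtin supports -n (bash and dash do); run_with_nofile is a hypothetical helper name, not something from my actual script.

```shell
# Hypothetical helper: set the open-file limit, then exec the command.
# The subshell keeps the changed limit from leaking into the caller.
run_with_nofile() {
    limit="$1"
    shift
    (
        # Without -S or -H, ulimit -n sets both the soft and the hard limit.
        ulimit -n "$limit" || exit 1
        # The limit is inherited by whatever command is exec'd here.
        exec "$@"
    )
}

# Example: show the limit that a child process actually sees.
run_with_nofile 256 sh -c 'ulimit -n'
```

In a script stanza the same idea would be ulimit -n 500000 followed by the original exec setuidgid line. Raising the limit above the current hard limit needs root, so it has to happen before setuidgid drops privileges. Upstart also has a limit stanza (limit nofile SOFT HARD) that should achieve the same thing without a shell, though I have not tried it here.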
I found a better diagnostic than lsof to be ls /proc/{pid}/fd | wc -l, which gives a precise count of the process's open file descriptors. By monitoring that I could see that the failures occurred right at 4096 open fds. I don't know where that 4096 comes from; it's not in /etc anywhere; I guess it's compiled into the kernel.
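A small sketch of that check, assuming a Linux /proc filesystem that provides /proc/{pid}/limits; fd_usage is a hypothetical function name used only for illustration. It prints the descriptor count next to the limits the kernel is actually enforcing for that process, which is where a value like 4096 would show up even when limits.conf says otherwise.

```shell
# Hypothetical helper: report open descriptors and the enforced limit
# for one process. Needs permission to read /proc/<pid>/fd.
fd_usage() {
    pid="$1"
    # One entry per open file descriptor.
    open=$(ls "/proc/$pid/fd" | wc -l)
    echo "open fds: $open"
    # Soft and hard limits as seen by the process itself.
    grep '^Max open files' "/proc/$pid/limits"
}

# Example: inspect the current shell.
fd_usage $$
```

For the Java process, pass its pid instead of $$ and run it as root or as the same user.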
#2
4
I have this snippet of bash at the top of a server creation script:
# Jack up the max number of open file descriptors at the kernel
echo "fs.file-max = 1000000" >> /etc/sysctl.conf
invoke-rc.d procps start
# Increase max open file descriptors for this process
ulimit -n 1000000
# And for future ones as well
cat >> /etc/profile <<LIMITS
ulimit -n 1000000
LIMITS
cat >> /etc/security/limits.conf <<LIMITS
root - nofile 1000000
LIMITS