bash fork error (Resource temporarily unavailable) does not stop, and keeps appearing on every attempt to kill/reboot

Date: 2022-09-22 13:40:47

I mistakenly used a resource-limited server as an iperf server for 5000 parallel connections (the limit is 1024 processes). Now every time I log in, I see this:

-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable

Then I try to kill them, but when I run ps, I get this:

-bash-4.1$ ps
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable

The same happens when I run killall or similar commands. I have even tried to reboot the system, but again this is what I get:

-bash-4.1$ sudo reboot
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
-bash-4.1$ 

So basically I cannot do anything. All commands fail with this error. I can, however, run "exit".

This is an off-site server that I do not have physical access to, so I cannot turn it off/on physically.

Any ideas how I can fix this problem? I would highly appreciate any help.

2 answers

#1


18  

Given that you can log in, you may want to try using exec to run each of your commands. After each exec you will have to log in again, since exec ends your shell (by replacing it with the command you run).

exec won't take up an extra process slot because it replaces the running shell with the program being run. Thus, it should work even though the process limit (ulimit) has already been reached.
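If you want to convince yourself of this on a healthy machine, here is a minimal sketch (the variable names are only for illustration, and the demo itself forks, so it is not something to run on the stuck server). A child bash prints its PID, then uses exec to replace itself with sh, which prints its own PID. Both numbers should be the same, since no new process was created.

```shell
# Run the demo in a child bash so it does not replace your login shell.
# The child prints its PID, then exec replaces it with sh, which prints
# its own PID. No fork happens at the exec step, so the PID is unchanged.
pids=$(bash -c 'echo "$$"; exec sh -c "echo \$\$"')

# Split the two output lines into positional parameters.
set -- $pids
before=$1
after=$2

echo "PID before exec: $before"
echo "PID after exec:  $after"
```

On the stuck server you would use the same idea directly from the login prompt, one exec per login.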

#2


5  

I had the same issue recently. In my case, code running under my user account was consuming almost all the resources, leaving nothing for my commands. Here's what I did: "exec top" to identify the PID consuming the most resources, then "exec kill -9 " followed by that PID to kill it.

After killing the PID, everything came back to normal and I was able to log back in.
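As an alternative that keeps your session alive, the steps above can also be sketched with shell builtins only, so that nothing needs to fork. This assumes a Linux system with /proc mounted; the function and variable names (read_status, list_my_procs, proc_name, proc_uid) are made up for this example. It lists the processes owned by your user by reading /proc/PID/status with read.

```shell
# Sets proc_name and proc_uid for the PID given as $1 (or "self").
# Uses only builtins and redirection, so the shell never forks.
read_status() {
    proc_name='' proc_uid=''
    [ -r "/proc/$1/status" ] || return 1
    while read -r key val rest; do
        case $key in
            Name:) proc_name=$val ;;
            Uid:)  proc_uid=$val; break ;;   # first value is the real UID
        esac
    done 2>/dev/null < "/proc/$1/status"
}

# Prints "PID NAME" for every process whose real UID matches this shell's.
list_my_procs() {
    read_status self || return 1
    my_uid=$proc_uid
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        if read_status "$pid" && [ "$proc_uid" = "$my_uid" ]; then
            echo "$pid $proc_name"
        fi
    done
}

list_my_procs
```

Once you have spotted the iperf PIDs in that list, kill is itself a builtin in bash, so running kill -9 with one of those PIDs should work without exec and without logging you out.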
