Spark shell consuming YARN resources: how can I stop it?

Time: 2023-02-06 21:27:41

Here is my issue: when I use the spark-shell, it consumes a lot of resources and keeps them held, thereby impacting other applications running in parallel.


Say, for example, I run some spark-shell commands and accidentally leave the shell open without closing the session. It keeps holding all the resources it acquired, and other users have nothing to work with until I close my session.


How can I fix this issue from the YARN side?


1 solution

#1



You may want to set up resource pools for YARN in Cloudera. You can allocate some resources to each user, so even if you use up your own share, there is still something left for the other users.

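For illustration only (this is not from the original answer): Cloudera's Dynamic Resource Pools are, under the hood, YARN Fair Scheduler queues, so the same idea can be sketched as a fair-scheduler.xml. The pool names, sizes and weights below are made-up placeholders:

<!-- Sketch of a fair-scheduler.xml with two pools; names, sizes and
     weights are placeholders, not recommendations. -->
<allocations>
  <queue name="adhoc">
    <!-- Interactive spark-shell sessions land here and can never hold
         more than this much of the cluster. -->
    <maxResources>8192 mb,4 vcores</maxResources>
    <weight>1.0</weight>
  </queue>
  <queue name="production">
    <weight>3.0</weight>
  </queue>
  <!-- Apps that name a queue go there; everything else falls back to "adhoc". -->
  <queuePlacementPolicy>
    <rule name="specified"/>
    <rule name="default" queue="adhoc"/>
  </queuePlacementPolicy>
</allocations>

A spark-shell would then be launched with --queue adhoc (Spark's --queue option on YARN) so that it stays confined to that pool.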

If you don't want to split YARN resources between users, you can configure Spark to use dynamic allocation (see the spark.dynamicAllocation.enabled property in http://spark.apache.org/docs/latest/configuration.html). Then, if you leave your spark-shell open after your job has finished, Spark gives the idle executors back to YARN. Note that you can't set a fixed number of executors while using dynamic allocation.

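As a rough sketch (not from the original answer): dynamic allocation has to be enabled when the shell is launched, and on YARN it also needs the external shuffle service running on the NodeManagers. The property names below are standard Spark configuration keys; the numeric values are only placeholders:

# Sketch: interactive spark-shell on YARN with dynamic allocation.
# Values such as maxExecutors and the idle timeout are placeholders.
spark-shell --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=0 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  --conf spark.dynamicAllocation.executorIdleTimeout=60s

With minExecutors=0, executors that sit idle past the timeout are released, so a forgotten shell keeps only its small ApplicationMaster container rather than a fleet of executors.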

Regards, Arnaud
