I have an R analysis composed of three parts (partA, partB, and partC). I submit each part to SLURM (e.g. sbatch partA), and each part is parallelized via #SBATCH --array=1-1500. The parts must run serially, so I need to wait for one to finish before starting the next. Right now I'm starting each job manually, but that's not a great solution.
I would like to automate the three sbatch calls. For example:
- sbatch partA
- when partA is done, sbatch partB
- when partB is done, sbatch partC
I used this solution to get the job ID of partA, and passed that to strigger to accomplish step 2 above. However, I'm stuck at that point, because I don't know how to get the job ID of partB from strigger. Here's what my code looks like:
#!/bin/bash
# step 1: sbatch partA
partA_ID=$(sbatch --parsable partA.sh)
# step 2: sbatch partB
strigger --set --jobid=$partA_ID --fini --program=/path/to/partB.batch
# step 3: sbatch partC
... ?
How do I complete step 3?
1 answer
strigger is not the proper tool for this goal; it is aimed at administrators rather than regular users. Only the slurm user can actually set triggers (see the "Important note" in the strigger manpage).
In your case, you should submit all three jobs at once, with dependencies set among them.
For instance:
$ partA_ID=$(sbatch --parsable partA.sh)
$ partB_ID=$(sbatch --parsable --dependency=afterany:${partA_ID} partB.sh)
$ partC_ID=$(sbatch --parsable --dependency=afterany:${partB_ID} partC.sh)
This will submit three job arrays, but the second one will only start when all jobs in the first one have finished, and the third one will only start when all jobs in the second one have finished. Note that afterany releases the dependent job whatever the exit status of the earlier jobs; use afterok instead if a part should run only when the previous one succeeded.
An alternative is:
$ partA_ID=$(sbatch --parsable partA.sh)
$ partB_ID=$(sbatch --parsable --dependency=aftercorr:${partA_ID} partB.sh)
$ partC_ID=$(sbatch --parsable --dependency=aftercorr:${partB_ID} partC.sh)
This will submit three job arrays, but each job of the second one will not start until the corresponding job in the first one (i.e. the job that has the same $SLURM_ARRAY_TASK_ID) has completed successfully. Likewise, each job in the third one will start only when the corresponding job in the second one has completed successfully.
For more details, see the --dependency section in the sbatch manpage.
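If the chain grows beyond three parts, the same pattern can be wrapped in a small shell function. This is only a sketch: submit_chain is a hypothetical helper name, not part of SLURM, and it assumes sbatch is on the PATH and that each argument after the first is a valid batch script.

```shell
#!/bin/bash
# Sketch: submit scripts as a chain in which each job depends on the previous one.
# submit_chain is a hypothetical helper, not a SLURM command.
# Usage: submit_chain <dependency-type> <script> [<script> ...]
submit_chain() {
    local dep_type=$1; shift      # e.g. afterany, afterok, aftercorr
    local prev_id="" job_id script
    for script in "$@"; do
        if [ -z "$prev_id" ]; then
            # First job in the chain has no dependency.
            job_id=$(sbatch --parsable "$script") || return 1
        else
            job_id=$(sbatch --parsable --dependency="${dep_type}:${prev_id}" "$script") || return 1
        fi
        # With a cluster name, --parsable prints "jobid;cluster"; keep the ID only.
        job_id=${job_id%%;*}
        echo "$script -> job $job_id"
        prev_id=$job_id
    done
}

# Example call (needs a SLURM cluster):
#   submit_chain afterok partA.sh partB.sh partC.sh
```

The function stops at the first failed submission, so later parts are never queued against a job ID that does not exist.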