To maximize CPU usage (I run things on a Debian Lenny instance in EC2) I have a simple script to launch jobs in parallel:
#!/bin/bash
for i in apache-200901*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200902*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200903*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200904*.log; do echo "Processing $i ..."; do_something_important; done &
...
I'm quite satisfied with this working solution; however, I couldn't figure out how to write further code that executes only once all of the loops have completed.
Is there a way to get control of this?
4 Answers
#1
There's a bash builtin command for that.
wait [n ...]
    Wait for each specified process and return its termination status.
    Each n may be a process ID or a job specification; if a job spec is
    given, all processes in that job's pipeline are waited for. If n is
    not given, all currently active child processes are waited for, and
    the return status is zero. If n specifies a non-existent process or
    job, the return status is 127. Otherwise, the return status is the
    exit status of the last process or job waited for.
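Applied to the script in the question, a bare wait with no arguments blocks until every background loop has finished. A minimal sketch (do_something_important is the placeholder from the question):

#!/bin/bash
for i in apache-200901*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200902*.log; do echo "Processing $i ..."; do_something_important; done &
# ... more month loops as in the question ...
wait    # no arguments: block until all child processes have exited
echo "All loops finished."

Note that wait without arguments always returns zero, so it won't tell you whether any individual job failed.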
#2
Using GNU Parallel will make your script even shorter and possibly more efficient:
parallel 'echo "Processing "{}" ..."; do_something_important {}' ::: apache-*.log
This will run one job per CPU core and continue to do that until all files are processed.
Your solution will basically split the jobs into fixed groups before running (here, 32 jobs in 4 groups), so a CPU whose group finishes early sits idle while the other groups are still running.
GNU Parallel instead spawns a new process as soon as one finishes, keeping the CPUs active and thus saving time.
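If one job per core isn't what you want, the -j option sets an explicit limit on the number of simultaneous jobs; a sketch (the limit of 4 is an arbitrary example):

parallel -j 4 'echo "Processing "{}" ..."; do_something_important {}' ::: apache-*.log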
To learn more:
- Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
- Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.
#3
I had to do this recently and ended up with the following solution:
while true; do
  wait -n || {
    code="$?"
    # 127 means there are no children left to wait for
    ([[ $code = "127" ]] && exit 0 || exit "$code")
    break
  }
done
Here's how it works:
wait -n returns as soon as one of the (potentially many) background jobs exits. As long as the jobs keep succeeding it evaluates to true, so the loop goes on until:
- Exit code 127: there are no background jobs left to wait for, i.e. the last one has exited. In that case, we ignore the exit code and exit the sub-shell with code 0.
- Any other non-zero code: one of the background jobs failed. We just exit the sub-shell with that exit code.
With set -e, this guarantees that the script will terminate early and pass through the exit code of any failed background job.
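Put together, a minimal self-contained sketch might look like this (the sleep commands are stand-ins for real jobs; wait -n requires bash 4.3 or later):

#!/bin/bash
set -e
sleep 2 && echo "job 1 done" &    # example background job
sleep 1 && echo "job 2 done" &    # example background job
while true; do
  wait -n || {
    code="$?"
    # 127 means no children are left, i.e. every job finished successfully
    ([[ $code = "127" ]] && exit 0 || exit "$code")
    break
  }
done
echo "all jobs succeeded"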
#4
This is my crude solution:
function run_task {
    cmd=$1
    output=$2
    concurrency=$3
    if [ -f "${output}.done" ]; then
        # experiment already run
        echo "Command already run: $cmd. Found output $output"
        return
    fi
    count=$(jobs -p | wc -l)
    echo "New active task #$count: $cmd > $output"
    $cmd > "$output" && touch "$output.done" &
    # block while the number of active children is at the limit;
    # count was taken before launching the new job, hence >= here and > below
    stop=$((count >= concurrency))
    while [ "$stop" -eq 1 ]; do
        echo "Waiting for $count worker threads..."
        sleep 1
        count=$(jobs -p | wc -l)
        stop=$((count > concurrency))
    done
}
The idea is to use "jobs" to see how many children are active in the background and wait until this number drops (a child exits). Once a child exits, the next task can be started.
As you can see, there is also a bit of extra logic to avoid running the same experiments/commands multiple times. It does the job for me. However, this logic could be either skipped or further improved (e.g., checking file creation timestamps, input parameters, etc.).
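A hypothetical invocation against the question's log files ($i.out and the limit of 4 are made-up examples; the trailing wait lets the last batch of background jobs finish):

for i in apache-2009*.log; do
  run_task "do_something_important $i" "$i.out" 4
done
wait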