I was searching Google for a page offering some simple OpenMp algorithms. Probably there is an example to calculate min, max, median, average from a huge data array but I am not capable to find it.
我正在谷歌搜索一个提供一些简单的OpenMp算法的页面。可能有一个例子来计算巨大数据阵列的最小值,最大值,中值,平均值,但我无法找到它。
At least I would normally try to divide the array into one chunk for each core and do some boundary calculation afterwards to get the result for the complete array.
至少我通常会尝试将数组划分为每个核心的一个块,然后进行一些边界计算以获得完整数组的结果。
I just didn't want to reinvent the wheel.
我只是不想重新发明*。
Additional Remark: I know that there are thousands of examples that work with simple reduction. e.g. Calculating PI.
补充说明:我知道有成千上万的例子可以简单地减少。例如计算PI。
const int num_steps = 100000;
double x, sum = 0.0;
const double step = 1.0/double(num_steps);
#pragma omp parallel for reduction(+:sum) private(x)
for (int i=1;i<= num_steps; i++){
x = double(i-0.5)*step;
sum += 4.0/(1.0+x*x);
}
const double pi = step * sum;
but when these kind of algorithms aren't usable there are almost no examples left for reducing algorithms.
但是当这些算法不可用时,几乎没有留下用于减少算法的例子。
4 个解决方案
#1
23
OpenMP (at least 2.0) supports reduction for some simple operations, but not for max and min.
OpenMP(至少2.0)支持减少一些简单的操作,但不支持max和min。
In the following example the reduction
clause is used to make a sum and a critical
section is used to update a shared variable using a thread-local one without conflicts.
在以下示例中,reduction子句用于生成总和,而临界部分用于使用线程本地更新共享变量而不会发生冲突。
#include <iostream>
#include <cmath>
int main()
{
double sum = 0;
uint64_t ii;
uint64_t maxii = 0;
uint64_t maxii_shared = 0;
#pragma omp parallel shared(maxii_shared) private(ii) firstprivate(maxii)
{
#pragma omp for reduction(+:sum) nowait
for(ii=0; ii<10000000000; ++ii)
{
sum += std::pow((double)ii, 2.0);
if(ii > maxii) maxii = ii;
}
#pragma omp critical
{
if(maxii > maxii_shared) maxii_shared = maxii;
}
}
std::cerr << "Sum: " << sum << " (" << maxii_shared << ")" << std::endl;
}
EDIT: a cleaner implementation:
编辑:更清洁的实现:
#include <cmath>
#include <limits>
#include <vector>
#include <iostream>
#include <algorithm>
#include <tr1/random>
// sum the elements of v
double sum(const std::vector<double>& v)
{
double sum = 0.0;
#pragma omp parallel for reduction(+:sum)
for(size_t ii=0; ii< v.size(); ++ii)
{
sum += v[ii];
}
return sum;
}
// extract the minimum of v
double min(const std::vector<double>& v)
{
double shared_min;
#pragma omp parallel
{
double min = std::numeric_limits<double>::max();
#pragma omp for nowait
for(size_t ii=0; ii<v.size(); ++ii)
{
min = std::min(v[ii], min);
}
#pragma omp critical
{
shared_min = std::min(shared_min, min);
}
}
return shared_min;
}
// generate a random vector and use sum and min functions.
int main()
{
using namespace std;
using namespace std::tr1;
std::tr1::mt19937 engine(time(0));
std::tr1::uniform_real<> unigen(-1000.0,1000.0);
std::tr1::variate_generator<std::tr1::mt19937,
std::tr1::uniform_real<> >gen(engine, unigen);
std::vector<double> random(1000000);
std::generate(random.begin(), random.end(), gen);
cout << "Sum: " << sum(random) << " Mean:" << sum(random)/random.size()
<< " Min:" << min(random) << endl;
}
#2
9
in OpenMP 3.1 onwards one can implement for min, max through reduction clause, you can have a look at detailed example covering this in this link.
在OpenMP 3.1中可以通过简化子句实现min,max,你可以在这个链接中查看详细的例子。
#3
4
OpenMP doesn't support these reduction operations. Consider Intel Threading Building Blocks' parallel_reduce algorithm, where you can implement arbitrary algorithm.
OpenMP不支持这些还原操作。考虑英特尔线程构建模块的parallel_reduce算法,您可以在其中实现任意算法。
Here an example. It uses summation of partial results. You may implement any function you want.
这是一个例子。它使用部分结果的总和。您可以实现任何您想要的功能。
#include <stdio.h>
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
#include <tbb/task_scheduler_init.h>
///////////////////////////////////////////////////////////////////////////////
class PiCalculation
{
private:
long num_steps;
double step;
public:
// Pi partial value
double pi;
// Calculate partial value
void operator () (const tbb::blocked_range<long> &r)
{
double sum = 0.0;
long end = r.end();
for (int i = r.begin(); i != end; i++)
{
double x = (i + 0.5) * step;
sum += 4.0/(1.0 + x * x);
}
pi += sum * step;
}
// Combine results. Here you can implement any functions
void join(PiCalculation &p)
{
pi += p.pi;
}
PiCalculation(PiCalculation &p, tbb::split)
{
pi = 0.0;
num_steps = p.num_steps;
step = p.step;
}
PiCalculation(long steps)
{
pi = 0.0;
num_steps = steps;
step = 1./(double)num_steps;
}
};
///////////////////////////////////////////////////////////////////////////////
int main()
{
tbb::task_scheduler_init init;
const long steps = 100000000;
PiCalculation pi(steps);
tbb::parallel_reduce(tbb::blocked_range<long>(0, steps, 1000000), pi);
printf ("Pi is %3.20f\n", pi.pi);
}
Please check this link for additional reduction algorithms. http://cache-www.intel.com/cd/00/00/30/11/301132_301132.pdf#page=19 Please look through paragraph 3.3.1. There is an example on finding minimum value in an array.
请检查此链接以获取其他减少算法。 http://cache-www.intel.com/cd/00/00/30/11/301132_301132.pdf#page=19请仔细阅读第3.3.1段。有一个关于在数组中找到最小值的示例。
#4
3
This are typical reduction problems.
这是典型的减少问题。
Besides the page pointed by Suvesh, you might have a look at the documentation for the reduction clause.
除了Suvesh指出的页面,您可以查看还原子句的文档。
#1
23
OpenMP (at least 2.0) supports reduction for some simple operations, but not for max and min.
OpenMP(至少2.0)支持减少一些简单的操作,但不支持max和min。
In the following example the reduction
clause is used to make a sum and a critical
section is used to update a shared variable using a thread-local one without conflicts.
在以下示例中,reduction子句用于生成总和,而临界部分用于使用线程本地更新共享变量而不会发生冲突。
#include <iostream>
#include <cmath>
int main()
{
double sum = 0;
uint64_t ii;
uint64_t maxii = 0;
uint64_t maxii_shared = 0;
#pragma omp parallel shared(maxii_shared) private(ii) firstprivate(maxii)
{
#pragma omp for reduction(+:sum) nowait
for(ii=0; ii<10000000000; ++ii)
{
sum += std::pow((double)ii, 2.0);
if(ii > maxii) maxii = ii;
}
#pragma omp critical
{
if(maxii > maxii_shared) maxii_shared = maxii;
}
}
std::cerr << "Sum: " << sum << " (" << maxii_shared << ")" << std::endl;
}
EDIT: a cleaner implementation:
编辑:更清洁的实现:
#include <cmath>
#include <limits>
#include <vector>
#include <iostream>
#include <algorithm>
#include <tr1/random>
// sum the elements of v
double sum(const std::vector<double>& v)
{
double sum = 0.0;
#pragma omp parallel for reduction(+:sum)
for(size_t ii=0; ii< v.size(); ++ii)
{
sum += v[ii];
}
return sum;
}
// extract the minimum of v
double min(const std::vector<double>& v)
{
double shared_min;
#pragma omp parallel
{
double min = std::numeric_limits<double>::max();
#pragma omp for nowait
for(size_t ii=0; ii<v.size(); ++ii)
{
min = std::min(v[ii], min);
}
#pragma omp critical
{
shared_min = std::min(shared_min, min);
}
}
return shared_min;
}
// generate a random vector and use sum and min functions.
int main()
{
using namespace std;
using namespace std::tr1;
std::tr1::mt19937 engine(time(0));
std::tr1::uniform_real<> unigen(-1000.0,1000.0);
std::tr1::variate_generator<std::tr1::mt19937,
std::tr1::uniform_real<> >gen(engine, unigen);
std::vector<double> random(1000000);
std::generate(random.begin(), random.end(), gen);
cout << "Sum: " << sum(random) << " Mean:" << sum(random)/random.size()
<< " Min:" << min(random) << endl;
}
#2
9
in OpenMP 3.1 onwards one can implement for min, max through reduction clause, you can have a look at detailed example covering this in this link.
在OpenMP 3.1中可以通过简化子句实现min,max,你可以在这个链接中查看详细的例子。
#3
4
OpenMP doesn't support these reduction operations. Consider Intel Threading Building Blocks' parallel_reduce algorithm, where you can implement arbitrary algorithm.
OpenMP不支持这些还原操作。考虑英特尔线程构建模块的parallel_reduce算法,您可以在其中实现任意算法。
Here an example. It uses summation of partial results. You may implement any function you want.
这是一个例子。它使用部分结果的总和。您可以实现任何您想要的功能。
#include <stdio.h>
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
#include <tbb/task_scheduler_init.h>
///////////////////////////////////////////////////////////////////////////////
class PiCalculation
{
private:
long num_steps;
double step;
public:
// Pi partial value
double pi;
// Calculate partial value
void operator () (const tbb::blocked_range<long> &r)
{
double sum = 0.0;
long end = r.end();
for (int i = r.begin(); i != end; i++)
{
double x = (i + 0.5) * step;
sum += 4.0/(1.0 + x * x);
}
pi += sum * step;
}
// Combine results. Here you can implement any functions
void join(PiCalculation &p)
{
pi += p.pi;
}
PiCalculation(PiCalculation &p, tbb::split)
{
pi = 0.0;
num_steps = p.num_steps;
step = p.step;
}
PiCalculation(long steps)
{
pi = 0.0;
num_steps = steps;
step = 1./(double)num_steps;
}
};
///////////////////////////////////////////////////////////////////////////////
int main()
{
tbb::task_scheduler_init init;
const long steps = 100000000;
PiCalculation pi(steps);
tbb::parallel_reduce(tbb::blocked_range<long>(0, steps, 1000000), pi);
printf ("Pi is %3.20f\n", pi.pi);
}
Please check this link for additional reduction algorithms. http://cache-www.intel.com/cd/00/00/30/11/301132_301132.pdf#page=19 Please look through paragraph 3.3.1. There is an example on finding minimum value in an array.
请检查此链接以获取其他减少算法。 http://cache-www.intel.com/cd/00/00/30/11/301132_301132.pdf#page=19请仔细阅读第3.3.1段。有一个关于在数组中找到最小值的示例。
#4
3
This are typical reduction problems.
这是典型的减少问题。
Besides the page pointed by Suvesh, you might have a look at the documentation for the reduction clause.
除了Suvesh指出的页面,您可以查看还原子句的文档。