OpenMP(C / C ++):在线程之间共享unordered_map >和vector 的有效方法

时间:2022-11-07 13:49:42

I have a for loop that I would like to make parallel, however the threads must share an unordered_map and a vector.

我有一个我希望并行的for循环,但是线程必须共享一个unordered_map和一个向量。

Because the for loop is somewhat big I will post here a concise overview of it so that I can make my main problem clear. Please read the comments.

因为for循环有点大,所以我将在这里简要介绍它,以便我可以清楚地解决主要问题。请阅读评论。

   unordered_map<string, vector<int>> sharedUM;

   /*
      here I call a function that updates the unordered_map with some
      initial data, however the unordered_map will need to be updated by
      the threads inside the for loop
   */

   vector<int> sharedVector;
  /* 
     the shared vector initially is empty, the threads will 
    fill it with integers, the order of these integers should be in ascending
    order, however I can simply sort the array after all the 
    threads finish executing so I guess we can assume that the order 
    does not matter
  */

   #pragma omp parallel for
   for(int i=0; i<N; i++){

      key = generate_a_key_value_according_to_an_algorithm();
      std::unordered_map<string, vector<int>::iterator it = sharedUM.find(key);

      /*
       according to the data inside it->second(the value), 
       the thread makes some conclusions which then
       uses in order to figure out whether 
       it should run a high complexity algorithm
       or not.
      */
       bool conclusion = make_conclusion();

       if(conclusion == true){

           results = run_expensive_algorithm();

          /*
             According to the results, 
             the thread updates some values of
             the key that it previously searched for inside the unordered_map
             this update may help other threads avoid running 
             the expensive algorithm
          */

       }

       sharedVector.push_back(i);

   }

Initially I left the code as it is, so I just used that #pragma over the for loop, however I got a few problems regarding the update of the sharedVector. So I decided to use simple locks in order to force a thread acquire the lock before writing to the vector. So in my implementation I had something like this:

最初我保留了代码,所以我只是在for循环中使用了#pragma,但是我遇到了一些关于sharedVector更新的问题。所以我决定使用简单的锁来强制线程在写入向量之前获取锁。所以在我的实现中,我有这样的事情:

      omp_lock_t sharedVectorLock;
      omp_init_lock(&sharedVectorLock);
      ...
      for(...)
      ...
       omp_set_lock(&sharedVectorLock);
       sharedVector.push_back(i);
       omp_unset_lock(&sharedVectorLock);
      ...
      omp_destroy_lock(&sharedVectorLock);

I had run my application many times and everything seemed to be working great, and that's until I decided to rerun it automatically too many times until I got wrong results. Because I'm very new to the world of OpenMP and the threads in general, I wasn't aware of the fact that we should lock all the readers when a writer is updating some shared data. As you can see here in my application the threads always read some data from the unordered_map in order make some conclusions and learn things about the key that was assigned to them. What happens though if two threads have to work with the same key, and while some other thread is trying to read the values of this key, another one has reached the point of updating those values? I believe that's where my problem occurs.

我已经多次运行我的应用程序,一切似乎工作得很好,直到我决定自动重新运行它太多次,直到我得到错误的结果。因为我对OpenMP世界和一般的线程都很陌生,所以我不知道当作者更新某些共享数据时我们应该锁定所有读者。正如您在我的应用程序中看到的那样,线程总是从unordered_map中读取一些数据,以便得出一些结论并了解分配给它们的密钥。如果两个线程必须使用相同的密钥,并且其他一些线程试图读取该密钥的值,另一个线程已达到更新这些值的程度,会发生什么?我相信这就是我的问题所在。

However my main problem right now is that I'm not sure what would be the best way to avoid such things from happening. It's like my system works for 99% of the time, but that 1% ruins everything because two threads are rarely assigned with the same key which in turn is because my unordered_map is usually big.

然而,我现在的主要问题是,我不确定什么是避免这种事情发生的最佳方法。这就像我的系统99%的时间都在工作,但是1%会破坏一切,因为两个线程很少被分配相同的密钥,而这又是因为我的unordered_map通常很大。

Would locking the unordered_map do my job? Most likely, but that wouldn't be efficient because a thread A that wants to work with the key x would have to wait for a thread B that is already working with the key y where y can be different than x to finish.

锁定unordered_map会完成我的工作吗?最有可能,但这不会有效,因为想要使用键x的线程A必须等待已经使用键y的线程B,其中y可以与x不同以完成。

So my main question is, how should I approach this problem? How can I lock the unordered_map if and only if two threads are working with the same key?

所以我的主要问题是,我应该如何处理这个问题?当且仅当两个线程使用相同的密钥时,如何锁定unordered_map?

Thank you in advance

先谢谢你

3 个解决方案

#1


1  

1 on using locks and mutexes. You must declare and initialise the lock variables outside of the parallel block (before #pragma omp parallel) and then use them inside the parallel block: (1) acquire a lock (this may block if another thread has locked it), (2) change the variable with the race condition, (3) release the lock. Finally, destroy it after exiting the parallel block. A lock declared inside the parallel block is local to the thread and hence cannot provide synchronisation. This may explain your problems.

1使用锁和互斥锁。你必须声明并初始化并行块之外的锁变量(在#pragma omp parallel之前),然后在并行块中使用它们:(1)获取一个锁(如果另一个线程锁定它,这可能会阻塞),(2)用竞争条件改变变量,(3)释放锁。最后,在退出并行块后将其销毁。在并行块内声明的锁是线程的本地锁,因此无法提供同步。这可以解释你的问题。

2 on writing into complicated C++ containers. OpenMP was designed originally for simple FORTRAN do loops (similar to C/C++ for loops with integer control variables). Everything more complicated will give you headache. To be on the safe side, any non-constant operation on a C++ container must be performed within a lock (use the same lock for any such operation on the same container) or omp critical region (use the same name for any such operation on the same container). This includes pop() and push() etc, anything but simple reads. This can only remain efficient if such non-constant container operations take only a tiny fraction of the time.

2写入复杂的C ++容器。 OpenMP最初是为简单的FORTRAN do循环设计的(类似于带有整数控制变量的循环的C / C ++)。更复杂的一切会让你头疼。为了安全起见,C ++容器上的任何非常量操作必须在锁内执行(对同一容器上的任何此类操作使用相同的锁)或omp关键区域(对任何此类操作使用相同的名称)同一个容器)。这包括pop()和push()等,除了简单的读取之外的任何东西。只有这种非恒定的容器操作只需要很短的时间,这才能保持有效。

3 If I were you, I wouldn't bother with openMP (I have used it but am regretting this now). With C++ you could use TBB, which also comes with some threadsafe but lock-free containers. It also allows you to think in terms of tasks, not threads, which are executed recursively (a parent task spawns child tasks, etc), but TBB has some simple implementations for parallel for loops, for instance.

3如果我是你,我不会打扰openMP(我已经使用过了,但我现在后悔了)。使用C ++,您可以使用TBB,它还附带一些线程安全但无锁的容器。它还允许您考虑任务,而不是线程,递归执行(父任务产生子任务等),但TBB有一些简单的并行for循环实现。

#2


0  

An alternative approach would be to use TBB's concurrent_unordered_map. http://threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_unordered_map_cls.htm

另一种方法是使用TBB的concurrent_unordered_map。 http://threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_unordered_map_cls.htm

You don't have to use the rest of TBB's parallelism support (though if you're starting from scratch in C++ it's certainly more "c++-ish" than OpenMP).

您不必使用TBB的其余并行支持(尽管如果您从头开始使用C ++,它肯定比OpenMP更“c ++ - ish”)。

#3


-1  

May be this could help:

可能这可能会有所帮助:

    vector<bool> sv(N);

replace

    sharedVector.push_back(i);   

by

    sv[i]=true;

this allows to avoid locks (very time consuming) and sharedVector can easily be sorted, e.g

这允许避免锁定(非常耗时)并且可以容易地对sharedVector进行分类,例如

    for(int i=0; i<N;i++){
        if(sv[i])sharedVector.push_back(i);
    }

#1


1  

1 on using locks and mutexes. You must declare and initialise the lock variables outside of the parallel block (before #pragma omp parallel) and then use them inside the parallel block: (1) acquire a lock (this may block if another thread has locked it), (2) change the variable with the race condition, (3) release the lock. Finally, destroy it after exiting the parallel block. A lock declared inside the parallel block is local to the thread and hence cannot provide synchronisation. This may explain your problems.

1使用锁和互斥锁。你必须声明并初始化并行块之外的锁变量(在#pragma omp parallel之前),然后在并行块中使用它们:(1)获取一个锁(如果另一个线程锁定它,这可能会阻塞),(2)用竞争条件改变变量,(3)释放锁。最后,在退出并行块后将其销毁。在并行块内声明的锁是线程的本地锁,因此无法提供同步。这可以解释你的问题。

2 on writing into complicated C++ containers. OpenMP was designed originally for simple FORTRAN do loops (similar to C/C++ for loops with integer control variables). Everything more complicated will give you headache. To be on the safe side, any non-constant operation on a C++ container must be performed within a lock (use the same lock for any such operation on the same container) or omp critical region (use the same name for any such operation on the same container). This includes pop() and push() etc, anything but simple reads. This can only remain efficient if such non-constant container operations take only a tiny fraction of the time.

2写入复杂的C ++容器。 OpenMP最初是为简单的FORTRAN do循环设计的(类似于带有整数控制变量的循环的C / C ++)。更复杂的一切会让你头疼。为了安全起见,C ++容器上的任何非常量操作必须在锁内执行(对同一容器上的任何此类操作使用相同的锁)或omp关键区域(对任何此类操作使用相同的名称)同一个容器)。这包括pop()和push()等,除了简单的读取之外的任何东西。只有这种非恒定的容器操作只需要很短的时间,这才能保持有效。

3 If I were you, I wouldn't bother with openMP (I have used it but am regretting this now). With C++ you could use TBB, which also comes with some threadsafe but lock-free containers. It also allows you to think in terms of tasks, not threads, which are executed recursively (a parent task spawns child tasks, etc), but TBB has some simple implementations for parallel for loops, for instance.

3如果我是你,我不会打扰openMP(我已经使用过了,但我现在后悔了)。使用C ++,您可以使用TBB,它还附带一些线程安全但无锁的容器。它还允许您考虑任务,而不是线程,递归执行(父任务产生子任务等),但TBB有一些简单的并行for循环实现。

#2


0  

An alternative approach would be to use TBB's concurrent_unordered_map. http://threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_unordered_map_cls.htm

另一种方法是使用TBB的concurrent_unordered_map。 http://threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_unordered_map_cls.htm

You don't have to use the rest of TBB's parallelism support (though if you're starting from scratch in C++ it's certainly more "c++-ish" than OpenMP).

您不必使用TBB的其余并行支持(尽管如果您从头开始使用C ++,它肯定比OpenMP更“c ++ - ish”)。

#3


-1  

May be this could help:

可能这可能会有所帮助:

    vector<bool> sv(N);

replace

    sharedVector.push_back(i);   

by

    sv[i]=true;

this allows to avoid locks (very time consuming) and sharedVector can easily be sorted, e.g

这允许避免锁定(非常耗时)并且可以容易地对sharedVector进行分类,例如

    for(int i=0; i<N;i++){
        if(sv[i])sharedVector.push_back(i);
    }