Random function in a multi-threaded C program

Please see the whole question.

I know that srand() should be called only once, but my second code segment shows that this does not solve the issue!

The program I have written produces output that I can't explain, and different variations of the code give different outputs.

Objective of the code:
The code uses OpenMP (omp) to run a block of code on 3 threads. Each thread prints 3 random values using the rand() function, so there are 9 outputs in total. Thread 0 is the main thread, i.e. the main program's flow of execution. Threads 1 and 2 are the new threads created at the start of the parallel block.
The code:

#include<omp.h>
#include<stdio.h>
#include<stdlib.h>
#include<time.h>

int main()
{


     #pragma omp parallel num_threads(3)
    {
        srand(time(NULL));
        int i=0;
        for(i=0;i<3;i++)
        {
            printf("\nRandom number: %d by thread %d", rand(), omp_get_thread_num());
        }
    }

    return 0;
}

The output:

Random number: 17105 by thread 0
Random number: 30076 by thread 0
Random number: 21481 by thread 0
Random number: 17105 by thread 1
Random number: 30076 by thread 1
Random number: 21481 by thread 1
Random number: 17105 by thread 2
Random number: 30076 by thread 2
Random number: 21481 by thread 2

But if I put the srand(time(NULL)) before the parallel block, like this:

 srand(time(NULL));  
 #pragma omp parallel num_threads(3)
{
    int i=0;
    ......
    ......(rest is same)

The output is:

Random number: 16582 by thread 0
Random number: 14267 by thread 0
Random number: 14030 by thread 0
Random number: 41 by thread 1
Random number: 18467 by thread 1
Random number: 6334 by thread 1
Random number: 41 by thread 2
Random number: 18467 by thread 2
Random number: 6334 by thread 2

The Problem, and my doubts:

By placing the `srand` inside, all the threads' 1st calls to `rand()` gave the same random number, all of their 2nd calls gave the same random number, and likewise for their 3rd calls.
By placing the `srand` outside, the main thread's calls produced different random numbers from the others. BUT the 2 new threads still gave the same random number as each other for their respective calls to `rand()`.

So,

1. What is actually happening here? Why does the placement of the `srand()` function make a difference only to the main thread (thread `0`)?
2. Why is it that, either way, the 2 new threads always output the same random number for their respective calls to `rand()`?
3. How are `srand()` and `rand()` even linked, to cause this abnormality?

I also tried giving each thread a wait interval, to rule out the possibility that `rand()` was being called by different threads at the same time, which might result in the same random number. But the problem was exactly as before: no change in the output (only the time at which the output appeared was different).

Please help me understand this whole thing.

4 answers

#1

Updated: Inserted direct answers to the OP's enumerated questions.

What is actually happening here?

Although some versions of the rand() function may be "thread safe" in some sense, there is no reason to believe or expect that without any external memory synchronization, the set of values returned by multiple rand() calls executed by different threads will be the same as the set of values returned by the same number of calls all executed by one thread. In particular, rand() maintains internal state that is modified on each call, and without any memory synchronization, it is entirely plausible that one thread will not see updates to that internal state that are performed by other threads. In that case, two or more threads may generate partially or wholly the same sequence of numbers.

How come the placement of the srand() function make a difference only to the main thread (thread 0)?

The only thing that can be said for certain is that if the srand() is outside the parallel block then it is executed only by the main thread, whereas if it is inside then it is executed separately by each thread. Inasmuch as your code is not properly synchronized, the effects of each case are not predictable from the source code, so my next comments are mostly speculative.

Supposing that time(), with its (only) one-second precision, returns the same value in each thread, placing srand() inside the parallel region ensures that every thread sees the same initial random number seed. If they then do not see each other's updates, they will generate the same sequences of pseudo-random numbers. Note, however, that you can neither safely rely on the threads seeing each other's updates nor safely rely on them not seeing each other's updates.

If you put the srand() outside the parallel region, however, so that it is executed only by the main thread, then there are additional possibilities. If OMP maintains a thread pool whose members are already started before you enter the parallel section, then it may be that threads 1 and 2 fail to see the effect of thread 0's srand() call at all, and therefore both proceed with the default seed. There are other possibilities.

Why is it that either way the other 2 new threads always output the same random number for the respective call to rand()?

It's impossible to say with any certainty. I'm inclined to guess, however, that none of the threads involved see each other's updates to rand()'s internal state.

How is this srand() and rand() even linked, to cause this abnormality?

The two functions are intimately linked. The purpose of srand() is to modify rand()'s internal state (to "seed" it, hence the "s" in "srand"), so as to start the pseudo-random number sequence it generates at a different (but still deterministic) point.
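
To make that link concrete, here is a minimal single-threaded sketch. The helper name `draw_after_seed` is made up for illustration and does not come from the question. It shows that reseeding with the same value replays the same sequence, which is what you would expect to see if several generators were all seeded from the same `time(NULL)` value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: reseed rand() with `seed`, then store the next
 * n values in out[]. Deliberately single-threaded. */
static void draw_after_seed(unsigned seed, int *out, int n)
{
    assert(out != NULL);
    srand(seed);                 /* resets rand()'s hidden state */
    for (int i = 0; i < n; i++) {
        out[i] = rand();         /* each call advances that state */
    }
}
```

Calling `draw_after_seed(42u, a, 3)` and then `draw_after_seed(42u, b, 3)` fills `a` and `b` with identical values, because the C standard requires the sequence to repeat for a repeated seed.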

This problem can be solved in the same way that any problem involving multi-threaded access to shared variables can be solved: by applying synchronization. In this case, the most straightforward form of synchronization would probably be to protect rand() calls with a mutex. Since this is OMP code, your best bet might be to use OMP locks to implement a mutex, as it seems dicey to mix explicit pthreads objects with OMP declarations.
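
As a rough sketch of that idea: the code below swaps in a named `critical` section instead of explicit OMP lock objects, since it gives the same mutual exclusion with less ceremony. The names `locked_rand`, `fill_random` and `rand_lock` are invented for this example. It also assumes that rand()'s state is shared by all threads; on a platform where each thread has its own state, seeding once outside the parallel region would not reach the worker threads.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: only one thread at a time may execute rand(). */
static int locked_rand(void)
{
    int value;
    #pragma omp critical(rand_lock)
    {
        value = rand();
    }
    return value;
}

/* Seed once, before any thread starts drawing, then fill out[0..n-1]
 * from a parallel loop. */
static void fill_random(int *out, int n, unsigned seed)
{
    assert(out != NULL);
    srand(seed);
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        out[i] = locked_rand();
    }
}
```

Which thread receives which value still depends on scheduling, so the output order can vary from run to run even though every access is now synchronized. If the program is built without OpenMP support the pragmas are ignored and the code simply runs serially.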

#2

Random number generators are actually not so random. They take some internal state (the "seed"), deterministically extract an integer from that state, and deterministically mutate the state so that it will be different on the next call.

Normally, the computations involved are complex bit manipulations intended to guarantee that the sequence of outputs "looks" random, is well-distributed over the possible range, and satisfies other requirements. But at base, it is a deterministic function on a global internal state. Without the complicated computations, it would not be much different from this:

/* File: not_so_random.c */
static unsigned seed = 1;
void srand(unsigned newseed) { seed = newseed; }
int  rand(void)              { return seed++; }

([Note 1])

It's pretty easy to see how that would produce a race condition if it were executed in parallel threads.

You could make this "kind of" multithread safe by making seed atomic. Even if the mutation were more complicated than an atomic increment, making access atomic would ensure that the next seed was the result of the mutation made by some call to rand. Still, a race condition is possible: two threads could both pick up the seed at the same time, and then they would receive the same random number. Other odd behaviours are also possible, including the one where a thread gets the same random number twice, or even an earlier one. And particularly odd behaviours are possible if srand is being called simultaneously with rand, since that is always a race condition.
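
One way to close the specific "two threads pick up the same seed" window is a compare-and-swap loop, sketched below under the assumption that C11 `<stdatomic.h>` is available. The names `rng_state`, `atomic_srand` and `atomic_rand` are illustrative only, and the update step is the linear congruential formula from the C standard's sample rand().

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared generator state; every load and store goes through atomics. */
static _Atomic unsigned long rng_state = 1;

static void atomic_srand(unsigned seed)
{
    atomic_store(&rng_state, (unsigned long)seed);
}

static int atomic_rand(void)
{
    unsigned long old = atomic_load(&rng_state);
    unsigned long next;
    do {
        /* Compute the successor of the state we last observed. */
        next = old * 1103515245UL + 12345UL;
        /* Publish it only if nobody changed the state meanwhile;
         * on failure `old` is refreshed and the loop retries. */
    } while (!atomic_compare_exchange_weak(&rng_state, &old, next));
    return (int)((next / 65536UL) % 32768UL);
}
```

With this loop each state value is consumed by exactly one caller, so two threads cannot both advance from the same seed. It does nothing for the srand-versus-rand race described above, and the assignment of numbers to threads remains non-deterministic.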

On the other hand, you could protect all calls to rand and srand with a mutex, which would avoid all race conditions as long as srand is called before threading starts. (Otherwise, any call to srand in one thread will reset the random number sequence in every other thread.) However, if multiple threads are simultaneously consuming lots of random numbers, you'll see a lot of mutex contention and possibly synchronization artefacts. [Note 2].

In a multiprocessing world, library functions which depend on global state are not so great, and many of the old interfaces have multithread-safe alternatives. POSIX requires rand_r, which is similar to rand except that it expects as an argument the address of a seed variable. With this interface, each thread can simply use its own seed, and the threads will effectively have independent random number generators. [Note 3]

Of course, these seeds have to be initialized in some way, and it would obviously be counterproductive to initialize them all to the same value, since that would result in each thread getting the same random number sequence.

In this sample code, I used the system /dev/urandom device to provide a few seed bytes for each thread. /dev/urandom is implemented by the operating system (or, at least, by many OSs); it produces a highly-random stream of bytes. Usually, the randomness of this stream is reinforced by mixing in random events, like the timing of keyboard interrupts. It's a moderately expensive way to generate a random number, but it will generate quite a good random number. So that's perfect for producing random seeds for each thread: I want the seeds to be random, and I'm not going to need very many of them. [Note 4]

So here's one possible implementation:

#define _POSIX_C_SOURCE 200809L  /* ask the C library to declare rand_r() */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
// This needs to be the maximum number of threads.
// I presume there is a way to find the correct value.
#define THREAD_COUNT 3
// Hand-built alternative to thread-local storage
unsigned int seed[THREAD_COUNT];

int main(void) {
  // Read one random seed per thread from the operating system.
  FILE* r = fopen("/dev/urandom", "r");
  if (r == NULL) exit(1);
  if (fread(seed, sizeof seed, 1, r) != 1) exit(1);
  fclose(r);

#pragma omp parallel num_threads(THREAD_COUNT)
  {
    // Get the address of this thread's RNG seed.
    // rand_r() takes an unsigned int*, matching the seed array.
    unsigned int* seedp = &seed[omp_get_thread_num()];
    for (int i = 0; i < 3; i++) {
      printf("Random number: %d by thread %d\n",
             rand_r(seedp), omp_get_thread_num());
    }
  }

  return 0;
}
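
If the thread count is not known at compile time, the seed-reading step can be pulled out into a function that takes the count as a parameter. The helper below is only a sketch: the name `fill_seeds_from_urandom` is invented here, and it assumes a system that provides /dev/urandom.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: fill seeds[0..count-1] with bytes read from
 * /dev/urandom. Returns 0 on success and -1 on any failure. */
static int fill_seeds_from_urandom(unsigned int *seeds, size_t count)
{
    assert(seeds != NULL);
    FILE *r = fopen("/dev/urandom", "rb");
    if (r == NULL) {
        return -1;
    }
    size_t got = fread(seeds, sizeof *seeds, count, r);
    fclose(r);
    return got == count ? 0 : -1;
}
```

In an OpenMP program, one plausible source for `count` is `omp_get_max_threads()`, which gives an upper bound on the team size of a parallel region that has no `num_threads` clause of its own. A region that does carry such a clause needs at least that many seeds.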

Notes:

1. While that example is roughly as random as the famous xkcd on the subject, the following is an example of a rand implementation taken (with minor edits) straight from the C standard (C11 §7.22.2.2, para. 5), which is judged to be a "sufficiently-random" implementation of rand. The similarity with my example is evident.
```
/* RAND_MAX assumed to be 32767. */
static unsigned long next = 1;
int rand(void) {
    next = next * 1103515245 + 12345;
    return((unsigned)(next/65536) % 32768);
}

void srand(unsigned seed) { next = seed; }
```
2. Neither the C standard nor POSIX requires rand to be thread-safe, but it is not prohibited either. The GNU implementation of the standard C library (glibc) protects rand() and srand() with a mutex. But obviously that is not the rand implementation being used by OP, because glibc's rand() produces much larger random numbers.
3. If your system does not have rand_r, you could use a simple modification of the sample code from Note 1 above:
```
int rand_r(unsigned *seedp) {
    *seedp = *seedp * 1103515245 + 12345;
    return((unsigned)(*seedp/65536) % 32768);
}
```
4. If your OS does not provide /dev/urandom, then it is quite possible that your OS is Windows, in which case you can use rand_s to generate the one-time seed.

#3

The reason is that time() has only one-second precision, so each thread calls srand() with the same seed, which leads to the same pseudo-random number sequence.

Just call srand() once at the beginning of the program rather than in each thread; that would make every run of the program generate 3 different sequences, one for each thread.

#4

It looks like, on your platform, rand() is thread safe because each thread gets its own PRNG that is seeded when the thread is created. You could get this platform to do what you want by first generating a seed for each thread and then having each thread call srand with its seed before it calls rand. But that might break on other platforms with different behavior, so you just shouldn't use rand.
