This is a follow-on from a previously posted question:
How to generate a random number in C?
I wish to be able to generate a random number within a particular range, such as 1 to 6 to mimic the sides of a die.
How would I go about doing this?
10 solutions
#1
148
All the answers so far are mathematically wrong. Returning rand() % N does not uniformly give a number in the range [0, N) unless N divides the length of the interval into which rand() returns, that is, unless N divides RAND_MAX + 1 (when RAND_MAX + 1 is a power of 2, as is typical, N must be a power of 2 too). Furthermore, one has no idea whether the moduli of rand() are independent: it's possible that they go 0, 1, 2, ..., which is uniform but not very random. The only assumption it seems reasonable to make is that rand() behaves like a Poisson process: any two nonoverlapping subintervals of the same size are equally likely and independent. For a finite set of values, this implies a uniform distribution and also ensures that the values of rand() are nicely scattered.
This means that the only correct way of changing the range of rand() is to divide it into boxes; for example, if RAND_MAX == 11 and you want a range of 1..6, you should assign {0,1} to 1, {2,3} to 2, and so on. These are disjoint, equally-sized intervals and thus are uniformly and independently distributed.
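As a minimal sketch of this binning, assume a hypothetical toy generator whose raw values lie in 0..11 (the RAND_MAX == 11 case from the example). The names TOY_RAND_MAX, SIDES and toy_die_face are made up for illustration and are not part of any library:

```c
#include <assert.h>

/* Hypothetical toy generator range: raw values 0..11 (RAND_MAX == 11). */
enum { TOY_RAND_MAX = 11, SIDES = 6 };

/* Map a raw value in [0, TOY_RAND_MAX] to a die face in [1, SIDES]
 * by splitting the raw range into SIDES boxes of equal size. */
int toy_die_face(int raw)
{
    const int box_size = (TOY_RAND_MAX + 1) / SIDES;  /* 12 / 6 == 2 */

    assert(raw >= 0 && raw <= TOY_RAND_MAX);
    return 1 + raw / box_size;  /* {0,1} -> 1, {2,3} -> 2, ..., {10,11} -> 6 */
}
```

Each face has exactly two raw values behind it, so a uniform raw value gives a uniform face. When the number of raw values is not a multiple of the number of boxes, the leftover values have to be rejected rather than assigned.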
The suggestion to use floating-point division is mathematically plausible but suffers from rounding issues in principle. Perhaps double is high-enough precision to make it work; perhaps not. I don't know and I don't want to have to figure it out; in any case, the answer is system-dependent.
The correct way is to use integer arithmetic. That is, you want something like the following:
    #include <stdlib.h> // For random(), RAND_MAX

    // Assumes 0 <= max <= RAND_MAX
    // Returns in the closed interval [0, max]
    long random_at_most(long max) {
      unsigned long
        // max <= RAND_MAX < ULONG_MAX, so this is okay.
        num_bins = (unsigned long) max + 1,
        num_rand = (unsigned long) RAND_MAX + 1,
        bin_size = num_rand / num_bins,
        defect   = num_rand % num_bins;

      long x;
      do {
        x = random();
      }
      // This is carefully written not to overflow
      while (num_rand - defect <= (unsigned long)x);

      // Truncated division is intentional
      return x/bin_size;
    }
The loop is necessary to get a perfectly uniform distribution. For example, if you are given random numbers from 0 to 2 and you want only ones from 0 to 1, you just keep pulling until you don't get a 2; it's not hard to check that this gives 0 or 1 with equal probability. This method is also described in the link that nos gave in their answer, though coded differently. I'm using random() rather than rand() as it has a better distribution (as noted by the man page for rand()).
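The equal-probability claim can be checked with a short calculation. Assuming each draw is uniform on {0, 1, 2} and independent of the others, the output is 0 exactly when some number k of rejected 2s is followed by a 0:

```latex
P(\text{output} = 0)
  = \sum_{k=0}^{\infty} \left(\tfrac{1}{3}\right)^{k} \cdot \tfrac{1}{3}
  = \frac{1/3}{1 - 1/3}
  = \frac{1}{2}
```

By symmetry the same holds for 1. Each draw is accepted with probability 2/3, so the expected number of draws is 1 / (2/3) = 1.5.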
If you want to get random values outside the default range [0, RAND_MAX], then you have to do something tricky. Perhaps the most expedient is to define a function random_extended() that pulls n bits (using random_at_most()) and returns in [0, 2**n), and then apply random_at_most() with random_extended() in place of random() (and 2**n - 1 in place of RAND_MAX) to pull a random value less than 2**n, assuming you have a numerical type that can hold such a value. Finally, of course, you can get values in [min, max] using min + random_at_most(max - min), including negative values.
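The last step, shifting the result into [min, max], might look like the sketch below. It is not the answer's own code: at_most() and in_range() are hypothetical names, and the helper is built on rand() rather than random() so that it needs only the C standard library. It assumes min <= max and max - min <= RAND_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: uniform value in [0, max], assuming max <= RAND_MAX.
 * Same binning-plus-rejection idea as random_at_most(), but using rand(). */
static long at_most(long max)
{
    assert(max >= 0 && max <= RAND_MAX);

    unsigned long num_bins = (unsigned long)max + 1;
    unsigned long num_rand = (unsigned long)RAND_MAX + 1;
    unsigned long bin_size = num_rand / num_bins;
    unsigned long defect   = num_rand % num_bins;
    unsigned long x;

    do {
        x = (unsigned long)rand();
    } while (x >= num_rand - defect);  /* reject the incomplete last bin */

    return (long)(x / bin_size);
}

/* Uniform value in the closed interval [min, max]; min may be negative.
 * Assumes min <= max and max - min <= RAND_MAX. */
long in_range(long min, long max)
{
    assert(min <= max);
    return min + at_most(max - min);
}
```

A die roll would then be in_range(1, 6), and in_range(-3, 3) covers a range that straddles zero.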
#2
32
Following on from @Ryan Reich's answer, I thought I'd offer my cleaned up version. The first bounds check isn't required given the second bounds check, and I've made it iterative rather than recursive. It returns values in the range [min, max], where max >= min and 1+max-min < RAND_MAX.

    unsigned int
    rand_interval(unsigned int min, unsigned int max)
    {
        int r;
        const unsigned int range = 1 + max - min;
        const unsigned int buckets = RAND_MAX / range;
        const unsigned int limit = buckets * range;

        /* Create equal size buckets all in a row, then fire randomly towards
         * the buckets until you land in one of them. All buckets are equally
         * likely. If you land off the end of the line of buckets, try again. */
        do
        {
            r = rand();
        } while (r >= limit);

        return min + (r / buckets);
    }
#3
16
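    unsigned int
    randr(unsigned int min, unsigned int max)
    {
        /* Divide by RAND_MAX + 1.0 so that scaled stays strictly below 1;
         * dividing by RAND_MAX would let the result reach max + 1. */
        double scaled = (double)rand() / (RAND_MAX + 1.0);

        return (max - min + 1) * scaled + min;
    }

See here for other options.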
#4
16
If you know the min and max values of a range, this formula generates numbers in that range, inclusive of both ends:

    r = (rand() % (max + 1 - min)) + min
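How far the modulo formula is from uniform can be seen by counting, for every possible raw value, which result it maps to. The sketch below does that for a generator with raw values 0..raw_max; modulo_histogram is a hypothetical helper written for illustration, not a library function.

```c
#include <assert.h>

/* For every raw value in [0, raw_max], count which result raw % n
 * it maps to. counts must have room for n entries. */
void modulo_histogram(unsigned raw_max, unsigned n, unsigned *counts)
{
    unsigned raw = 0;

    assert(n > 0);
    for (unsigned i = 0; i < n; i++)
        counts[i] = 0;

    /* do/while so that the loop also terminates when raw_max == UINT_MAX */
    do {
        counts[raw % n]++;
    } while (raw++ != raw_max);
}
```

For an assumed toy generator with raw_max = 15 and n = 6, the counts come out as 3, 3, 3, 3, 2, 2: results 0 to 3 are each 1.5 times as likely as results 4 and 5. With a real RAND_MAX the imbalance is smaller but does not vanish unless n divides RAND_MAX + 1.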
#5
11
Wouldn't you just do:
    srand(time(NULL));
    int r = ( rand() % 6 ) + 1;

% is the modulus operator. Essentially it will just divide by 6 and return the remainder... from 0 - 5
#6
7
For those who understand the bias problem but can't stand the unpredictable run-time of rejection-based methods, this series produces a progressively less biased random integer in the [0, n-1] interval:

    r = n / 2;
    r = (rand() * n + r) / (RAND_MAX + 1);
    r = (rand() * n + r) / (RAND_MAX + 1);
    r = (rand() * n + r) / (RAND_MAX + 1);
    ...

It does so by synthesising a high-precision fixed-point random number of i * log_2(RAND_MAX + 1) bits (where i is the number of iterations) and performing a long multiplication by n.
When the number of bits is sufficiently large compared to n, the bias becomes immeasurably small.
It does not matter if RAND_MAX + 1 is less than n (as in this question), or if it is not a power of two, but care must be taken to avoid integer overflow if RAND_MAX * n is large.
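One way to sidestep the overflow concern is to do the arithmetic in 64 bits. The following is a sketch under stated assumptions, not the answer's own code: scaled_rand and its iterations parameter are hypothetical names, n is at least 1, and RAND_MAX is at most 2^31 - 1 so that (RAND_MAX + 1) * n fits in a uint64_t for any 32-bit n.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Iterated scaling with a fixed number of rand() calls.
 * Assumes n >= 1, iterations >= 1 and RAND_MAX <= 2^31 - 1. */
uint32_t scaled_rand(uint32_t n, int iterations)
{
    const uint64_t m = (uint64_t)RAND_MAX + 1;
    uint64_t r = n / 2;  /* rounding term for the first step */

    assert(n >= 1 && iterations >= 1);
    for (int i = 0; i < iterations; i++) {
        /* rand() <= m - 1 and r <= n - 1, so the numerator is at most
         * m * n - 1 and the quotient stays in [0, n - 1]. */
        r = ((uint64_t)rand() * n + r) / m;
    }
    return (uint32_t)r;
}
```

The run time is fixed by iterations, and each extra iteration adds log_2(RAND_MAX + 1) bits of precision, so a residual bias remains but shrinks with every iteration.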
#7
4
In order to avoid the modulo bias of the approach suggested in other answers, you can always use:

    arc4random_uniform(MAX-MIN)+MIN

where "MAX" is the upper bound and "MIN" is the lower bound. Because arc4random_uniform(n) returns a value below n, MAX itself is never returned; use MAX-MIN+1 to include it. For example, for numbers from 10 up to (but not including) 20:

    arc4random_uniform(20-10)+10
    arc4random_uniform(10)+10

Simple solution and better than using "rand() % N".
#8
4
Here is a slightly simpler algorithm than Ryan Reich's solution:

    /// Begin and end are *inclusive*; => [begin, end]
    uint32_t getRandInterval(uint32_t begin, uint32_t end) {
        uint32_t range = (end - begin) + 1;
        uint32_t limit = ((uint64_t)RAND_MAX + 1) - (((uint64_t)RAND_MAX + 1) % range);

        /* Imagine range-sized buckets all in a row, then fire randomly towards
         * the buckets until you land in one of them. All buckets are equally
         * likely. If you land off the end of the line of buckets, try again. */
        uint32_t randVal = rand();
        while (randVal >= limit) randVal = rand();

        /// Return the position you hit in the bucket + begin as random number
        return (randVal % range) + begin;
    }

Example (RAND_MAX := 16, begin := 2, end := 7)
    => range := 6  (1 + end - begin)
    => limit := 12 (RAND_MAX + 1) - ((RAND_MAX + 1) % range)

The limit is always a multiple of the range,
so we can split it into range-sized buckets:
    Possible-rand-output:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
    Buckets:              [0, 1, 2, 3, 4, 5][0, 1, 2, 3, 4, 5][X, X, X, X, X]
    Buckets + begin:      [2, 3, 4, 5, 6, 7][2, 3, 4, 5, 6, 7][X, X, X, X, X]

1st call to rand() => 13
    → 13 is not in the bucket-range anymore (>= limit), while-condition is true
        → retry...
2nd call to rand() => 7
    → 7 is in the bucket-range (< limit), while-condition is false
        → Get the corresponding bucket-value 1 (randVal % range) and add begin
    => 3
#9
2
While Ryan is correct, the solution can be much simpler based on what is known about the source of the randomness. To re-state the problem:
- There is a source of randomness, outputting integer numbers in range [0, MAX) with uniform distribution.
- The goal is to produce uniformly distributed random integer numbers in range [rmin, rmax] where 0 <= rmin < rmax < MAX.
In my experience, if the number of bins (or "boxes") is significantly smaller than the range of the original numbers, and the original source is cryptographically strong - there is no need to go through all that rigamarole, and simple modulo division would suffice (like output = rnd.next() % (rmax+1), if rmin == 0), and produce random numbers that are distributed uniformly "enough", and without any loss of speed. The key factor is the randomness source (i.e., kids, don't try this at home with rand()).
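How uniform "enough" is can be bounded directly. As a sketch, assume the source is exactly uniform on [0, M) and write M = qn + s with 0 <= s < n, where n is the number of bins (q and s are symbols introduced here for illustration). Under x mod n, s of the bins receive q + 1 source values and the rest receive q, so for every bin b:

```latex
\left| P(b) - \frac{1}{n} \right| < \frac{1}{M}
\qquad\Longrightarrow\qquad
\frac{\left| P(b) - 1/n \right|}{1/n} < \frac{n}{M}
```

For n = 22 and a 64-bit source (M = 2^64) the relative bias is below 22 / 2^64, roughly 1.2e-18, far below anything a sampling test could detect. With a single byte as the source (M = 256) it would be several percent, which is why the size of the source range matters as much as its quality.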
Here's an example/proof of how it works in practice. I wanted to generate random numbers from 1 to 22, having a cryptographically strong source that produced random bytes (based on Intel RDRAND). The results are:
    Rnd distribution test (22 boxes, numbers of entries in each box):
     1: 409443    4.55%
     2: 408736    4.54%
     3: 408557    4.54%
     4: 409125    4.55%
     5: 408812    4.54%
     6: 409418    4.55%
     7: 408365    4.54%
     8: 407992    4.53%
     9: 409262    4.55%
    10: 408112    4.53%
    11: 409995    4.56%
    12: 409810    4.55%
    13: 409638    4.55%
    14: 408905    4.54%
    15: 408484    4.54%
    16: 408211    4.54%
    17: 409773    4.55%
    18: 409597    4.55%
    19: 409727    4.55%
    20: 409062    4.55%
    21: 409634    4.55%
    22: 409342    4.55%
    total:      100.00%
This is as close to uniform as I need for my purpose (fair dice throw, generating cryptographically strong codebooks for WWII cipher machines such as http://users.telenet.be/d.rijmenants/en/kl-7sim.htm, etc). The output does not show any appreciable bias.
Here's the source of the cryptographically strong (true) random number generator: Intel Digital Random Number Generator, and sample code that produces 64-bit (unsigned) random numbers.

    int rdrand64_step(unsigned long long int *therand)
    {
        unsigned long long int foo;
        int cf_error_status;

        asm("rdrand %%rax; \
            mov $1,%%edx; \
            cmovae %%rax,%%rdx; \
            mov %%edx,%1; \
            mov %%rax, %0;":"=r"(foo),"=r"(cf_error_status)::"%rax","%rdx");
        *therand = foo;
        return cf_error_status;
    }
I compiled it on Mac OS X with clang-6.0.1 (straight), and with gcc-4.8.3 using "-Wa,q" flag (because GAS does not support these new instructions).
#10
1
As said before, modulo isn't sufficient because it skews the distribution. Here's my code, which masks off bits and uses them to ensure the distribution isn't skewed.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static uint32_t randomInRange(uint32_t a,uint32_t b) {
        uint32_t v;
        uint32_t range;
        uint32_t upper;
        uint32_t lower;
        uint32_t mask;

        if(a == b) {
            return a;
        }

        if(a > b) {
            upper = a;
            lower = b;
        } else {
            upper = b;
            lower = a;
        }

        range = upper - lower;

        mask = 0;
        //XXX calculate range with log and mask? nah, too lazy :).
        while(1) {
            if(mask >= range) {
                break;
            }
            mask = (mask << 1) | 1;
        }

        /* Draw masked values until one falls inside the range. */
        while(1) {
            v = rand() & mask;
            if(v <= range) {
                return lower + v;
            }
        }
    }

The following simple code lets you look at the distribution:

    int main() {
        unsigned long long int i;
        unsigned int n = 10;
        unsigned int numbers[n];

        for (i = 0; i < n; i++) {
            numbers[i] = 0;
        }

        for (i = 0 ; i < 10000000 ; i++){
            /* Named value rather than rand to avoid shadowing rand(). */
            uint32_t value = randomInRange(0,n - 1);
            if(value >= n){
                printf("bug: value out of range %u\n",(unsigned int)value);
                return 1;
            }
            numbers[value] += 1;
        }

        for(i = 0; i < n; i++) {
            printf("%llu: %u\n",i,numbers[i]);
        }
    }