如何为TSQL Select中的每一行生成随机数?

时间:2020-12-06 09:08:53

I need a different random number for each row in my table. The following seemingly obvious code uses the same random value for each row.

表中的每一行都需要一个不同的随机数。下面看似明显的代码对每一行使用相同的随机值。

SELECT table_name, RAND() magic_number 
FROM information_schema.tables 

I'd like to get a INT or a FLOAT out this. The rest of the story is I'm going to use the random number to create an random date offset from a know date, e.g. 1-14 days offset from a start date.

我想要一个INT或者浮点数。故事的其余部分是,我将使用随机数来创建一个自定义日期的随机日期偏移量,例如从开始日期起1-14天的偏移量。

This is for Microsoft SQL Server 2000.

这是针对Microsoft SQL Server 2000的。

15 个解决方案

#1


389  

Take a look at SQL Server - Set based random numbers which has a very detailed explanation.

看看SQL Server - Set随机数,它有一个非常详细的解释。

To summarize, the following code generates a random number between 0 and 13 inclusive with a normalized distribution:

综上所述,以下代码生成一个0到13之间的随机数,其中包含一个标准化的分布:

ABS(CHECKSUM(NewId())) % 14

To change your range, just change the number at the end of the expression. Be extra careful if you need a range that includes both positive and negative numbers. If you do it wrong, it's possible to double-count the number 0.

要更改范围,只需更改表达式末尾的数字。如果你需要一个包含正数和负数的范围,要格外小心。如果你做错了,有可能重复计算数字0。

A small warning for the math nuts in the room: there is a very slight bias in this code. CHECKSUM() results in numbers that are uniform across the entire range of the sql Int datatype, or at least as near so as my (the editor) testing can show. However, there will be some bias when CHECKSUM() produces a number at the very top end of that range. Any time you get a number between the maximum possible integer and the last exact multiple of the size of your desired range (14 in this case) before that maximum integer, those results are favored over the remaining portion of your range that cannot be produced from that last multiple of 14.

对于房间里的数学迷们来说,这是一个小小的警告:这段代码有一个非常小的偏差。CHECKSUM()的结果是,在sql Int数据类型的整个范围内都是一致的,或者至少与我(编辑器)测试所显示的接近。但是,当CHECKSUM()在该范围的最顶端生成一个数字时,会产生一些偏差。任何时候你得到最大可能的整数之间的数量最后确切的你想要的大小的倍数范围(在本例中14日)之前最大的整数,这些结果支持在剩下的部分范围内无法产生最后14的倍数。

As an example, imagine the entire range of the Int type is only 19. 19 is the largest possible integer you can hold. When CHECKSUM() results in 14-19, these correspond to results 0-5. Those numbers would be heavily favored over 6-13, because CHECKSUM() is twice as likely to generate them. It's easier to demonstrate this visually. Below is the entire possible set of results for our imaginary integer range:

例如,假设Int类型的整个范围只有19。19是你能容纳的最大整数。当CHECKSUM()结果为14-19时,这些结果对应于0-5。这些数字在6-13以上会受到极大的欢迎,因为CHECKSUM()生成这些数字的可能性是6-13的两倍。更容易在视觉上表现出来。下面是我们假想整数范围的全部可能结果集:

Checksum Integer: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Range Result:     0 1 2 3 4 5 6 7 8 9 10 11 12 13  0  1  2  3  4  5

You can see here that there are more chances to produce some numbers than others: bias. Thankfully, the actual range of the Int type is much larger... so much so that in most cases the bias is nearly undetectable. However, it is something to be aware of if you ever find yourself doing this for serious security code.

你可以看到这里有更多的机会产生一些数字:偏见。值得庆幸的是,Int类型的实际范围要大得多……如此之多,以至于在大多数情况下,偏差几乎无法察觉。但是,如果您发现自己正在为严重的安全代码做此工作,则需要注意这一点。

#2


77  

When called multiple times in a single batch, rand() returns the same number.

当在单个批中多次调用时,rand()返回相同的数字。

I'd suggest using convert(varbinary,newid()) as the seed argument:

我建议使用convert(varbinary,newid()作为种子参数:

SELECT table_name, 1.0 + floor(14 * RAND(convert(varbinary, newid()))) magic_number 
FROM information_schema.tables

newid() is guaranteed to return a different value each time it's called, even within the same batch, so using it as a seed will prompt rand() to give a different value each time.

newid()保证在每次调用时返回一个不同的值,即使是在同一批中,因此将它作为种子使用将会提示rand()每次给出不同的值。

Edited to get a random whole number from 1 to 14.

编辑后得到一个从1到14的随机数。

#3


55  

RAND(CHECKSUM(NEWID()))

The above will generate a (pseudo-) random number between 0 and 1, exclusive. If used in a select, because the seed value changes for each row, it will generate a new random number for each row (it is not guaranteed to generate a unique number per row however).

上面将生成一个(伪-)随机数,在0到1之间,独占。如果在select中使用,因为每一行的种子值都在变化,它将为每一行生成一个新的随机数(但不能保证每一行都生成一个唯一的随机数)。

Example when combined with an upper limit of 10 (produces numbers 1 - 10):

当与10的上限结合时(产生数字1 - 10):

CAST(RAND(CHECKSUM(NEWID())) * 10 as INT) + 1

Transact-SQL Documentation:

transact - sql批量文档:

  1. CAST(): https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
  2. 把():https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
  3. RAND(): http://msdn.microsoft.com/en-us/library/ms177610.aspx
  4. RAND():http://msdn.microsoft.com/en-us/library/ms177610.aspx
  5. CHECKSUM(): http://msdn.microsoft.com/en-us/library/ms189788.aspx
  6. 校验和():http://msdn.microsoft.com/en-us/library/ms189788.aspx
  7. NEWID(): https://docs.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql
  8. NEWID():https://docs.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql

#4


21  

Random number generation between 1000 and 9999:

1000到9999之间的随机数生成:

FLOOR(RAND(CHECKSUM(NEWID()))*(9999-1000)+1000)

#5


12  

Answering the old question, but this answer has not been provided previously, and hopefully this will be useful for someone finding this results through a search engine.

回答旧的问题,但是之前没有提供这个答案,希望这对通过搜索引擎找到这个结果的人有用。

With SQL Server 2008, a new function has been introduced, CRYPT_GEN_RANDOM(8), which uses CryptoAPI to produce a cryptographically strong random number, returned as VARBINARY(8000). Here's the documentation page: https://docs.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql

使用SQL Server 2008,引入了一个新的函数CRYPT_GEN_RANDOM(8),它使用CryptoAPI生成一个加密的强随机数,返回为VARBINARY(8000)。这里是文档页面:https://docs.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql

So to get a random number, you can simply call the function and cast it to the necessary type:

要得到随机数,只需调用函数并将其转换为必要的类型:

select CAST(CRYPT_GEN_RANDOM(8) AS bigint)

or to get a float between -1 and +1, you could do something like this:

或者得到-1和+1之间的浮动,你可以这样做:

select CAST(CRYPT_GEN_RANDOM(8) AS bigint) % 1000000000 / 1000000000.0

#6


11  

The Rand() function will generate the same random number, if used in a table SELECT query. Same applies if you use a seed to the Rand function. An alternative way to do it, is using this:

如果在表选择查询中使用,那么Rand()函数将生成相同的随机数。如果在Rand函数中使用种子,也同样适用。另一种方法是:

SELECT ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS [RandomNumber]

Got the information from here, which explains the problem very well.

从这里得到的信息可以很好地解释这个问题。

#7


5  

Do you have an integer value in each row that you could pass as a seed to the RAND function?

是否在每一行中都有一个可以作为种子传递给RAND函数的整数值?

To get an integer between 1 and 14 I believe this would work:

要得到1到14之间的整数,我相信这是可行的:

FLOOR( RAND(<yourseed>) * 14) + 1

#8


4  

try using a seed value in the RAND(seedInt). RAND() will only execute once per statement that is why you see the same number each time.

尝试在RAND(seedInt)中使用种子值。RAND()将只执行每个语句一次,这就是为什么每次都看到相同的数字。

#9


4  

If you don't need it to be an integer, but any random unique identifier, you can use newid()

如果您不需要它是整数,而是任何随机唯一标识符,那么可以使用newid()

SELECT table_name, newid() magic_number 
FROM information_schema.tables

#10


4  

You would need to call RAND() for each row. Here is a good example

您需要为每一行调用RAND()。这里有一个很好的例子

https://web.archive.org/web/20090216200320/http://dotnet.org.za/calmyourself/archive/2007/04/13/sql-rand-trap-same-value-per-row.aspx

https://web.archive.org/web/20090216200320/http:/ / dotnet.org.za / calmyourself /归档/ 2007/04/13 / sql-rand-trap-same-value-per-row.aspx

#11


4  

If you need to preserve your seed so that it generates the "same" random data every time, you can do the following:

如果您需要保存您的种子,以便它每次都生成“相同”的随机数据,您可以执行以下操作:

1. Create a view that returns select rand()

1。创建返回select rand()的视图

if object_id('cr_sample_randView') is not null
begin
    drop view cr_sample_randView
end
go

create view cr_sample_randView
as
select rand() as random_number
go

2. Create a UDF that selects the value from the view.

2。创建一个UDF,从视图中选择值。

if object_id('cr_sample_fnPerRowRand') is not null
begin
    drop function cr_sample_fnPerRowRand
end
go

create function cr_sample_fnPerRowRand()
returns float
as
begin
    declare @returnValue float
    select @returnValue = random_number from cr_sample_randView
    return @returnValue
end
go

3. Before selecting your data, seed the rand() function, and then use the UDF in your select statement.

3所示。在选择数据之前,输入rand()函数,然后在select语句中使用UDF。

select rand(200);   -- see the rand() function
with cte(id) as
(select row_number() over(order by object_id) from sys.all_objects)
select 
    id,
    dbo.cr_sample_fnPerRowRand()
from cte
where id <= 1000    -- limit the results to 1000 random numbers

#12


3  

select round(rand(checksum(newid()))*(10)+20,2)

Here the random number will come in between 20 and 30. round will give two decimal place maximum.

这里的随机数在20到30之间。四舍五入将给出两个十进制的最大值。

If you want negative numbers you can do it with

如果你想要负数,你可以用它

select round(rand(checksum(newid()))*(10)-60,2)

Then the min value will be -60 and max will be -50.

最小值是-60,最大值是-50。

#13


2  

select newid()

选择newid()

or possibly this select binary_checksum(newid())

或者这个select binary_checksum(newid())

#14


2  

The problem I sometimes have with the selected "Answer" is that the distribution isn't always even. If you need a very even distribution of random 1 - 14 among lots of rows, you can do something like this (my database has 511 tables, so this works. If you have less rows than you do random number span, this does not work well):

对于所选的“答案”,我有时遇到的问题是,分布并不总是均匀的。如果您需要在许多行中随机分布1 - 14,那么您可以这样做(我的数据库有511个表,所以这是有效的。如果你的行数比你的随机数跨度要少,这就不能很好地工作):

SELECT table_name, ntile(14) over(order by newId()) randomNumber 
FROM information_schema.tables

This kind of does the opposite of normal random solutions in the sense that it keeps the numbers sequenced and randomizes the other column.

这种方法与常规随机解相反,因为它保持数字的顺序,并随机化另一列。

Remember, I have 511 tables in my database (which is pertinent only b/c we're selecting from the information_schema). If I take the previous query and put it into a temp table #X, and then run this query on the resulting data:

请记住,我的数据库中有511个表(这只与我们从information_schema中选择的b/c有关)。如果我将前面的查询放到一个临时表#X中,然后在结果数据上运行这个查询:

select randomNumber, count(*) ct from #X
group by randomNumber

I get this result, showing me that my random number is VERY evenly distributed among the many rows:

我得到这个结果,告诉我我的随机数在很多行中分布很均匀

如何为TSQL Select中的每一行生成随机数?

#15


1  

select ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) as [Randomizer]

has always worked for me

一直为我工作吗

#1


389  

Take a look at SQL Server - Set based random numbers which has a very detailed explanation.

看看SQL Server - Set随机数,它有一个非常详细的解释。

To summarize, the following code generates a random number between 0 and 13 inclusive with a normalized distribution:

综上所述,以下代码生成一个0到13之间的随机数,其中包含一个标准化的分布:

ABS(CHECKSUM(NewId())) % 14

To change your range, just change the number at the end of the expression. Be extra careful if you need a range that includes both positive and negative numbers. If you do it wrong, it's possible to double-count the number 0.

要更改范围,只需更改表达式末尾的数字。如果你需要一个包含正数和负数的范围,要格外小心。如果你做错了,有可能重复计算数字0。

A small warning for the math nuts in the room: there is a very slight bias in this code. CHECKSUM() results in numbers that are uniform across the entire range of the sql Int datatype, or at least as near so as my (the editor) testing can show. However, there will be some bias when CHECKSUM() produces a number at the very top end of that range. Any time you get a number between the maximum possible integer and the last exact multiple of the size of your desired range (14 in this case) before that maximum integer, those results are favored over the remaining portion of your range that cannot be produced from that last multiple of 14.

对于房间里的数学迷们来说,这是一个小小的警告:这段代码有一个非常小的偏差。CHECKSUM()的结果是,在sql Int数据类型的整个范围内都是一致的,或者至少与我(编辑器)测试所显示的接近。但是,当CHECKSUM()在该范围的最顶端生成一个数字时,会产生一些偏差。任何时候你得到最大可能的整数之间的数量最后确切的你想要的大小的倍数范围(在本例中14日)之前最大的整数,这些结果支持在剩下的部分范围内无法产生最后14的倍数。

As an example, imagine the entire range of the Int type is only 19. 19 is the largest possible integer you can hold. When CHECKSUM() results in 14-19, these correspond to results 0-5. Those numbers would be heavily favored over 6-13, because CHECKSUM() is twice as likely to generate them. It's easier to demonstrate this visually. Below is the entire possible set of results for our imaginary integer range:

例如,假设Int类型的整个范围只有19。19是你能容纳的最大整数。当CHECKSUM()结果为14-19时,这些结果对应于0-5。这些数字在6-13以上会受到极大的欢迎,因为CHECKSUM()生成这些数字的可能性是6-13的两倍。更容易在视觉上表现出来。下面是我们假想整数范围的全部可能结果集:

Checksum Integer: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Range Result:     0 1 2 3 4 5 6 7 8 9 10 11 12 13  0  1  2  3  4  5

You can see here that there are more chances to produce some numbers than others: bias. Thankfully, the actual range of the Int type is much larger... so much so that in most cases the bias is nearly undetectable. However, it is something to be aware of if you ever find yourself doing this for serious security code.

你可以看到这里有更多的机会产生一些数字:偏见。值得庆幸的是,Int类型的实际范围要大得多……如此之多,以至于在大多数情况下,偏差几乎无法察觉。但是,如果您发现自己正在为严重的安全代码做此工作,则需要注意这一点。

#2


77  

When called multiple times in a single batch, rand() returns the same number.

当在单个批中多次调用时,rand()返回相同的数字。

I'd suggest using convert(varbinary,newid()) as the seed argument:

我建议使用convert(varbinary,newid()作为种子参数:

SELECT table_name, 1.0 + floor(14 * RAND(convert(varbinary, newid()))) magic_number 
FROM information_schema.tables

newid() is guaranteed to return a different value each time it's called, even within the same batch, so using it as a seed will prompt rand() to give a different value each time.

newid()保证在每次调用时返回一个不同的值,即使是在同一批中,因此将它作为种子使用将会提示rand()每次给出不同的值。

Edited to get a random whole number from 1 to 14.

编辑后得到一个从1到14的随机数。

#3


55  

RAND(CHECKSUM(NEWID()))

The above will generate a (pseudo-) random number between 0 and 1, exclusive. If used in a select, because the seed value changes for each row, it will generate a new random number for each row (it is not guaranteed to generate a unique number per row however).

上面将生成一个(伪-)随机数,在0到1之间,独占。如果在select中使用,因为每一行的种子值都在变化,它将为每一行生成一个新的随机数(但不能保证每一行都生成一个唯一的随机数)。

Example when combined with an upper limit of 10 (produces numbers 1 - 10):

当与10的上限结合时(产生数字1 - 10):

CAST(RAND(CHECKSUM(NEWID())) * 10 as INT) + 1

Transact-SQL Documentation:

transact - sql批量文档:

  1. CAST(): https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
  2. 把():https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
  3. RAND(): http://msdn.microsoft.com/en-us/library/ms177610.aspx
  4. RAND():http://msdn.microsoft.com/en-us/library/ms177610.aspx
  5. CHECKSUM(): http://msdn.microsoft.com/en-us/library/ms189788.aspx
  6. 校验和():http://msdn.microsoft.com/en-us/library/ms189788.aspx
  7. NEWID(): https://docs.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql
  8. NEWID():https://docs.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql

#4


21  

Random number generation between 1000 and 9999:

1000到9999之间的随机数生成:

FLOOR(RAND(CHECKSUM(NEWID()))*(9999-1000)+1000)

#5


12  

Answering the old question, but this answer has not been provided previously, and hopefully this will be useful for someone finding this results through a search engine.

回答旧的问题,但是之前没有提供这个答案,希望这对通过搜索引擎找到这个结果的人有用。

With SQL Server 2008, a new function has been introduced, CRYPT_GEN_RANDOM(8), which uses CryptoAPI to produce a cryptographically strong random number, returned as VARBINARY(8000). Here's the documentation page: https://docs.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql

使用SQL Server 2008,引入了一个新的函数CRYPT_GEN_RANDOM(8),它使用CryptoAPI生成一个加密的强随机数,返回为VARBINARY(8000)。这里是文档页面:https://docs.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql

So to get a random number, you can simply call the function and cast it to the necessary type:

要得到随机数,只需调用函数并将其转换为必要的类型:

select CAST(CRYPT_GEN_RANDOM(8) AS bigint)

or to get a float between -1 and +1, you could do something like this:

或者得到-1和+1之间的浮动,你可以这样做:

select CAST(CRYPT_GEN_RANDOM(8) AS bigint) % 1000000000 / 1000000000.0

#6


11  

The Rand() function will generate the same random number, if used in a table SELECT query. Same applies if you use a seed to the Rand function. An alternative way to do it, is using this:

如果在表选择查询中使用,那么Rand()函数将生成相同的随机数。如果在Rand函数中使用种子,也同样适用。另一种方法是:

SELECT ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS [RandomNumber]

Got the information from here, which explains the problem very well.

从这里得到的信息可以很好地解释这个问题。

#7


5  

Do you have an integer value in each row that you could pass as a seed to the RAND function?

是否在每一行中都有一个可以作为种子传递给RAND函数的整数值?

To get an integer between 1 and 14 I believe this would work:

要得到1到14之间的整数,我相信这是可行的:

FLOOR( RAND(<yourseed>) * 14) + 1

#8


4  

try using a seed value in the RAND(seedInt). RAND() will only execute once per statement that is why you see the same number each time.

尝试在RAND(seedInt)中使用种子值。RAND()将只执行每个语句一次,这就是为什么每次都看到相同的数字。

#9


4  

If you don't need it to be an integer, but any random unique identifier, you can use newid()

如果您不需要它是整数,而是任何随机唯一标识符,那么可以使用newid()

SELECT table_name, newid() magic_number 
FROM information_schema.tables

#10


4  

You would need to call RAND() for each row. Here is a good example

您需要为每一行调用RAND()。这里有一个很好的例子

https://web.archive.org/web/20090216200320/http://dotnet.org.za/calmyourself/archive/2007/04/13/sql-rand-trap-same-value-per-row.aspx

https://web.archive.org/web/20090216200320/http:/ / dotnet.org.za / calmyourself /归档/ 2007/04/13 / sql-rand-trap-same-value-per-row.aspx

#11


4  

If you need to preserve your seed so that it generates the "same" random data every time, you can do the following:

如果您需要保存您的种子,以便它每次都生成“相同”的随机数据,您可以执行以下操作:

1. Create a view that returns select rand()

1。创建返回select rand()的视图

if object_id('cr_sample_randView') is not null
begin
    drop view cr_sample_randView
end
go

create view cr_sample_randView
as
select rand() as random_number
go

2. Create a UDF that selects the value from the view.

2。创建一个UDF,从视图中选择值。

if object_id('cr_sample_fnPerRowRand') is not null
begin
    drop function cr_sample_fnPerRowRand
end
go

create function cr_sample_fnPerRowRand()
returns float
as
begin
    declare @returnValue float
    select @returnValue = random_number from cr_sample_randView
    return @returnValue
end
go

3. Before selecting your data, seed the rand() function, and then use the UDF in your select statement.

3所示。在选择数据之前,输入rand()函数,然后在select语句中使用UDF。

select rand(200);   -- see the rand() function
with cte(id) as
(select row_number() over(order by object_id) from sys.all_objects)
select 
    id,
    dbo.cr_sample_fnPerRowRand()
from cte
where id <= 1000    -- limit the results to 1000 random numbers

#12


3  

select round(rand(checksum(newid()))*(10)+20,2)

Here the random number will come in between 20 and 30. round will give two decimal place maximum.

这里的随机数在20到30之间。四舍五入将给出两个十进制的最大值。

If you want negative numbers you can do it with

如果你想要负数,你可以用它

select round(rand(checksum(newid()))*(10)-60,2)

Then the min value will be -60 and max will be -50.

最小值是-60,最大值是-50。

#13


2  

select newid()

选择newid()

or possibly this select binary_checksum(newid())

或者这个select binary_checksum(newid())

#14


2  

The problem I sometimes have with the selected "Answer" is that the distribution isn't always even. If you need a very even distribution of random 1 - 14 among lots of rows, you can do something like this (my database has 511 tables, so this works. If you have less rows than you do random number span, this does not work well):

对于所选的“答案”,我有时遇到的问题是,分布并不总是均匀的。如果您需要在许多行中随机分布1 - 14,那么您可以这样做(我的数据库有511个表,所以这是有效的。如果你的行数比你的随机数跨度要少,这就不能很好地工作):

SELECT table_name, ntile(14) over(order by newId()) randomNumber 
FROM information_schema.tables

This kind of does the opposite of normal random solutions in the sense that it keeps the numbers sequenced and randomizes the other column.

这种方法与常规随机解相反,因为它保持数字的顺序,并随机化另一列。

Remember, I have 511 tables in my database (which is pertinent only b/c we're selecting from the information_schema). If I take the previous query and put it into a temp table #X, and then run this query on the resulting data:

请记住,我的数据库中有511个表(这只与我们从information_schema中选择的b/c有关)。如果我将前面的查询放到一个临时表#X中,然后在结果数据上运行这个查询:

select randomNumber, count(*) ct from #X
group by randomNumber

I get this result, showing me that my random number is VERY evenly distributed among the many rows:

我得到这个结果,告诉我我的随机数在很多行中分布很均匀

如何为TSQL Select中的每一行生成随机数?

#15


1  

select ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) as [Randomizer]

has always worked for me

一直为我工作吗