Getting random rows from a SQL Server table

Date: 2021-11-08 01:36:18

I am trying to get 5 random rows from a large table (over 1 million rows) using a fast method.

So far I have tested these SQL queries:

Method 1

Select top 5 customer_id, customer_name 
from Customer TABLESAMPLE(1000 rows) 
order by newid()

This method's estimated I/O cost is 0.0127546 (nonclustered index scan), so it is very fast.

Method 2

select top 5 customer_id, customer_name 
from Customer 
order by newid()

This method's estimated I/O cost is 117.21189 for the sort and 2.8735 for the nonclustered index scan, so the sort hurts performance.

Method 3

select top 5 customer_id, customer_name 
from Customer 
order by rand(checksum(*))

This method's estimated I/O cost is 117.212 for the sort and 213.149 for the nonclustered index scan. It is the slowest of the three queries, with an estimated subtree cost of 213.228.

UPDATE:

Method 4

select top 5 c.customer_id, c.customer_name, p.product_id
from Customer c
join Product p on p.product_id = c.product_id  -- assumes Customer carries a product_id column
where (c.customer_active = 'TRUE')
order by checksum(newid())

This approach is better and very fast. All the benchmark testing is fine.

QUESTION

How can I convert Method 4 to LINQ-to-SQL? Thanks

2 solutions

#1


2  

If you want to convert Method 2 to LINQ to Entities, use the solution posted by jitender, which looks like this:

var randomCustomers = context.Customers.OrderBy(x => Guid.NewGuid()).Take(5);

For Method 1, which is very fast according to your benchmark, you need the following C# code, because LINQ to Entities has no equivalent for the SQL clause TABLESAMPLE(1000 rows).

var randomCustomers = context.Customers.SqlQuery("Select TOP 5 customer_id, customer_name from Customer TABLESAMPLE(1000 rows) order by newid()").ToList();

You can move the SQL statement into a SQL view or a stored procedure; a stored procedure can also receive the number of customers to take as a parameter.
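As a rough sketch of the stored-procedure option for Method 1, something like the following could work. The procedure name `dbo.GetRandomCustomers` and the parameter `@Count` are made up for illustration; the table and column names are taken from the question. As far as I know, SQL Server does not allow TABLESAMPLE inside a view definition, so the view route would only fit the variants that do not use it.

```sql
-- Hypothetical procedure name and parameter; adjust to your schema.
CREATE PROCEDURE dbo.GetRandomCustomers
    @Count INT = 5          -- number of random customers to return
AS
BEGIN
    SET NOCOUNT ON;

    -- TABLESAMPLE picks whole data pages, so the 1000 rows are approximate
    -- and the sample may occasionally hold fewer than @Count rows.
    SELECT TOP (@Count) customer_id, customer_name
    FROM Customer TABLESAMPLE (1000 ROWS)
    ORDER BY NEWID();
END
GO

-- Example call
EXEC dbo.GetRandomCustomers @Count = 5;
```

Because the sample is page-based, the caller should be prepared to receive fewer rows than requested and retry if that matters.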

UPDATE

For Method 4, which seems to be very fast (again, going by your benchmark), you can use the following LINQ to Entities query:

var randomCustomers = context.Customers.OrderBy(c => SqlFunctions.Checksum(Guid.NewGuid())).Take(5);

Entity Framework can translate to SQL all the functions defined in the SqlFunctions class. One of them is Checksum, which does what you want.

If you want to join with other tables, you can do so without difficulty in LINQ to Entities; I simplified my version by querying only the Customers DbSet.

#2


0  

As stated, here's the best way:

var randomCustomers = Customers.OrderBy(x => Guid.NewGuid()).Take(5);
