I am trying to get 5 random rows from a large table (over 1 million rows) using a fast method.
So far I have tested these SQL queries:
Method 1
Select top 5 customer_id, customer_name
from Customer TABLESAMPLE(1000 rows)
order by newid()
The estimated I/O cost of this method is 0.0127546, so it is very fast (nonclustered index scan).
Method 2
select top 5 customer_id, customer_name
from Customer
order by newid()
For this method, the estimated I/O cost of the sort is 117.21189 and that of the nonclustered index scan is 2.8735, so the sort hurts performance.
Method 3
select top 5 customer_id, customer_name
from Customer
order by rand(checksum(*))
For this method, the estimated I/O cost of the sort is 117.212 and that of the nonclustered index scan is 213.149. This query is the slowest of the three, with an estimated subtree cost of 213.228.
UPDATE:
Method 4
select top 5 customer_id, customer_name, product_id
from Customer
Join Product on product_id = product_id
where (customer_active = 'TRUE')
order by checksum(newid())
This approach is better and very fast. All the benchmark tests came out fine.
QUESTION
How can I convert Method 4 to LINQ-to-SQL? Thanks
2 Answers
#1
2
If you want to convert Method 2 to LINQ to Entities, just use the solution given by jitender, which looks like this:
var randomCoustmers = context.Customers.OrderBy(x => Guid.NewGuid()).Take(5);
For Method 1, which is very fast according to your benchmark, you need the following C# code, because LINQ to Entities has no equivalent for the SQL clause TABLESAMPLE(1000 rows).
var randomCoustmers = context.Customers.SqlQuery("Select TOP 5 customer_id, customer_name from Customer TABLESAMPLE(1000 rows) order by newid()").ToList();
You can also move the SQL statement into a view or a stored procedure; a stored procedure can receive the number of customers to take as a parameter.
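As a rough sketch of the parameterized variant, the raw query can also be run through `Database.SqlQuery<T>` with the row count passed as a parameter. The `CustomerRow` DTO, the `RandomCustomerQueries` helper and the `int` type of `customer_id` are assumptions made for illustration; they are not part of the original question.

```csharp
using System.Collections.Generic;
using System.Data.Entity;      // EF6 DbContext
using System.Data.SqlClient;   // SqlParameter
using System.Linq;

// Hypothetical DTO: property names must match the columns returned by the query.
public class CustomerRow
{
    public int customer_id { get; set; }      // column type assumed to be int
    public string customer_name { get; set; }
}

public static class RandomCustomerQueries
{
    // Returns up to `count` random customers using Method 1 (TABLESAMPLE).
    public static List<CustomerRow> TakeRandom(DbContext context, int count)
    {
        const string sql =
            "SELECT TOP (@count) customer_id, customer_name " +
            "FROM Customer TABLESAMPLE (1000 ROWS) " +
            "ORDER BY NEWID()";

        return context.Database
            .SqlQuery<CustomerRow>(sql, new SqlParameter("@count", count))
            .ToList();
    }
}
```

Note that TABLESAMPLE works on whole pages and the row count is only approximate, so the sample can occasionally contain fewer rows than requested; the caller should be prepared for a list shorter than `count`.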
UPDATE
For Method 4, which seems to be very fast (again, according to your benchmark), you can use the following LINQ to Entities query:
var randomCoustmers = context.Customers.OrderBy(c => SqlFunctions.Checksum(Guid.NewGuid())).Take(5);
Entity Framework can translate the functions defined in the SqlFunctions class into SQL. Among them is the Checksum function, which does what you want.
If you want to join with other tables, you can do that without difficulty in LINQ to Entities; I just simplified my version by querying only the Customers DbSet.
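For completeness, here is a minimal sketch of what the full Method 4 query (join plus filter) might look like in LINQ to Entities under EF6. The entity classes, the `ShopContext` name, the property names and their types are all hypothetical, chosen only to mirror the columns in the question; the real model will differ.

```csharp
using System;
using System.Data.Entity;            // EF6 DbContext, DbSet
using System.Data.Entity.SqlServer;  // SqlFunctions (EF6 namespace)
using System.Linq;

// Hypothetical entities and context; names and types are assumptions.
public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
    public string Active { get; set; }   // compared with 'TRUE' in Method 4
    public int ProductId { get; set; }
}

public class Product
{
    public int ProductId { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Product> Products { get; set; }
}

public static class Method4Query
{
    public static void PrintFiveRandom(ShopContext context)
    {
        var randomCustomers = (
            from c in context.Customers
            join p in context.Products on c.ProductId equals p.ProductId
            where c.Active == "TRUE"
            // Intended to translate to ORDER BY CHECKSUM(NEWID())
            orderby SqlFunctions.Checksum(Guid.NewGuid())
            select new { c.CustomerId, c.Name, p.ProductId }
        ).Take(5).ToList();

        foreach (var row in randomCustomers)
            Console.WriteLine("{0} {1} {2}", row.CustomerId, row.Name, row.ProductId);
    }
}
```

It is worth checking the generated SQL (for example via `context.Database.Log`) to confirm that the ordering expression is translated as expected for your EF version.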
#2
0
As stated here, this is the best way:
var randomCoustmers = Customers.OrderBy(x => Guid.NewGuid()).Take(5);