What construct can I use instead of Contains?

Time: 2022-09-05 13:46:17

I have a list with ids:

var myList = new List<int>();

I want to select all objects from db with ids from myList:

var objList= myContext.MyObjects.Where(t => myList.Contains(t.Id)).ToList();

But when myList.Count > 8000, I get an error:

The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

I think it's because I used Contains(). What can I use instead of Contains?

5 answers

#1


4  

You could split the list into several sub-lists and run a separate query for each:

int start = 0;
int count = 0;
const int chunk_size = 1000;
do {
    count = Math.Min(chunk_size, myList.Count - start);
    var tmpList = myList.GetRange(start, count);
    // run query with tmpList
    var objList= myContext.MyObjects.Where(t => tmpList.Contains(t.Id)).ToList();
    // do something with results...
    start += count;
} while (start < myList.Count);

Of course, you need to find a chunk size that works for you, for example by testing a few values against your own data. Depending on the size of the table and of the list, it might be more convenient to load the entire table and filter it in code, as suggested in other answers.
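
The loop above can also be wrapped in a small reusable helper. What follows is only a sketch: ChunkHelper and Split are hypothetical names, not part of Entity Framework, and the in-memory table sequence merely stands in for myContext.MyObjects so that the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkHelper
{
    // Hypothetical helper: yields consecutive sub-lists of at most chunkSize items.
    public static IEnumerable<List<T>> Split<T>(List<T> source, int chunkSize)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (int start = 0; start < source.Count; start += chunkSize)
        {
            int count = Math.Min(chunkSize, source.Count - start);
            yield return source.GetRange(start, count);
        }
    }
}

public static class ChunkDemo
{
    public static List<int> Run()
    {
        var myList = Enumerable.Range(1, 2500).ToList();

        // Assumption: this sequence stands in for myContext.MyObjects.
        var table = Enumerable.Range(1, 10000);

        var result = new List<int>();
        foreach (var chunk in ChunkHelper.Split(myList, 1000))
        {
            // With Entity Framework, this would be one database query per chunk.
            result.AddRange(table.Where(id => chunk.Contains(id)));
        }
        return result;
    }
}
```

Against a real context, the body of the foreach would be the query from the answer, myContext.MyObjects.Where(t => chunk.Contains(t.Id)).ToList(), with the results appended to one list. On .NET 6 or later, the built-in Enumerable.Chunk could replace the helper.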

#2


14  

You can perform the query on the client side by adding AsEnumerable() to "hide" the Where clause from Entity Framework:

var objList = myContext
  .MyObjects
  .AsEnumerable()
  .Where(t => myList.Contains(t.Id))
  .ToList();

To improve performance you can replace the list with a HashSet:

var myHashSet = new HashSet<int>(myList);

and then modify the predicate in Where accordingly:

  .Where(t => myHashSet.Contains(t.Id))

This is the "easy" solution in terms of time to implement. However, because the filter runs on the client, every MyObjects row is pulled from the database before it is filtered, so performance may be poor on a large table.

The reason you get the error is that Entity Framework converts your query into something like this:

SELECT ...
FROM ...
WHERE column IN (ID1, ID2, ... , ID8000)

So basically all 8000 IDs from the list are included in the generated SQL, which exceeds what SQL Server can handle.

To generate this SQL, Entity Framework "looks for" ICollection<T>, which both List<T> and HashSet<T> implement, so if you keep the query on the server side you gain nothing by switching to HashSet<T>. On the client side the story is different: Contains is O(1) for HashSet<T> and O(N) for List<T>.
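
To make the client-side difference concrete, here is a minimal in-memory sketch. The MyObject class, the HashSetFilterDemo name and the sample data are made up for illustration; the rows parameter stands in for rows already pulled from the database with AsEnumerable(), and no Entity Framework is involved.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with just the Id property used in the question.
public class MyObject
{
    public int Id { get; set; }
}

public static class HashSetFilterDemo
{
    public static List<MyObject> Filter(IEnumerable<MyObject> rows, IEnumerable<int> ids)
    {
        // Build the set once; duplicates in ids are collapsed.
        var idSet = new HashSet<int>(ids);

        // Each Contains call is a hash lookup (O(1) on average),
        // whereas List<int>.Contains scans the list (O(N)) for every row.
        return rows.Where(r => idSet.Contains(r.Id)).ToList();
    }
}
```

With N ids and M rows, the list version does on the order of N × M comparisons, while the set version does roughly N + M operations. That only helps once the rows are on the client; it does not reduce the amount of data transferred.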

#3


8  

If you want this to perform well, I'd suggest you use table-valued parameters and a stored procedure.

In your database, using T-SQL:

CREATE TYPE [dbo].[IdSet] AS TABLE
(
    [Id] INT
);
GO

CREATE PROCEDURE [dbo].[Get<table>]
    @ids [dbo].[IdSet] READONLY
AS
    SET NOCOUNT ON;

    SELECT
                <Column List>
        FROM
                [dbo].[<table>] [T]
        WHERE
                [T].[Id] IN (SELECT [Id] FROM @ids);
RETURN 0;
GO

Then, in C#:

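// Requires: using System.Data; using System.Data.SqlClient;

// Build a DataTable whose single column matches the [dbo].[IdSet] table type.
var ids = new DataTable();
ids.Columns.Add("Id", typeof(int));

foreach (var id in myList)
{
    ids.Rows.Add(id);
}

// Pass the table to the procedure as a structured (table-valued) parameter.
// Assumes myContext is an EF6 DbContext, where SqlQuery lives on Database.
// <entity> and <table> are placeholders for your own entity and table names.
var objList = myContext.Database.SqlQuery<<entity>>(
    "[dbo].[Get<table>] @ids",
    new SqlParameter("@ids", SqlDbType.Structured)
        {
            Value = ids,
            TypeName = "[dbo].[IdSet]"
        });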

#4


5  

You could create a temporary database table that holds the ids from myList and rewrite your query as a JOIN against that temporary table.

The reason for the error is that the actual query produced contains all elements of myList.

Basically, the DB (the query processor) needs to see both lists to do the filtering. If the second list is too large to fit inside the query text, you have to provide it some other way (for example, as a temp table).

#5


0  

Why not try:

var objList= from obj in myContext.MyObjects
     join myId in myList on obj.Id equals myId
     select obj;
