Which method performs better: .Any() vs .Count() > 0?

In the System.Linq namespace, we can now extend our IEnumerables with the Any() and Count() extension methods.

I was told recently that if I want to check whether a collection contains one or more items, I should use the .Any() extension method instead of .Count() > 0, because the .Count() extension method has to iterate through all the items.

Secondly, some collections have a property (not an extension method) that is Count or Length. Would it be better to use those, instead of .Any() or .Count() ?

Yea / nay?

8 Answers

#1

588

If you are starting with something that has a .Length or .Count (such as ICollection<T>, IList<T>, List<T>, etc) - then this will be the fastest option, since it doesn't need to go through the GetEnumerator()/MoveNext()/Dispose() sequence required by Any() to check for a non-empty IEnumerable<T> sequence.

For just IEnumerable<T>, then Any() will generally be quicker, as it only has to look at one iteration. However, note that the LINQ-to-Objects implementation of Count() does check for ICollection<T> (using .Count as an optimisation) - so if your underlying data-source is directly a list/collection, there won't be a huge difference. Don't ask me why it doesn't use the non-generic ICollection...

Of course, if you have used LINQ to filter it etc (Where etc), you will have an iterator-block based sequence, and so this ICollection<T> optimisation is useless.

In general with IEnumerable<T> : stick with Any() ;-p
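
To make the difference concrete, here is a minimal sketch for LINQ to Objects only. The `Probe` helper and its `Pulled` counter are hypothetical names introduced for illustration; they count how many elements each call pulls from a lazy iterator-block sequence, which is the case where the ICollection<T> shortcut cannot apply.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Probe
{
    // Number of elements the consumer has pulled from Numbers() so far.
    public static int Pulled;

    // Lazy iterator block: it does not implement ICollection<T>,
    // so Count() cannot take the .Count shortcut and must enumerate.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Runs 'call' against a fresh sequence of n items and reports how many were pulled.
    public static int PulledBy(Func<IEnumerable<int>, bool> call, int n)
    {
        Pulled = 0;
        call(Numbers(n));
        return Pulled;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(Probe.PulledBy(s => s.Any(), 1000));        // prints 1
        Console.WriteLine(Probe.PulledBy(s => s.Count() > 0, 1000));  // prints 1000
    }
}
```

Any() stops after the first MoveNext(), while Count() walks the whole sequence before the comparison with 0 can be made.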

#2

Note: I wrote this answer when Entity Framework 4 was current. The point of this answer was not to get into trivial .Any() vs .Count() performance testing. The point was to signal that EF is far from perfect. Newer versions are better, but if part of your code is slow and uses EF, test with direct T-SQL and compare performance rather than relying on assumptions (such as .Any() ALWAYS being faster than .Count() > 0).

While I agree with the most up-voted answer and comments, especially on the point that Any signals developer intent better than Count() > 0, I've had a situation in which Count was faster by an order of magnitude on SQL Server (Entity Framework 4).

Here is the query with Any that threw a timeout exception (on ~200,000 records):

con = db.Contacts.
    Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
        && !a.NewsletterLogs.Any(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr)
    ).OrderBy(a => a.ContactId).
    Skip(position - 1).
    Take(1).FirstOrDefault();

The Count version executed in a matter of milliseconds:

con = db.Contacts.
    Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
        && a.NewsletterLogs.Count(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr) == 0
    ).OrderBy(a => a.ContactId).
    Skip(position - 1).
    Take(1).FirstOrDefault();

I need to find a way to see the exact SQL that both LINQ queries produce, but it's obvious there is a huge performance difference between Count and Any in some cases, and unfortunately it seems you can't just stick with Any in all cases.

EDIT: Here is the generated SQL. Beauties, as you can see ;)

ANY:

exec sp_executesql N'SELECT TOP (1) 
[Project2].[ContactId] AS [ContactId], 
[Project2].[CompanyId] AS [CompanyId], 
[Project2].[ContactName] AS [ContactName], 
[Project2].[FullName] AS [FullName], 
[Project2].[ContactStatusId] AS [ContactStatusId], 
[Project2].[Created] AS [Created]
FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
    FROM ( SELECT 
        [Extent1].[ContactId] AS [ContactId], 
        [Extent1].[CompanyId] AS [CompanyId], 
        [Extent1].[ContactName] AS [ContactName], 
        [Extent1].[FullName] AS [FullName], 
        [Extent1].[ContactStatusId] AS [ContactStatusId], 
        [Extent1].[Created] AS [Created]
        FROM [dbo].[Contact] AS [Extent1]
        WHERE ([Extent1].[CompanyId] = @p__linq__0) AND ([Extent1].[ContactStatusId] <= 3) AND ( NOT EXISTS (SELECT 
            1 AS [C1]
            FROM [dbo].[NewsletterLog] AS [Extent2]
            WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])
        ))
    )  AS [Project2]
)  AS [Project2]
WHERE [Project2].[row_number] > 99
ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

COUNT:

exec sp_executesql N'SELECT TOP (1) 
[Project2].[ContactId] AS [ContactId], 
[Project2].[CompanyId] AS [CompanyId], 
[Project2].[ContactName] AS [ContactName], 
[Project2].[FullName] AS [FullName], 
[Project2].[ContactStatusId] AS [ContactStatusId], 
[Project2].[Created] AS [Created]
FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
    FROM ( SELECT 
        [Project1].[ContactId] AS [ContactId], 
        [Project1].[CompanyId] AS [CompanyId], 
        [Project1].[ContactName] AS [ContactName], 
        [Project1].[FullName] AS [FullName], 
        [Project1].[ContactStatusId] AS [ContactStatusId], 
        [Project1].[Created] AS [Created]
        FROM ( SELECT 
            [Extent1].[ContactId] AS [ContactId], 
            [Extent1].[CompanyId] AS [CompanyId], 
            [Extent1].[ContactName] AS [ContactName], 
            [Extent1].[FullName] AS [FullName], 
            [Extent1].[ContactStatusId] AS [ContactStatusId], 
            [Extent1].[Created] AS [Created], 
            (SELECT 
                COUNT(1) AS [A1]
                FROM [dbo].[NewsletterLog] AS [Extent2]
                WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])) AS [C1]
            FROM [dbo].[Contact] AS [Extent1]
        )  AS [Project1]
        WHERE ([Project1].[CompanyId] = @p__linq__0) AND ([Project1].[ContactStatusId] <= 3) AND (0 = [Project1].[C1])
    )  AS [Project2]
)  AS [Project2]
WHERE [Project2].[row_number] > 99
ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

It seems that a plain Where with EXISTS performs much worse than calculating the Count and then filtering with Where on Count == 0.

Let me know if you see an error in my findings. Regardless of the Any vs Count discussion, what can be taken from all this is that any more complex LINQ query is way better off when rewritten as a stored procedure ;).

#3

Since this is a rather popular topic and the answers differ, I had to take a fresh look at the problem.

Testing env: EF 6.1.3, SQL Server, 300k records

Table model:

using System.ComponentModel.DataAnnotations;

class TestTable
{
    [Key]
    public int Id { get; set; }

    public string Name { get; set; }

    public string Surname { get; set; }
}

Test code:

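using System;
using System.Linq;

// TestContext (not shown in the post) is assumed to be the DbContext exposing TestTables.
class Program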
{
    static void Main()
    {
        using (var context = new TestContext())
        {
            context.Database.Log = Console.WriteLine;

            context.TestTables.Where(x => x.Surname.Contains("Surname")).Any(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Any(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname")).Count(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Count(x => x.Id > 1000);

            Console.ReadLine();
        }
    }
}

Results:

Any() ~ 3ms

Count() ~ 230ms for first query, ~ 400ms for second

Remarks:

In my case, EF didn't generate SQL like the one @Ben mentioned in his post.

#4

EDIT: This was fixed in EF version 6.1.1, and this answer is no longer current.

For SQL Server and EF 4-6, Count() performs about twice as fast as Any().

When you run Table.Any(), it will generate something like this (warning: don't hurt your brain trying to understand it):

SELECT 
CASE WHEN ( EXISTS (SELECT 
    1 AS [C1]
    FROM [Table] AS [Extent1]
)) THEN cast(1 as bit) WHEN ( NOT EXISTS (SELECT 
    1 AS [C1]
    FROM [Table] AS [Extent2]
)) THEN cast(0 as bit) END AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]

That requires two scans of the rows matching your condition.

I don't like to write Count() > 0 because it hides my intention. I prefer to use a custom extension method that takes a predicate for this:

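using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryExtensions
{
    // Reads like an existence check at the call site, but runs Count() underneath.
    public static bool Exists<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
    {
        return source.Count(predicate) > 0;
    }
}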
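
As a usage sketch, the extension method above can be exercised without a database by wrapping an in-memory array with AsQueryable(). The sample data is an assumption for illustration only; with a real EF context, the IQueryable would come from a DbSet and the Count(predicate) call would presumably run on the server.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryExtensions
{
    // Same shape as the method above: expresses "exists" while running Count under the hood.
    public static bool Exists<TSource>(this IQueryable<TSource> source,
        Expression<Func<TSource, bool>> predicate)
    {
        return source.Count(predicate) > 0;
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for a table; a real IQueryable would come from a DbContext.
        IQueryable<string> surnames = new[] { "Smith", "Jones", "Brown" }.AsQueryable();

        Console.WriteLine(surnames.Exists(s => s.StartsWith("J")));  // prints True
        Console.WriteLine(surnames.Exists(s => s.StartsWith("X")));  // prints False
    }
}
```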

#5

Well, the .Count() extension method only uses the .Count property when the source itself implements ICollection<T>. I would assume you wouldn't call .Count() on a simple collection anyway, but rather at the end of a LINQ statement with filtering criteria, etc.

In that context, .Any() will be faster than .Count() > 0.

#6

It depends: how big is the data set, and what are your performance requirements?

If it's nothing gigantic, use the most readable form, which for me is Any(), because it's shorter and reads better than a comparison.

#7

You can make a simple test to figure this out:

// Requires: using System.Diagnostics; using System.Linq;
var query = /* make any query here */;

// Time the Count() > 0 check.
var timeCount = Stopwatch.StartNew();
if (query.Count() > 0)
{
}
timeCount.Stop();
var testCount = timeCount.Elapsed;

// Time the Any() check.
var timeAny = Stopwatch.StartNew();
if (query.Any())
{
}
timeAny.Stop();
var testAny = timeAny.Elapsed;

Check the values of testCount and testAny.
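
A self-contained variant of this test, as a rough sketch: the query here (a filtered Enumerable.Range over LINQ to Objects) and the `Measure` helper are assumptions chosen for illustration, and a single Stopwatch reading is only indicative. Substitute your own query and repeat the measurement before drawing conclusions.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class Timing
{
    // Runs 'check' once and returns its result together with the elapsed time.
    public static (bool Result, TimeSpan Elapsed) Measure(Func<bool> check)
    {
        var watch = Stopwatch.StartNew();
        bool result = check();
        watch.Stop();
        return (result, watch.Elapsed);
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical query: a lazily filtered sequence of ten million integers.
        var query = Enumerable.Range(0, 10_000_000).Where(x => x % 2 == 0);

        var testCount = Timing.Measure(() => query.Count() > 0);
        var testAny = Timing.Measure(() => query.Any());

        Console.WriteLine($"Count() > 0: {testCount.Result} in {testCount.Elapsed}");
        Console.WriteLine($"Any():       {testAny.Result} in {testAny.Elapsed}");
    }
}
```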

#8

Regarding the Count() method: if the IEnumerable is an ICollection, we don't need to iterate over all the items, because we can read the Count property of the ICollection. If the IEnumerable is not an ICollection, we must iterate over all the items using a while loop with MoveNext. Take a look at the .NET Framework code:

public static int Count<TSource>(this IEnumerable<TSource> source)
{
    if (source == null) 
        throw Error.ArgumentNull("source");

    ICollection<TSource> collectionoft = source as ICollection<TSource>;
    if (collectionoft != null) 
        return collectionoft.Count;

    ICollection collection = source as ICollection;
    if (collection != null) 
        return collection.Count;

    int count = 0;
    using (IEnumerator<TSource> e = source.GetEnumerator())
    {
        checked
        {
            while (e.MoveNext()) count++;
        }
    }
    return count;
}

Reference: Reference Source Enumerable
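
For comparison, the parameterless Any() can stop after a single MoveNext(). The following is a simplified sketch of that idea, not the framework source; `AnySketch` and `EnumerableSketch` are hypothetical names used to avoid clashing with Enumerable.Any.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableSketch
{
    // Returns true as soon as the first element is found; never walks the rest.
    public static bool AnySketch<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        using (IEnumerator<TSource> e = source.GetEnumerator())
        {
            return e.MoveNext();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new[] { 1, 2, 3 }.AnySketch());  // prints True
        Console.WriteLine(new int[0].AnySketch());         // prints False
    }
}
```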
