LINQ to Entities相当于sql“TOP(n)WITH TIES”

时间:2022-09-04 02:14:44

I have been searcing for LINQ equivalent of WITH TIES in sql server lately, I came across a couple things, which couldn't proove to be useful.

我最近在sql server中一直在寻找LINQ等同于WITH TIES,我遇到了一些事情,这些事情无法发挥作用。

I know this question was asked before and has an accepted answer, but it doesn't work the way with ties does. The solution using GroupBy() doesn't result as expected for TOP(3) WITH TIES considering a data set consisting of {3 2 2 1 1 0} the result set will be {3 2 2 1 1} where it should be {3 2 2}

我知道这个问题之前被问过并且有一个已接受的答案,但它并不像领带那样有效。使用GroupBy()的解决方案不会出现TOP(3)WITH TIES的预期结果,考虑到由{3 2 2 1 1 0}组成的数据集,结果集将是{3 2 2 1 1},其应该是{ 3 2 2}

Using the following sample data (taken from this question):

使用以下示例数据(取自此问题):

CREATE TABLE Person
(
    Id int primary key,
    Name nvarchar(50),
    Score float
)    

INSERT INTO Person VALUES (1, 'Tom',8.9)
INSERT INTO Person VALUES (2, 'Jerry',8.9)
INSERT INTO Person VALUES (3, 'Sharti',7)
INSERT INTO Person VALUES (4, 'Mamuzi',9)
INSERT INTO Person VALUES (5, 'Kamala',9)

Traditional OrderByDescending(p => p.Score).Take(3) will result with: Mamuzi, Kamala and one of Tom (or Jerry) where it should include BOTH

传统的OrderByDescending(p => p.Score).Take(3)将导致:Mamuzi,Kamala和Tom(或Jerry)之一应该包括两者

I know there is no built-in equivalent of it and i've found a way to implement it. I don't know if it is the best way to do it and open for alternative solutions.

我知道没有内置的等价物,我找到了实现它的方法。我不知道这是否是最好的方式,并开放替代解决方案。

5 个解决方案

#1


4  

var query = (from q in list.OrderByDescending(s => s.Score).Take(3).Select(s => s.Score).Distinct()
             from i in list
             where q == i.Score
             select i).ToList();

Edit:

@Zefnus

I wasn't sure in which order you wanted it but to change the order you can put a OrderBy(s => s.Score) between select i and ToList()

我不确定你想要它的顺序,但是要更改顺序,你可以在select i和ToList()之间放置一个OrderBy(s => s.Score)

I don't have the possibility to check what sql statement my linq clause would produce. But your answer is much better i think. And your question was also really good. I never thought about top with ties in linq. ;)

我没有办法检查我的linq子句会产生什么sql语句。但我认为你的答案要好得多。你的问题也非常好。我从没想过在linq中有关系。 ;)

Basically it only takes top 3 scores from the first list and compares them with the whole list and i takes only those scores which are equal to the scores of the first list.

基本上它只需要从第一个列表中获得前3个分数,并将它们与整个列表进行比较,并且我只获取那些等于第一个列表的分数的分数。

#2


2  

Do not use IEnumerable<T> with anything touching a database!

A solution aimed at LinqToSql and LinqToEntities should not be using IEnumerable<T>. Your current self answer will result in every single person being selected from the database and then being queried in memory using LinqToObjects.

针对LinqToSql和LinqToEntities的解决方案不应该使用IEnumerable 。您当前的自我答案将导致从数据库中选择每个人,然后使用LinqToObjects在内存中查询。

To make a solution that is translated to SQL and executed by the database you have to use IQueryable<T> and Expressions instead.

要创建一个转换为SQL并由数据库执行的解决方案,您必须使用IQueryable 和Expressions。

public static class QueryableExtensions
{
    public static IQueryable<T> TopWithTies<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (topBy == null) throw new ArgumentNullException("topBy");
        if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));

        var topValues = source.OrderBy(topBy)
                              .Select(topBy)
                              .Take(topCount);

        var queryableMaxMethod = typeof(Queryable).GetMethods()
                                                  .Single(mi => mi.Name == "Max" &&
                                                                mi.GetParameters().Length == 1 &&
                                                                mi.IsGenericMethod)
                                                  .MakeGenericMethod(typeof(TComparand));

        var lessThanOrEqualToMaxTopValue = Expression.Lambda<Func<T, bool>>(
            Expression.LessThanOrEqual(
                topBy.Body,
                Expression.Call(
                    queryableMaxMethod,
                    topValues.Expression)),
            new[] { topBy.Parameters.Single() });

        var topNRowsWithTies = source.Where(lessThanOrEqualToMaxTopValue)
                                     .OrderBy(topBy);
        return topNRowsWithTies;
    }

    public static IQueryable<T> TopWithTiesDescending<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (topBy == null) throw new ArgumentNullException("topBy");
        if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));

        var topValues = source.OrderByDescending(topBy)
                              .Select(topBy)
                              .Take(topCount);

        var queryableMinMethod = typeof(Queryable).GetMethods()
                                                  .Single(mi => mi.Name == "Min" &&
                                                                mi.GetParameters().Length == 1 &&
                                                                mi.IsGenericMethod)
                                                  .MakeGenericMethod(typeof(TComparand));

        var greaterThanOrEqualToMinTopValue = Expression.Lambda<Func<T, bool>>(
            Expression.GreaterThanOrEqual(
                topBy.Body,
                Expression.Call(queryableMinMethod,
                                topValues.Expression)),
            new[] { topBy.Parameters.Single() });

        var topNRowsWithTies = source.Where(greaterThanOrEqualToMinTopValue)
                                     .OrderByDescending(topBy);
        return topNRowsWithTies;
    }
}

This creates queries of the following form:

这将创建以下形式的查询:

SELECT [t0].[Id], [t0].[Name], [t0].[Score]
FROM [Person] AS [t0]
WHERE [t0].[Score] >= ((
    SELECT MIN([t2].[Score])
    FROM (
        SELECT TOP (3) [t1].[Score]
        FROM [Person] AS [t1]
        ORDER BY [t1].[Score] DESC
        ) AS [t2]
    ))
ORDER BY [t0].[Score] DESC

That query is only about 50% worse than the baseline query:

该查询仅比基线查询差50%:

SELECT TOP (3) WITH TIES
    [t0].[Id], 
    [t0].[Name], 
    [t0].[Score]
FROM 
    [Person] AS [t0]
ORDER BY [t0].[Score] desc

With a data set consisting of your original 5 records and an additional 10000 records all with scores less than the original both of these are more or less instant (less than 20 milliseconds).

使用由原始5条记录和另外10000条记录组成的数据集,所有记录的得分均小于原始记录,这些记录或多或少都是即时的(小于20毫秒)。

The IEnumerable<T> approach took a whole 2 minutes!

IEnumerable 方法花了整整2分钟!


If the expression building and reflection seems scary the same thing can be achieved with a join:

如果表达式构建和反射看起来很可怕,那么使用连接可以实现同样的目的:

public static IQueryable<T> TopWithTiesDescendingJoin<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
{
    if (source == null) throw new ArgumentNullException("source");
    if (topBy == null) throw new ArgumentNullException("topBy");
    if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));

    var orderedByValue = source.OrderByDescending(topBy);
    var topNValues = orderedByValue.Select(topBy).Take(topCount).Distinct();
    var topNRowsWithTies = topNValues.Join(source, value => value, topBy, (x, row) => row);
    return topNRowsWithTies.OrderByDescending(topBy);
}

With the following query as the result (with about the same performance):

使用以下查询作为结果(具有大致相同的性能):

SELECT [t3].[Id], [t3].[Name], [t3].[Score]
FROM (
    SELECT DISTINCT [t1].[Score]
    FROM (
        SELECT TOP (3) [t0].[Score]
        FROM [Person] AS [t0]
        ORDER BY [t0].[Score] DESC
        ) AS [t1]
    ) AS [t2]
INNER JOIN [Person] AS [t3] ON [t2].[Score] = [t3].[Score]
ORDER BY [t3].[Score] DESC

#3


1  

Another solution - which probably is not as efficient as the other solution - is to get TOP(3) Scores and get the rows with Score values contained in the TOP(3).

另一个解决方案 - 可能不如其他解决方案那么高效 - 是获得TOP(3)分数并获得TOP(3)中包含得分值的行。

We can use Contains() as follows;

我们可以使用Contains()如下;

orderedPerson = datamodel.People.OrderByDescending(p => p.Score);

topPeopleList =
(
    from p in orderedPerson 
    let topNPersonScores = orderedPerson.Take(n).Select(p => p.Score).Distinct()
    where topNPersonScores.Contains(p.Score)
    select p
).ToList();

What's good about this implementation is that it's extension method TopWithTies() can be implemented easly as;

这个实现有什么好处,它的扩展方法TopWithTies()可以很容易地实现;

public static IEnumerable<T> TopWithTies<T, TResult>(this IEnumerable<T> enumerable, Func<T, TResult> selector, int n)
{
    IEnumerable<T> orderedEnumerable = enumerable.OrderByDescending(selector);

    return
    (
        from p in orderedEnumerable
        let topNValues = orderedEnumerable.Take(n).Select(selector).Distinct()
        where topNValues.Contains(selector(p))
        select p
    );
}

#4


0  

I think that maybe you can do something like:

我想也许你可以这样做:

OrderByDescending(p => p.Score).Skip(2).Take(1)

Count the number of occurrences of this element, and then:

计算此元素的出现次数,然后:

OrderByDescending(p => p.Score).Take(2 + "The select with the number of occurrences for the third element")

I think that maybe this works ;) It´s only an idea!

我认为这可能有效;)这只是一个想法!

#5


0  

I've found a solution taking the Score field value of the Nth row (3rd row in this case) using .Skip(n-1).Take(1) and selecting all rows with score value greater or equal to that as follows:

我找到了一个解决方案,使用.Skip(n-1)获取第N行的得分字段值(本例中为第3行)。选择(1)并选择得分值大于或等于所有的行,如下所示:

qryPeopleOrderedByScore = datamodel.People.OrderByDescending(p => p.Score);

topPeopleList =
(
    from p in qryPeopleOrderedByScore
    let lastPersonInList = qryPeopleOrderedByScore.Skip(2).Take(1).FirstOrDefault()
    where lastPersonInList == null || p.Score >= lastPersonInList.Score
    select p
).ToList();

#1


4  

var query = (from q in list.OrderByDescending(s => s.Score).Take(3).Select(s => s.Score).Distinct()
             from i in list
             where q == i.Score
             select i).ToList();

Edit:

@Zefnus

I wasn't sure in which order you wanted it but to change the order you can put a OrderBy(s => s.Score) between select i and ToList()

我不确定你想要它的顺序,但是要更改顺序,你可以在select i和ToList()之间放置一个OrderBy(s => s.Score)

I don't have the possibility to check what sql statement my linq clause would produce. But your answer is much better i think. And your question was also really good. I never thought about top with ties in linq. ;)

我没有办法检查我的linq子句会产生什么sql语句。但我认为你的答案要好得多。你的问题也非常好。我从没想过在linq中有关系。 ;)

Basically it only takes top 3 scores from the first list and compares them with the whole list and i takes only those scores which are equal to the scores of the first list.

基本上它只需要从第一个列表中获得前3个分数,并将它们与整个列表进行比较,并且我只获取那些等于第一个列表的分数的分数。

#2


2  

Do not use IEnumerable<T> with anything touching a database!

A solution aimed at LinqToSql and LinqToEntities should not be using IEnumerable<T>. Your current self answer will result in every single person being selected from the database and then being queried in memory using LinqToObjects.

针对LinqToSql和LinqToEntities的解决方案不应该使用IEnumerable 。您当前的自我答案将导致从数据库中选择每个人,然后使用LinqToObjects在内存中查询。

To make a solution that is translated to SQL and executed by the database you have to use IQueryable<T> and Expressions instead.

要创建一个转换为SQL并由数据库执行的解决方案,您必须使用IQueryable 和Expressions。

public static class QueryableExtensions
{
    public static IQueryable<T> TopWithTies<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (topBy == null) throw new ArgumentNullException("topBy");
        if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));

        var topValues = source.OrderBy(topBy)
                              .Select(topBy)
                              .Take(topCount);

        var queryableMaxMethod = typeof(Queryable).GetMethods()
                                                  .Single(mi => mi.Name == "Max" &&
                                                                mi.GetParameters().Length == 1 &&
                                                                mi.IsGenericMethod)
                                                  .MakeGenericMethod(typeof(TComparand));

        var lessThanOrEqualToMaxTopValue = Expression.Lambda<Func<T, bool>>(
            Expression.LessThanOrEqual(
                topBy.Body,
                Expression.Call(
                    queryableMaxMethod,
                    topValues.Expression)),
            new[] { topBy.Parameters.Single() });

        var topNRowsWithTies = source.Where(lessThanOrEqualToMaxTopValue)
                                     .OrderBy(topBy);
        return topNRowsWithTies;
    }

    public static IQueryable<T> TopWithTiesDescending<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (topBy == null) throw new ArgumentNullException("topBy");
        if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));

        var topValues = source.OrderByDescending(topBy)
                              .Select(topBy)
                              .Take(topCount);

        var queryableMinMethod = typeof(Queryable).GetMethods()
                                                  .Single(mi => mi.Name == "Min" &&
                                                                mi.GetParameters().Length == 1 &&
                                                                mi.IsGenericMethod)
                                                  .MakeGenericMethod(typeof(TComparand));

        var greaterThanOrEqualToMinTopValue = Expression.Lambda<Func<T, bool>>(
            Expression.GreaterThanOrEqual(
                topBy.Body,
                Expression.Call(queryableMinMethod,
                                topValues.Expression)),
            new[] { topBy.Parameters.Single() });

        var topNRowsWithTies = source.Where(greaterThanOrEqualToMinTopValue)
                                     .OrderByDescending(topBy);
        return topNRowsWithTies;
    }
}

This creates queries of the following form:

这将创建以下形式的查询:

SELECT [t0].[Id], [t0].[Name], [t0].[Score]
FROM [Person] AS [t0]
WHERE [t0].[Score] >= ((
    SELECT MIN([t2].[Score])
    FROM (
        SELECT TOP (3) [t1].[Score]
        FROM [Person] AS [t1]
        ORDER BY [t1].[Score] DESC
        ) AS [t2]
    ))
ORDER BY [t0].[Score] DESC

That query is only about 50% worse than the baseline query:

该查询仅比基线查询差50%:

SELECT TOP (3) WITH TIES
    [t0].[Id], 
    [t0].[Name], 
    [t0].[Score]
FROM 
    [Person] AS [t0]
ORDER BY [t0].[Score] desc

With a data set consisting of your original 5 records and an additional 10000 records all with scores less than the original both of these are more or less instant (less than 20 milliseconds).

使用由原始5条记录和另外10000条记录组成的数据集,所有记录的得分均小于原始记录,这些记录或多或少都是即时的(小于20毫秒)。

The IEnumerable<T> approach took a whole 2 minutes!

IEnumerable 方法花了整整2分钟!


If the expression building and reflection seems scary the same thing can be achieved with a join:

如果表达式构建和反射看起来很可怕,那么使用连接可以实现同样的目的:

public static IQueryable<T> TopWithTiesDescendingJoin<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
{
    if (source == null) throw new ArgumentNullException("source");
    if (topBy == null) throw new ArgumentNullException("topBy");
    if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));

    var orderedByValue = source.OrderByDescending(topBy);
    var topNValues = orderedByValue.Select(topBy).Take(topCount).Distinct();
    var topNRowsWithTies = topNValues.Join(source, value => value, topBy, (x, row) => row);
    return topNRowsWithTies.OrderByDescending(topBy);
}

With the following query as the result (with about the same performance):

使用以下查询作为结果(具有大致相同的性能):

SELECT [t3].[Id], [t3].[Name], [t3].[Score]
FROM (
    SELECT DISTINCT [t1].[Score]
    FROM (
        SELECT TOP (3) [t0].[Score]
        FROM [Person] AS [t0]
        ORDER BY [t0].[Score] DESC
        ) AS [t1]
    ) AS [t2]
INNER JOIN [Person] AS [t3] ON [t2].[Score] = [t3].[Score]
ORDER BY [t3].[Score] DESC

#3


1  

Another solution - which probably is not as efficient as the other solution - is to get TOP(3) Scores and get the rows with Score values contained in the TOP(3).

另一个解决方案 - 可能不如其他解决方案那么高效 - 是获得TOP(3)分数并获得TOP(3)中包含得分值的行。

We can use Contains() as follows;

我们可以使用Contains()如下;

orderedPerson = datamodel.People.OrderByDescending(p => p.Score);

topPeopleList =
(
    from p in orderedPerson 
    let topNPersonScores = orderedPerson.Take(n).Select(p => p.Score).Distinct()
    where topNPersonScores.Contains(p.Score)
    select p
).ToList();

What's good about this implementation is that it's extension method TopWithTies() can be implemented easly as;

这个实现有什么好处,它的扩展方法TopWithTies()可以很容易地实现;

public static IEnumerable<T> TopWithTies<T, TResult>(this IEnumerable<T> enumerable, Func<T, TResult> selector, int n)
{
    IEnumerable<T> orderedEnumerable = enumerable.OrderByDescending(selector);

    return
    (
        from p in orderedEnumerable
        let topNValues = orderedEnumerable.Take(n).Select(selector).Distinct()
        where topNValues.Contains(selector(p))
        select p
    );
}

#4


0  

I think that maybe you can do something like:

我想也许你可以这样做:

OrderByDescending(p => p.Score).Skip(2).Take(1)

Count the number of occurrences of this element, and then:

计算此元素的出现次数,然后:

OrderByDescending(p => p.Score).Take(2 + "The select with the number of occurrences for the third element")

I think that maybe this works ;) It´s only an idea!

我认为这可能有效;)这只是一个想法!

#5


0  

I've found a solution taking the Score field value of the Nth row (3rd row in this case) using .Skip(n-1).Take(1) and selecting all rows with score value greater or equal to that as follows:

我找到了一个解决方案,使用.Skip(n-1)获取第N行的得分字段值(本例中为第3行)。选择(1)并选择得分值大于或等于所有的行,如下所示:

qryPeopleOrderedByScore = datamodel.People.OrderByDescending(p => p.Score);

topPeopleList =
(
    from p in qryPeopleOrderedByScore
    let lastPersonInList = qryPeopleOrderedByScore.Skip(2).Take(1).FirstOrDefault()
    where lastPersonInList == null || p.Score >= lastPersonInList.Score
    select p
).ToList();