LINQ Except operator and object equality

Time: 2022-12-12 22:59:48

Here is an interesting issue I noticed when using the Except operator: I have a list of users from which I want to exclude some users.

The list of users comes from an XML file (Users.xml in the code below).

The code goes like this:


using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

interface IUser
{
     int ID { get; set; }
     string Name { get; set; }
}

class User: IUser
{

    #region IUser Members

    public int ID
    {
        get;
        set;
    }

    public string Name
    {
        get;
        set;
    }

    #endregion

    public override string ToString()
    {
        return ID + ":" + Name;
    }


    public static IEnumerable<IUser> GetMatchingUsers(IEnumerable<IUser> users)
    {
         IEnumerable<IUser> localList = new List<User>
         {
            new User{ ID=4, Name="James"},
            new User{ ID=5, Name="Tom"}

         }.OfType<IUser>();
         var matches = from u in users
                       join lu in localList
                           on u.ID equals lu.ID
                       select u;
         return matches;
    }
}

class Program
{
    static void Main(string[] args)
    {
        XDocument doc = XDocument.Load("Users.xml");
        IEnumerable<IUser> users = doc.Element("Users").Elements("User").Select
            (u => new User
                { ID = (int)u.Attribute("id"),
                  Name = (string)u.Attribute("name")
                }
            ).OfType<IUser>();       //still a query, objects have not been materialized


        var matches = User.GetMatchingUsers(users);
        var excludes = users.Except(matches);    // excludes should contain 6 users but here it contains 8 users

    }
}

When I call User.GetMatchingUsers(users) I get 2 matches as expected. The issue is that when I call users.Except(matches), the matching users are not excluded at all! I am expecting 6 users, but "excludes" contains all 8 users instead.

All I'm doing in GetMatchingUsers(IEnumerable<IUser> users) is taking the IEnumerable<IUser> and returning the IUsers whose IDs match (2 IUsers in this case). My understanding is that by default Except uses reference equality to compare the objects to be excluded, so the returned objects should be recognized. Is this not how Except behaves?

What is even more interesting is that if I materialize the objects using .ToList() and then get the matching users, and call Except, everything works as expected!


Like so:

IEnumerable<IUser> users = doc.Element("Users").Elements("User").Select
            (u => new User
                { ID = (int)u.Attribute("id"),
                  Name = (string)u.Attribute("name")
                }
            ).OfType<IUser>().ToList();   //explicitly materializing all objects by calling ToList()

var matches = User.GetMatchingUsers(users);
var excludes = users.Except(matches);   // excludes now contains 6 users as expected

I don't see why I should need to materialize the objects before calling Except, given that it's defined on IEnumerable<T>.

Any suggestions / insights would be much appreciated.

3 Solutions

#1


I think I know why this fails to work as expected. Because the initial user list is a deferred LINQ query rather than a materialized collection, it is re-evaluated each time it is iterated (once when used in GetMatchingUsers and again when doing the Except operation), and so new User objects are created on each pass. This leads to different references and therefore no matches. Using ToList fixes this because it runs the LINQ query only once, so the references stay the same.
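To make the re-evaluation concrete, here is a minimal sketch that does not use the XML file at all. The names Item and DeferredDemo are made up for illustration; the only assumption is that the class, like User in the question, does not override Equals or GetHashCode, so the default comparer falls back to reference equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; no Equals/GetHashCode override.
class Item
{
    public int ID { get; set; }
}

static class DeferredDemo
{
    // Deferred query: the Select lambda runs again on every enumeration.
    public static IEnumerable<Item> BuildQuery()
    {
        return new[] { 1, 2, 3 }.Select(i => new Item { ID = i });
    }

    // True only if two separate enumerations yield the very same instance.
    public static bool SameInstanceTwice(IEnumerable<Item> source)
    {
        return ReferenceEquals(source.First(), source.First());
    }

    static void Main()
    {
        IEnumerable<Item> lazy = BuildQuery();
        List<Item> materialized = lazy.ToList();

        Console.WriteLine(SameInstanceTwice(lazy));          // False
        Console.WriteLine(SameInstanceTwice(materialized));  // True

        // Same effect as in the question: nothing is removed from the lazy query.
        Console.WriteLine(lazy.Except(lazy).Count());                 // 3
        Console.WriteLine(materialized.Except(materialized).Count()); // 0
    }
}
```

Even excluding a query from itself removes nothing, because both sides of Except enumerate the query separately and each enumeration produces fresh objects.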

I've been able to reproduce the problem you have and having investigated the code, this seems like a very plausible explanation. I haven't proved it yet, though.


Update
I just ran the test again, this time outputting the users collection before the call to GetMatchingUsers, inside that call, and after it. On each pass the hash code of every object was printed, and the values do indeed differ every time, which indicates new objects, as I suspected.

Here is the output for each of the calls:


==> Start
ID=1, Name=Jeff, HashCode=39086322
ID=2, Name=Alastair, HashCode=36181605
ID=3, Name=Anthony, HashCode=28068188
ID=4, Name=James, HashCode=33163964
ID=5, Name=Tom, HashCode=14421545
ID=6, Name=David, HashCode=35567111
<== End
==> Start
ID=1, Name=Jeff, HashCode=65066874
ID=2, Name=Alastair, HashCode=34160229
ID=3, Name=Anthony, HashCode=63238509
ID=4, Name=James, HashCode=11679222
ID=5, Name=Tom, HashCode=35410979
ID=6, Name=David, HashCode=57416410
<== End
==> Start
ID=1, Name=Jeff, HashCode=61940669
ID=2, Name=Alastair, HashCode=15193904
ID=3, Name=Anthony, HashCode=6303833
ID=4, Name=James, HashCode=40452378
ID=5, Name=Tom, HashCode=36009496
ID=6, Name=David, HashCode=19634871
<== End

And, here is the modified code to show the problem:


using System.Xml.Linq;
using System.Collections.Generic;
using System.Linq;
using System;

interface IUser
{
    int ID
    {
        get;
        set;
    }
    string Name
    {
        get;
        set;
    }
}

class User : IUser
{

    #region IUser Members

    public int ID
    {
        get;
        set;
    }

    public string Name
    {
        get;
        set;
    }

    #endregion

    public override string ToString()
    {
        return ID + ":" + Name;
    }


    public static IEnumerable<IUser> GetMatchingUsers(IEnumerable<IUser> users)
    {
        IEnumerable<IUser> localList = new List<User>
         {
            new User{ ID=4, Name="James"},
            new User{ ID=5, Name="Tom"}

         }.OfType<IUser>();

        OutputUsers(users);
        var matches = from u in users
                      join lu in localList
                          on u.ID equals lu.ID
                      select u;
        return matches;
    }

    public static void OutputUsers(IEnumerable<IUser> users)
    {
        Console.WriteLine("==> Start");
        foreach (IUser user in users)
        {
            Console.WriteLine("ID=" + user.ID.ToString() + ", Name=" + user.Name + ", HashCode=" + user.GetHashCode().ToString());
        }
        Console.WriteLine("<== End");
    }
}

class Program
{
    static void Main(string[] args)
    {
        XDocument doc = new XDocument(
            new XElement(
                "Users",
                new XElement("User", new XAttribute("id", "1"), new XAttribute("name", "Jeff")),
                new XElement("User", new XAttribute("id", "2"), new XAttribute("name", "Alastair")),
                new XElement("User", new XAttribute("id", "3"), new XAttribute("name", "Anthony")),
                new XElement("User", new XAttribute("id", "4"), new XAttribute("name", "James")),
                new XElement("User", new XAttribute("id", "5"), new XAttribute("name", "Tom")),
                new XElement("User", new XAttribute("id", "6"), new XAttribute("name", "David"))));
        IEnumerable<IUser> users = doc.Element("Users").Elements("User").Select
            (u => new User
            {
                ID = (int)u.Attribute("id"),
                Name = (string)u.Attribute("name")
            }
            ).OfType<IUser>();       //still a query, objects have not been materialized


        User.OutputUsers(users);
        var matches = User.GetMatchingUsers(users);
        User.OutputUsers(users);
        var excludes = users.Except(matches);    // excludes should contain 4 users (6 minus the 2 matches) but here it contains all 6

    }
}

#2


a) You need to override the GetHashCode function. It MUST return equal values for equal IUser objects. For example:

public override int GetHashCode()
{
    // Null-safe: Name is null when the XML attribute is missing.
    return ID.GetHashCode() ^ (Name ?? string.Empty).GetHashCode();
}

b) You need to override the object.Equals(object obj) function in classes that implement IUser.

public override bool Equals(object obj)
{
    IUser other = obj as IUser;
    if (object.ReferenceEquals(other, null)) // return false if obj is null OR if obj doesn't implement IUser
        return false;
    return (this.ID == other.ID) && (this.Name == other.Name);
}

c) As an alternative to (b), IUser may inherit IEquatable<IUser>:

interface IUser : IEquatable<IUser>
...

The User class will need to provide a bool Equals(IUser other) method in that case. The GetHashCode override from (a) is still required, because the default comparer continues to call it.

That's all. Now it works without calling the .ToList() method.
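Putting (a) and (c) together, a self-contained sketch might look like the following. The LoadUsers method and the EquatableDemo class are hypothetical stand-ins for the XML-backed query in the question, and the six names are taken from the test output in answer #1. The query is deliberately left deferred to show that value equality makes the fresh objects compare equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

interface IUser : IEquatable<IUser>
{
    int ID { get; set; }
    string Name { get; set; }
}

class User : IUser
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Value equality on ID and Name; used by EqualityComparer<IUser>.Default.
    public bool Equals(IUser other)
    {
        return other != null && ID == other.ID && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as IUser);
    }

    // Must agree with Equals: equal users produce equal hash codes.
    public override int GetHashCode()
    {
        return ID.GetHashCode() ^ (Name ?? string.Empty).GetHashCode();
    }
}

static class EquatableDemo
{
    // Hypothetical stand-in for the XML-backed query; still deferred.
    public static IEnumerable<IUser> LoadUsers()
    {
        return new[] { "Jeff", "Alastair", "Anthony", "James", "Tom", "David" }
            .Select((name, index) => (IUser)new User { ID = index + 1, Name = name });
    }

    public static List<IUser> ExcludeMatches()
    {
        IEnumerable<IUser> users = LoadUsers();
        IEnumerable<IUser> matches = users.Where(u => u.ID == 4 || u.ID == 5);
        return users.Except(matches).ToList();
    }

    static void Main()
    {
        // Prints users 1, 2, 3 and 6; James and Tom are excluded.
        foreach (IUser user in ExcludeMatches())
            Console.WriteLine(user.ID + ":" + user.Name);
    }
}
```

One design caveat: ID and Name are mutable, so changing them while a User sits in a hash-based collection would change its hash code. For a short-lived Except call this does not matter, but it is worth keeping in mind.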

#3


I think you should implement IEquatable<T> to provide your own Equals and GetHashCode methods.


From MSDN (Enumerable.Except):


If you want to compare sequences of objects of some custom data type, you have to implement the IEqualityComparer<T> generic interface in your class. The following code example shows how to implement this interface in a custom data type and provide GetHashCode and Equals methods.
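The MSDN sample itself is not reproduced here, so the following is only a sketch of the same idea applied to this question. UserIdComparer and ComparerDemo are hypothetical names, and comparing by ID alone is an assumption chosen to mirror the join in GetMatchingUsers. The comparer is passed to the Except overload that accepts an IEqualityComparer<T>, which leaves the User class itself untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class User
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: treats two users as equal when their IDs match.
class UserIdComparer : IEqualityComparer<User>
{
    public bool Equals(User x, User y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(User user)
    {
        return user == null ? 0 : user.ID.GetHashCode();
    }
}

static class ComparerDemo
{
    public static int[] RemainingIds()
    {
        // Deferred query, re-evaluated on each enumeration.
        IEnumerable<User> users = Enumerable.Range(1, 6)
            .Select(i => new User { ID = i, Name = "User" + i });
        IEnumerable<User> matches = users.Where(u => u.ID == 4 || u.ID == 5);

        return users.Except(matches, new UserIdComparer())
                    .Select(u => u.ID)
                    .ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", RemainingIds())); // 1, 2, 3, 6
    }
}
```

A separate comparer is handy when the type cannot be modified or when different call sites need different notions of equality.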
