C#:为什么字典比列表快得多?

时间:2022-04-05 17:21:48

I am testing the speed of getting data from Dictionary VS list.
I've used this code to test :

我正在测试从Dictionary VS列表中获取数据的速度。我用这段代码来测试:

    internal class Program
{
    private static void Main(string[] args)
    {
        var stopwatch = new Stopwatch();
        List<Grade> grades = Grade.GetData().ToList();
        List<Student> students = Student.GetStudents().ToList();

        stopwatch.Start();
        foreach (Student student in students)
        {
            student.Grade = grades.Single(x => x.StudentId == student.Id).Value;
        }
        stopwatch.Stop();
        Console.WriteLine("Using list {0}", stopwatch.Elapsed);
        stopwatch.Reset();
        students = Student.GetStudents().ToList();
        stopwatch.Start();
        Dictionary<Guid, string> dic = Grade.GetData().ToDictionary(x => x.StudentId, x => x.Value);
        foreach (Student student in students)
        {
            student.Grade = dic[student.Id];
        }
        stopwatch.Stop();
        Console.WriteLine("Using dictionary {0}", stopwatch.Elapsed);
        Console.ReadKey();
    }
}

public class GuidHelper
{
    public static List<Guid> ListOfIds=new List<Guid>();

    static GuidHelper()
    {
        for (int i = 0; i < 10000; i++)
        {
            ListOfIds.Add(Guid.NewGuid());
        }
    }
}


public class Grade
{
    public Guid StudentId { get; set; }
    public string Value { get; set; }

    public static IEnumerable<Grade> GetData()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Grade
                             {
                                 StudentId = GuidHelper.ListOfIds[i], Value = "Value " + i
                             };
        }
    }
}

public class Student
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Grade { get; set; }

    public static IEnumerable<Student> GetStudents()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Student
                             {
                                 Id = GuidHelper.ListOfIds[i],
                                 Name = "Name " + i
                             };
        }
    }
}

There is list of students and grades in memory they have StudentId in common.
In first way I tried to find Grade of a student using LINQ on a list that takes near 7 seconds on my machine and in another way first I converted List into dictionary then finding grades of student from dictionary using key that takes less than a second . C#:为什么字典比列表快得多?

有记忆的学生和成绩列表,他们有共同的StudentId。在第一种方式中,我尝试使用LINQ找到一个学生的成绩,在我的机器上花了将近7秒的时间,另一方面我先将List转换为字典,然后使用不到一秒的密钥从字典中查找学生成绩。

7 个解决方案

#1


105  

When you do this:

当你这样做:

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

As written it has to enumerate the entire List until it finds the entry in the List that has the correct studentId (does entry 0 match the lambda? No... Does entry 1 match the lambda? No... etc etc). This is O(n). Since you do it once for every student, it is O(n^2).

如上所述,它必须枚举整个List,直到它在List中找到具有正确studentId的条目(条目0与lambda匹配?否......条目1是否匹配lambda?否......等等)。这是O(n)。既然你为每个学生做了一次,那就是O(n ^ 2)。

However when you do this:

但是当你这样做时:

student.Grade = dic[student.Id];

student.Grade = dic [student.Id];

If you want to find a certain element by key in a dictionary, it can instantly jump to where it is in the dictionary - this is O(1). O(n) for doing it for every student. (If you want to know how this is done - Dictionary runs a mathematical operation on the key, which turns it into a value that is a place inside the dictionary, which is the same place it put it when it was inserted)

如果你想在字典中按键找到某个元素,它可以立即跳转到字典中的位置 - 这是O(1)。 O(n)为每个学生做这件事。 (如果你想知道这是怎么做的 - Dictionary对键运行一个数学运算,它将它变成一个值,它是字典里面的一个位置,它与插入时的位置相同)

So, dictionary is faster because you used a better algorithm.

因此,字典更快,因为您使用了更好的算法。

#2


11  

When using Dictionary you are using a key to retrieve your information, which enables it to find it more efficiently, with List you are using Single Linq expression, which since it is a list, has no other option other than to look in entire list for wanted the item.

使用Dictionary时,你使用一个键来检索你的信息,这使得它能够更有效地找到它,使用List你正在使用Single Linq表达式,因为它是一个列表,除了查看整个列表之外没有其他选项想要这个项目。

#3


10  

The reason is because a dictionary is a lookup, while a list is an iteration.

原因是因为字典是查找,而列表是迭代。

Dictionary uses a hash lookup, while your list requires walking through the list until it finds the result from beginning to the result each time.

Dictionary使用哈希查找,而您的列表需要遍历列表,直到每次从结果开始到结果为止。

to put it another way. The list will be faster than the dictionary on the first item, because there's nothing to look up. it's the first item, boom.. it's done. but the second time the list has to look through the first item, then the second item. The third time through it has to look through the first item, then the second item, then the third item.. etc..

换一种方式。该列表将比第一个项目上的字典更快,因为没有任何东西可以查找。这是第一项,热潮......它已经完成了。但第二次列表必须查看第一个项目,然后是第二个项目。第三次通过它必须查看第一项,然后第二项,然后第三项......等。

So each iteration the lookup takes more and more time. The larger the list, the longer it takes. While the dictionary is always a more or less fixed lookup time (it also increases as the dictionary gets larger, but at a much slower pace, so by comparison it's almost fixed).

因此,每次迭代查找都会花费越来越多的时间。列表越大,所需的时间越长。虽然字典总是或多或少固定的查找时间(它也随着字典变大而增加,但速度要慢得多,所以通过比较它几乎是固定的)。

#4


8  

Dictionary uses hashing to search for the data. Each item in the dictionary is stored in buckets of items that contain the same hash. It's a lot quicker.

Dictionary使用散列来搜索数据。字典中的每个项目都存储在包含相同哈希的项目桶中。它快得多。

Try sorting your list, it will be a a bit quicker then.

尝试对列表进行排序,然后会更快一些。

#5


5  

A dictionary uses a hash table, it is a great data structure as it maps an input to a corresponding output almost instantaneously, it has a complexity of O(1) as already pointed out which means more or less immediate retrieval.

字典使用哈希表,它是一个很好的数据结构,因为它几乎瞬间将输入映射到相应的输出,它具有已经指出的O(1)的复杂性,这意味着或多或少的立即检索。

The cons of it is that for the sake of performance you need lots of space in advance (depending on the implementation be it separate chaining or linear/quadratic probing you may need at least as much as you're planning to store, probably double in the latter case) and you need a good hashing algorithm that maps uniquely your input ("John Smith") to a corresponding output such as a position in an array (hash_array[34521]).

它的缺点是,为了性能,你需要提前大量的空间(取决于实现,它是单独的链接或线性/二次探测,你可能需要至少与你计划存储的一样多,可能是两倍后一种情况)你需要一个良好的散列算法,将输入(“John Smith”)唯一映射到相应的输出,例如数组中的位置(hash_array [34521])。

Also listing the entries in a sorted order is a problem. If I may quote Wikipedia:

同样按排序顺序列出条目是个问题。如果我引用*:

Listing all n entries in some specific order generally requires a separate sorting step, whose cost is proportional to log(n) per entry.

按特定顺序列出所有n个条目通常需要单独的排序步骤,其成本与每个条目的log(n)成比例。

Have a read on linear probing and separate chaining for some gorier details :)

阅读有关线性探测和单独链接的一些gorier细节:)

#6


3  

Dictionary is based on a hash table which is a rather efficient algorithm to look up things. In a list you have to go element by element in order to find something.

字典基于哈希表,这是一种查找事物的相当有效的算法。在列表中,您必须逐个元素才能找到某些内容。

It's all a matter of data organization...

这都是数据组织的问题......

#7


2  

When it comes to lookup of data, a keyed collection is always faster than a non-keyed collection. This is because a non-keyed collection will have to enumerate its elements to find what you are looking for. While in a keyed collection you can just access the element directly via the key.

在查找数据时,键控集合总是比非键控集合更快。这是因为非键控集合必须枚举其元素以找到您要查找的内容。在键控集合中,您可以直接通过键访问元素。

These are some nice articles for comparing list to dictionary.

这些是用于将列表与字典进行比较的一些好文章。

Here. And this one.

这里。和这个。

#1


105  

When you do this:

当你这样做:

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

As written it has to enumerate the entire List until it finds the entry in the List that has the correct studentId (does entry 0 match the lambda? No... Does entry 1 match the lambda? No... etc etc). This is O(n). Since you do it once for every student, it is O(n^2).

如上所述,它必须枚举整个List,直到它在List中找到具有正确studentId的条目(条目0与lambda匹配?否......条目1是否匹配lambda?否......等等)。这是O(n)。既然你为每个学生做了一次,那就是O(n ^ 2)。

However when you do this:

但是当你这样做时:

student.Grade = dic[student.Id];

student.Grade = dic [student.Id];

If you want to find a certain element by key in a dictionary, it can instantly jump to where it is in the dictionary - this is O(1). O(n) for doing it for every student. (If you want to know how this is done - Dictionary runs a mathematical operation on the key, which turns it into a value that is a place inside the dictionary, which is the same place it put it when it was inserted)

如果你想在字典中按键找到某个元素,它可以立即跳转到字典中的位置 - 这是O(1)。 O(n)为每个学生做这件事。 (如果你想知道这是怎么做的 - Dictionary对键运行一个数学运算,它将它变成一个值,它是字典里面的一个位置,它与插入时的位置相同)

So, dictionary is faster because you used a better algorithm.

因此,字典更快,因为您使用了更好的算法。

#2


11  

When using Dictionary you are using a key to retrieve your information, which enables it to find it more efficiently, with List you are using Single Linq expression, which since it is a list, has no other option other than to look in entire list for wanted the item.

使用Dictionary时,你使用一个键来检索你的信息,这使得它能够更有效地找到它,使用List你正在使用Single Linq表达式,因为它是一个列表,除了查看整个列表之外没有其他选项想要这个项目。

#3


10  

The reason is because a dictionary is a lookup, while a list is an iteration.

原因是因为字典是查找,而列表是迭代。

Dictionary uses a hash lookup, while your list requires walking through the list until it finds the result from beginning to the result each time.

Dictionary使用哈希查找,而您的列表需要遍历列表,直到每次从结果开始到结果为止。

to put it another way. The list will be faster than the dictionary on the first item, because there's nothing to look up. it's the first item, boom.. it's done. but the second time the list has to look through the first item, then the second item. The third time through it has to look through the first item, then the second item, then the third item.. etc..

换一种方式。该列表将比第一个项目上的字典更快,因为没有任何东西可以查找。这是第一项,热潮......它已经完成了。但第二次列表必须查看第一个项目,然后是第二个项目。第三次通过它必须查看第一项,然后第二项,然后第三项......等。

So each iteration the lookup takes more and more time. The larger the list, the longer it takes. While the dictionary is always a more or less fixed lookup time (it also increases as the dictionary gets larger, but at a much slower pace, so by comparison it's almost fixed).

因此,每次迭代查找都会花费越来越多的时间。列表越大,所需的时间越长。虽然字典总是或多或少固定的查找时间(它也随着字典变大而增加,但速度要慢得多,所以通过比较它几乎是固定的)。

#4


8  

Dictionary uses hashing to search for the data. Each item in the dictionary is stored in buckets of items that contain the same hash. It's a lot quicker.

Dictionary使用散列来搜索数据。字典中的每个项目都存储在包含相同哈希的项目桶中。它快得多。

Try sorting your list, it will be a a bit quicker then.

尝试对列表进行排序,然后会更快一些。

#5


5  

A dictionary uses a hash table, it is a great data structure as it maps an input to a corresponding output almost instantaneously, it has a complexity of O(1) as already pointed out which means more or less immediate retrieval.

字典使用哈希表,它是一个很好的数据结构,因为它几乎瞬间将输入映射到相应的输出,它具有已经指出的O(1)的复杂性,这意味着或多或少的立即检索。

The cons of it is that for the sake of performance you need lots of space in advance (depending on the implementation be it separate chaining or linear/quadratic probing you may need at least as much as you're planning to store, probably double in the latter case) and you need a good hashing algorithm that maps uniquely your input ("John Smith") to a corresponding output such as a position in an array (hash_array[34521]).

它的缺点是,为了性能,你需要提前大量的空间(取决于实现,它是单独的链接或线性/二次探测,你可能需要至少与你计划存储的一样多,可能是两倍后一种情况)你需要一个良好的散列算法,将输入(“John Smith”)唯一映射到相应的输出,例如数组中的位置(hash_array [34521])。

Also listing the entries in a sorted order is a problem. If I may quote Wikipedia:

同样按排序顺序列出条目是个问题。如果我引用*:

Listing all n entries in some specific order generally requires a separate sorting step, whose cost is proportional to log(n) per entry.

按特定顺序列出所有n个条目通常需要单独的排序步骤,其成本与每个条目的log(n)成比例。

Have a read on linear probing and separate chaining for some gorier details :)

阅读有关线性探测和单独链接的一些gorier细节:)

#6


3  

Dictionary is based on a hash table which is a rather efficient algorithm to look up things. In a list you have to go element by element in order to find something.

字典基于哈希表,这是一种查找事物的相当有效的算法。在列表中,您必须逐个元素才能找到某些内容。

It's all a matter of data organization...

这都是数据组织的问题......

#7


2  

When it comes to lookup of data, a keyed collection is always faster than a non-keyed collection. This is because a non-keyed collection will have to enumerate its elements to find what you are looking for. While in a keyed collection you can just access the element directly via the key.

在查找数据时,键控集合总是比非键控集合更快。这是因为非键控集合必须枚举其元素以找到您要查找的内容。在键控集合中,您可以直接通过键访问元素。

These are some nice articles for comparing list to dictionary.

这些是用于将列表与字典进行比较的一些好文章。

Here. And this one.

这里。和这个。