树:链接列表与数组(效率)

时间:2022-10-09 07:21:00

This is an assignment question that I am having trouble wording an answer to.

这是一个任务问题,我在回答问题时遇到了麻烦。

"Suppose a tree may have up to k children per node. Let v be the average number of children per node. For what value(s) of v is it more efficient (in terms of space used) to store the child nodes in a linked list versus storage in an array? Why?"

“假设一个树每个节点可能有多达k个子节点。令v为每个节点的平均子节点数。对于v的值,将子节点存储在一个节点中的效率更高(就使用的空间而言)链表与数组中的存储?为什么?“

I believe I can answer the "why?" more or less in plain English -- it will be more efficient to use the linked list because rather than having a bunch of empty nodes (ie empty indexes in the array if your average is lower than the max) taking up memory you only alloc space for a node in a linked list when you're actually filling in a value.

我相信我能回答“为什么?”或多或少用简单的英语 - 使用链表更有效率,因为而不是拥有一堆空节点(如果你的平均值低于最大值,数组中的空索引)占用内存你只需要分配空间当您实际填写值时,链接列表中的节点。

So if you've got an average of 6 children when your maximum is 200, the array will be creating space for all 200 children of each node when the tree is created, but the linked list will only alloc space for the nodes as needed. So, with the linked list, space used will be approximately(?) the average; with the array, spaced used will be the max.

因此,如果在最大值为200时平均有6个子节点,则在创建树时,阵列将为每个节点的所有200个子节点创建空间,但链接列表将仅根据需要为节点分配空间。因此,使用链表,使用的空间大约是(?)平均值;使用数组,间隔使用将是最大值。

...I don't see when it would ever be more efficient to use the array. Is this a trick question? Do I have to take into account the fact that the array needs to have a limit on total number of nodes when it's created?

...我不知道何时使用该阵列会更有效率。这是一个棘手的问题吗?我是否必须考虑到数组在创建时需要对节点总数进行限制的事实?

5 个解决方案

#1


5  

For many commonly used languages, the array will require allocating storage k memory addresses (of the data). A singly-linked list will require 2 addresses per node (data & next). A doubly-linked list would require 3 addresses per node.

对于许多常用语言,该阵列将需要分配存储k个存储器地址(数据)。单链表将需要每个节点2个地址(数据和下一个)。双链表将需要每个节点3个地址。

Let n be the actual number of children of a particular node A:

设n是特定节点A的实际子节点数:

  • The array uses k memory addresses
  • 该阵列使用k个内存地址

  • The singly-linked list uses 2n addresses
  • 单链表使用2n个地址

  • The doubly-linked list uses 3n addresses
  • 双向链表使用3n个地址

The value k allows you to determine if 2n or 3n addresses will average to a gain or loss compared to simply storing the addresses in an array.

值k允许您确定2n或3n地址是否平均为增益或损失,而不是简单地将地址存储在数组中。

#2


5  

...I don't see when it would ever be more efficient to use the array. Is this a trick question?

...我不知道何时使用该阵列会更有效率。这是一个棘手的问题吗?

It’s not a trick question. Think of the memory overhead that a linked list has. How is a linked list implemented (vs. an array)?

这不是一个棘手的问题。想一想链表有的内存开销。如何实现链表(与数组相比)?

Also (though this is beyond the scope of the question!), space consumption isn’t the only deciding factor in practice. Caching plays an important role in modern CPUs and storing the individual child nodes in an array instead of a linked list can improve the cache locality (and consequently the tree’s performance) drastically.

另外(虽然这超出了问题的范围!),空间消耗并不是实践中唯一的决定因素。缓存在现代CPU中起着重要作用,将各个子节点存储在数组中而不是链表中可以大大改善缓存局部性(从而提高树的性能)。

#3


3  

Arrays must pre-allocate space but we can use them to access very fast any entry.
Lists allocate memory whenever they create a new node, and that isn't ideal because
memory allocation costs on cpu time.

数组必须预先分配空间,但我们可以使用它们来快速访问任何条目。列表在创建新节点时分配内存,这并不理想,因为内存分配会花费cpu时间。

Tip: you can allocate the whole array at once, if you want to, but usually we allocate,
lets say, 4 entries and we resize it by doubling the size when we need more space.

提示:如果需要,您可以一次分配整个数组,但通常我们分配,比方说,4个条目,我们通过在需要更多空间时将大小加倍来调整大小。

#4


1  

I can imagine it could be a very good idea and very efficient in many scenarios to use a LinkedList if the data item one uses has intrinsic logic for a previous and next item anyway, for example a TimeTableItem or anything that is somehow time-related. These should implement an interface, so the LinkedList implementation can leverage that and doesn't have to wrap the items into its own node objects. Inserting and removing would be much more efficient here than using a List implementation which internally juggles arrays around.

我可以想象,如果一个使用的数据项无论如何都具有前一个和下一个项的内在逻辑,例如TimeTableItem或某种与时间相关的任何东西,在许多场景中使用LinkedList可能是一个非常好的想法并且非常有效。这些应该实现一个接口,因此LinkedList实现可以利用它,而不必将项目包装到自己的节点对象中。插入和删除比使用内部处理数组的List实现更有效。

#5


1  

You're assuming the array can't be dynamically re-allocated. If it can, the array wins, because it need be no bigger than k items (plus a constant overhead), and because it doesn't need per-item storage for pointers.

您假设无法动态重新分配阵列。如果可以,则数组获胜,因为它不需要大于k项(加上常量开销),并且因为它不需要指针的每项存储。

#1


5  

For many commonly used languages, the array will require allocating storage k memory addresses (of the data). A singly-linked list will require 2 addresses per node (data & next). A doubly-linked list would require 3 addresses per node.

对于许多常用语言,该阵列将需要分配存储k个存储器地址(数据)。单链表将需要每个节点2个地址(数据和下一个)。双链表将需要每个节点3个地址。

Let n be the actual number of children of a particular node A:

设n是特定节点A的实际子节点数:

  • The array uses k memory addresses
  • 该阵列使用k个内存地址

  • The singly-linked list uses 2n addresses
  • 单链表使用2n个地址

  • The doubly-linked list uses 3n addresses
  • 双向链表使用3n个地址

The value k allows you to determine if 2n or 3n addresses will average to a gain or loss compared to simply storing the addresses in an array.

值k允许您确定2n或3n地址是否平均为增益或损失,而不是简单地将地址存储在数组中。

#2


5  

...I don't see when it would ever be more efficient to use the array. Is this a trick question?

...我不知道何时使用该阵列会更有效率。这是一个棘手的问题吗?

It’s not a trick question. Think of the memory overhead that a linked list has. How is a linked list implemented (vs. an array)?

这不是一个棘手的问题。想一想链表有的内存开销。如何实现链表(与数组相比)?

Also (though this is beyond the scope of the question!), space consumption isn’t the only deciding factor in practice. Caching plays an important role in modern CPUs and storing the individual child nodes in an array instead of a linked list can improve the cache locality (and consequently the tree’s performance) drastically.

另外(虽然这超出了问题的范围!),空间消耗并不是实践中唯一的决定因素。缓存在现代CPU中起着重要作用,将各个子节点存储在数组中而不是链表中可以大大改善缓存局部性(从而提高树的性能)。

#3


3  

Arrays must pre-allocate space but we can use them to access very fast any entry.
Lists allocate memory whenever they create a new node, and that isn't ideal because
memory allocation costs on cpu time.

数组必须预先分配空间,但我们可以使用它们来快速访问任何条目。列表在创建新节点时分配内存,这并不理想,因为内存分配会花费cpu时间。

Tip: you can allocate the whole array at once, if you want to, but usually we allocate,
lets say, 4 entries and we resize it by doubling the size when we need more space.

提示:如果需要,您可以一次分配整个数组,但通常我们分配,比方说,4个条目,我们通过在需要更多空间时将大小加倍来调整大小。

#4


1  

I can imagine it could be a very good idea and very efficient in many scenarios to use a LinkedList if the data item one uses has intrinsic logic for a previous and next item anyway, for example a TimeTableItem or anything that is somehow time-related. These should implement an interface, so the LinkedList implementation can leverage that and doesn't have to wrap the items into its own node objects. Inserting and removing would be much more efficient here than using a List implementation which internally juggles arrays around.

我可以想象,如果一个使用的数据项无论如何都具有前一个和下一个项的内在逻辑,例如TimeTableItem或某种与时间相关的任何东西,在许多场景中使用LinkedList可能是一个非常好的想法并且非常有效。这些应该实现一个接口,因此LinkedList实现可以利用它,而不必将项目包装到自己的节点对象中。插入和删除比使用内部处理数组的List实现更有效。

#5


1  

You're assuming the array can't be dynamically re-allocated. If it can, the array wins, because it need be no bigger than k items (plus a constant overhead), and because it doesn't need per-item storage for pointers.

您假设无法动态重新分配阵列。如果可以,则数组获胜,因为它不需要大于k项(加上常量开销),并且因为它不需要指针的每项存储。