我什么时候想要使用堆?

时间:2021-11-02 20:56:12

Besides the obvious answer of a Priority Queue, when would a heap be useful in my programming adventures?

除了优先级队列的明显答案之外,什么时候堆在我的编程冒险中会有用?

6 个解决方案

#1


Use it whenever you need quick access to the largest (or smallest) item, because that item will always be the first element in the array or at the root of the tree.

只要您需要快速访问最大(或最小)项,就可以使用它,因为该项始终是数组中的第一个元素或树的根。

However, the remainder of the array is kept partially unsorted. Thus, instant access is only possible to the largest (smallest) item. Insertions are fast, so it's a good way to deal with incoming events or data and always have access to the earliest/biggest.

但是,数组的其余部分保持部分未排序。因此,只能对最大(最小)的项目进行即时访问。插入速度很快,因此它是处理传入事件或数据的好方法,并且始终可以访问最早/最大的事件。

Useful for priority queues, schedulers (where the earliest item is desired), etc...

对优先级队列,调度程序(需要最早的项目)等有用...

A heap is a tree where a parent node's value is larger than that of any of its descendant nodes.

堆是一个树,其父节点的值大于其任何后代节点的值。

If you think of a heap as a binary tree stored in linear order by depth, with the root node first (then the children of that node next, then the children of those nodes next); then the children of a node at index N are at 2N+1 and 2N+2. This property allows quick access-by-index. And since heaps are manipulated by swapping nodes, this allows for in-place sorting.

如果你认为堆是按深度线性顺序存储的二叉树,首先是根节点(然后是该节点的子节点,然后是那些节点的子节点);然后索引N处的节点的子节点为2N + 1和2N + 2。此属性允许快速访问索引。并且由于堆叠是通过交换节点来操纵的,因此可以进行就地排序。

#2


Heaps are structures meant to allow quick access to the min or the max.

堆是允许快速访问最小或最大的结构。

But why would you want that? You could just check every entry on add to see if it's the smallest or the biggest. This way you always have the smallest or the biggest in constant time O(1).

但你为什么要这样呢?您可以检查添加的每个条目,看它是最小的还是最大的。这样,您在恒定时间O(1)中始终具有最小值或最大值。

The answer is because heaps allow you to pull the smallest or the biggest and quickly know the NEXT smallest or biggest. That's why it's called a Priority Queue.

答案是因为堆可以让你拉最小或最大,并迅速知道NEXT最小或最大。这就是为什么它被称为优先级队列。

Real world example (not very fair world, though):

Suppose you have a hospital in which patients are attended based on their ages. The oldest are always attended first, no matter when he/she got in the queue.

假设您有一家医院,患者根据其年龄参加。无论他/她何时排队,最老的人总是先出席。

You can't just keep track of the oldest one because if you pull he/she out, you don't know the next oldest one. In order to solve this hospital problem, you implement a max heap. This heap is, by definition, partially ordered. This means you cannot sort the patients by their age, but you know that the oldest ones are always in the top, so you can pull a patient out in constant time O(1) and re-balance the heap in log time O(log N).

你不能只跟踪最老的那个,因为如果你拉出他/她,你就不知道下一个最老的那个。为了解决这个医院问题,您实现了最大堆。根据定义,此堆是部分排序的。这意味着您无法按年龄对患者进行排序,但您知道最老的患者总是在顶部,因此您可以在恒定时间O(1)内将患者拉出来并在日志时间O(log)中重新平衡堆N)。

More sophisticated example:

Suppose you have a sequence of integers and you want to keep track of the median. The median is the number that is in the middle of an ordered array.

假设你有一个整数序列,你想跟踪中位数。中位数是有序数组中间的数字。

Example:

[1, 2, 5, 7, 23, 27, 31]

In the above case, 7 is the median because the array containing the smaller numbers [1, 2, 5] is of the same size of the one containing the bigger numbers [23, 27, 31]. Normally, if the array has an even number of elements, the median is the arithmetic average of the 2 elements in the middle, e.g (5 + 7)/2.

在上面的情况中,7是中位数,因为包含较小数字[1,2,5]的数组与包含较大数字[23,27,31]的数组相同。通常,如果数组具有偶数个元素,则中值是中间2个元素的算术平均值,例如(5 + 7)/ 2。

Now, how do you keep track of the median? By having 2 heaps, one min heap containing the numbers smaller than the current median and a max heap containing the numbers bigger than the current median. Now, if these heaps are always balanced, the 2 heaps will contain the same number of elements or one will have 1 element more than the other, the most.

现在,你如何跟踪中位数?通过有2个堆,一个包含小于当前中位数的数字的最小堆和包含大于当前中位数的数字的最大堆。现在,如果这些堆总是平衡的,那么2个堆将包含相同数量的元素,或者一个元素将比另一个元素更多,最多。

When you add a new element to the sequence, if the number is smaller than the current median, you add it to the min heap, otherwise, you add it to the max heap. Now, if the heaps are unbalanced (one heap has more than 1 element more than the other), you pull an element from the biggest heap and add to the smallest. Now they're balanced.

向序列添加新元素时,如果该数字小于当前中位数,则将其添加到最小堆中,否则,将其添加到最大堆中。现在,如果堆是不平衡的(一个堆比另一个堆多1个元素),则从最大的堆中拉出一个元素并添加到最小的堆中。现在他们是平衡的。

#3


The characteristic of a heap is that it is a structure that maintains data semiordered; thus, it is a good tradeoff between the cost of maintaining a complete order ant the cost of seaching through random chaos. That characteristic is used on many algorithms, such as selection, ordering, or classification.

堆的特征是它是一个维护数据半结构的结构;因此,在维持完整订单的成本与通过随机混乱进行搜索的成本之间进行良好的权衡。该特征用于许多算法,例如选择,排序或分类。

Another useful characteristic of a heap is that it can be created in-place from an array!

堆的另一个有用特性是它可以从数组就地创建!

#4


Also good for selection algorithms (finding the min or max)

也适用于选择算法(找到最小值或最大值)

#5


anytime when you sort a temporary list, you should consider heaps.

在你对临时列表进行排序时,你应该考虑堆。

#6


You can use a minHeap or maxHeap when you want to access the smallest and largest elements respectively.

如果要分别访问最小和最大元素,可以使用minHeap或maxHeap。

#1


Use it whenever you need quick access to the largest (or smallest) item, because that item will always be the first element in the array or at the root of the tree.

只要您需要快速访问最大(或最小)项,就可以使用它,因为该项始终是数组中的第一个元素或树的根。

However, the remainder of the array is kept partially unsorted. Thus, instant access is only possible to the largest (smallest) item. Insertions are fast, so it's a good way to deal with incoming events or data and always have access to the earliest/biggest.

但是,数组的其余部分保持部分未排序。因此,只能对最大(最小)的项目进行即时访问。插入速度很快,因此它是处理传入事件或数据的好方法,并且始终可以访问最早/最大的事件。

Useful for priority queues, schedulers (where the earliest item is desired), etc...

对优先级队列,调度程序(需要最早的项目)等有用...

A heap is a tree where a parent node's value is larger than that of any of its descendant nodes.

堆是一个树,其父节点的值大于其任何后代节点的值。

If you think of a heap as a binary tree stored in linear order by depth, with the root node first (then the children of that node next, then the children of those nodes next); then the children of a node at index N are at 2N+1 and 2N+2. This property allows quick access-by-index. And since heaps are manipulated by swapping nodes, this allows for in-place sorting.

如果你认为堆是按深度线性顺序存储的二叉树,首先是根节点(然后是该节点的子节点,然后是那些节点的子节点);然后索引N处的节点的子节点为2N + 1和2N + 2。此属性允许快速访问索引。并且由于堆叠是通过交换节点来操纵的,因此可以进行就地排序。

#2


Heaps are structures meant to allow quick access to the min or the max.

堆是允许快速访问最小或最大的结构。

But why would you want that? You could just check every entry on add to see if it's the smallest or the biggest. This way you always have the smallest or the biggest in constant time O(1).

但你为什么要这样呢?您可以检查添加的每个条目,看它是最小的还是最大的。这样,您在恒定时间O(1)中始终具有最小值或最大值。

The answer is because heaps allow you to pull the smallest or the biggest and quickly know the NEXT smallest or biggest. That's why it's called a Priority Queue.

答案是因为堆可以让你拉最小或最大,并迅速知道NEXT最小或最大。这就是为什么它被称为优先级队列。

Real world example (not very fair world, though):

Suppose you have a hospital in which patients are attended based on their ages. The oldest are always attended first, no matter when he/she got in the queue.

假设您有一家医院,患者根据其年龄参加。无论他/她何时排队,最老的人总是先出席。

You can't just keep track of the oldest one because if you pull he/she out, you don't know the next oldest one. In order to solve this hospital problem, you implement a max heap. This heap is, by definition, partially ordered. This means you cannot sort the patients by their age, but you know that the oldest ones are always in the top, so you can pull a patient out in constant time O(1) and re-balance the heap in log time O(log N).

你不能只跟踪最老的那个,因为如果你拉出他/她,你就不知道下一个最老的那个。为了解决这个医院问题,您实现了最大堆。根据定义,此堆是部分排序的。这意味着您无法按年龄对患者进行排序,但您知道最老的患者总是在顶部,因此您可以在恒定时间O(1)内将患者拉出来并在日志时间O(log)中重新平衡堆N)。

More sophisticated example:

Suppose you have a sequence of integers and you want to keep track of the median. The median is the number that is in the middle of an ordered array.

假设你有一个整数序列,你想跟踪中位数。中位数是有序数组中间的数字。

Example:

[1, 2, 5, 7, 23, 27, 31]

In the above case, 7 is the median because the array containing the smaller numbers [1, 2, 5] is of the same size of the one containing the bigger numbers [23, 27, 31]. Normally, if the array has an even number of elements, the median is the arithmetic average of the 2 elements in the middle, e.g (5 + 7)/2.

在上面的情况中,7是中位数,因为包含较小数字[1,2,5]的数组与包含较大数字[23,27,31]的数组相同。通常,如果数组具有偶数个元素,则中值是中间2个元素的算术平均值,例如(5 + 7)/ 2。

Now, how do you keep track of the median? By having 2 heaps, one min heap containing the numbers smaller than the current median and a max heap containing the numbers bigger than the current median. Now, if these heaps are always balanced, the 2 heaps will contain the same number of elements or one will have 1 element more than the other, the most.

现在,你如何跟踪中位数?通过有2个堆,一个包含小于当前中位数的数字的最小堆和包含大于当前中位数的数字的最大堆。现在,如果这些堆总是平衡的,那么2个堆将包含相同数量的元素,或者一个元素将比另一个元素更多,最多。

When you add a new element to the sequence, if the number is smaller than the current median, you add it to the min heap, otherwise, you add it to the max heap. Now, if the heaps are unbalanced (one heap has more than 1 element more than the other), you pull an element from the biggest heap and add to the smallest. Now they're balanced.

向序列添加新元素时,如果该数字小于当前中位数,则将其添加到最小堆中,否则,将其添加到最大堆中。现在,如果堆是不平衡的(一个堆比另一个堆多1个元素),则从最大的堆中拉出一个元素并添加到最小的堆中。现在他们是平衡的。

#3


The characteristic of a heap is that it is a structure that maintains data semiordered; thus, it is a good tradeoff between the cost of maintaining a complete order ant the cost of seaching through random chaos. That characteristic is used on many algorithms, such as selection, ordering, or classification.

堆的特征是它是一个维护数据半结构的结构;因此,在维持完整订单的成本与通过随机混乱进行搜索的成本之间进行良好的权衡。该特征用于许多算法,例如选择,排序或分类。

Another useful characteristic of a heap is that it can be created in-place from an array!

堆的另一个有用特性是它可以从数组就地创建!

#4


Also good for selection algorithms (finding the min or max)

也适用于选择算法(找到最小值或最大值)

#5


anytime when you sort a temporary list, you should consider heaps.

在你对临时列表进行排序时,你应该考虑堆。

#6


You can use a minHeap or maxHeap when you want to access the smallest and largest elements respectively.

如果要分别访问最小和最大元素,可以使用minHeap或maxHeap。