CPS222 Lecture: Heaps; Priority Queues Last revised 1/25/2015
Objectives
1. To show how a complete binary tree can be mapped straightforwardly to an array.
2. To define a heap, and show how a heap can be maintained.
3. To show how a heap can be used to implement a priority queue.
I. Heaps
- -----
A. In today's lecture, we're going to cover the same ground as the
assigned section in the book, but in a different order.
1. We will begin by talking about a special kind of binary tree
known as a heap. We will then show a special use of heaps - to
implement a data structure known as a priority queue.
2. Your text starts by discussing priority queues, and then introduces
heaps as a way of implementing them.
B. Recall that in talking about binary trees we defined the notion of a
complete binary tree.
1. A complete binary tree (called "almost-complete" by some writers) is
a binary tree having the following properties:
a. If the height of the tree is h, then all leaves lie at level h or
at level h - 1.
b. If any node has a descendant at level h in its right subtree, then
all of the leaves in its left subtree are at level h.
Ex: A A
/ \ / \
B B C
/ \ /
D E F
Recall: a perfect binary tree can be converted to a complete, but not perfect,
binary tree of the same height by removing nodes on the lowest level, starting
from the right and working toward the left. If all the nodes on the lowest level
are removed this way, one ends up with another perfect tree of height one less.
2. We also showed that, in a complete binary tree of height h, there are
at least 2^(h-1) nodes and at most 2^h - 1 nodes.
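The bounds above can be spot-checked numerically. The following Python sketch is not part of the lecture; the function name height_of_complete_tree is made up for illustration, and it assumes the convention used above, in which the root is at level 1 and the height is the number of levels.

```python
def height_of_complete_tree(n):
    """Number of levels in a complete binary tree with n >= 1 nodes.

    Assumes the root is at level 1, so a one-node tree has height 1.
    """
    if n < 1:
        raise ValueError("tree must have at least one node")
    # For n >= 1, int.bit_length() returns floor(log2(n)) + 1,
    # which is the number of levels needed to hold n nodes.
    return n.bit_length()


# Check 2^(h-1) <= n <= 2^h - 1 for a range of sizes.
for n in range(1, 1000):
    h = height_of_complete_tree(n)
    assert 2 ** (h - 1) <= n <= 2 ** h - 1
```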
C. There is a correspondence between an array and a COMPLETE binary tree.
1. Consider what happens if we number the nodes in a complete binary
tree, using level order - e.g.:
            1
          /   \
         2     3
        / \   / \
       4   5 6   7
      / \
     8   9
2. Observe that the following relationship holds between the number
of a node and the number of its children:
if m is the number of a node, then 2m is the number of its left
child (unless 2m exceeds the number of nodes in the tree, in which
case it has no left child.) Likewise, 2m+1 is the number of its
right child, unless 2m+1 exceeds the number of nodes.
3. Likewise, if m is the number of a node, then m / 2 (using integer
division, i.e. discarding any remainder) is the number of its parent -
unless m / 2 = 0 (m = 1) - in which case the node is the root of the
tree and has no parent.
4. A complete binary tree, then, can be represented by an array without
using any pointers. Furthermore, in such a representation it is easy
to go from a node to its children and also from a child back
to its parent. (When implementing such an array in C/C++/Java, it is
convenient not to use slot 0 in the array, storing the nodes in slots
1 .. size of tree, which means the total space allocated for the tree
is one more slot than actually used. There are ways to use slot 0 as a
header slot for certain operations, or it can be used to store
information about the total number of nodes in the tree.)
5. Example: the tree
          APPLE
         /     \
    BANANA     CHERRY
    /
DOGWOOD
can be represented by the array:
[1]     [2]      [3]      [4]
APPLE   BANANA   CHERRY   DOGWOOD
and the array
[1] [2] [3] [4] [5] [6] [7] [8]
 A   C   F   G   I   M   Q   Z
represents the tree
            A
          /   \
         C     F
        / \   / \
       G   I M   Q
      /
     Z
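The index arithmetic described in C.2 through C.4 can be sketched in a few lines of Python, using the second example array. This is only an illustration: the helper names left, right, parent and exists are assumptions made here, not names from the lecture or the book.

```python
# Slot 0 is left unused so that the node numbered m lives in slot m.
tree = [None, "A", "C", "F", "G", "I", "M", "Q", "Z"]
size = len(tree) - 1  # number of nodes actually in the tree


def left(m):
    """Number of the left child of node m (may not exist)."""
    return 2 * m


def right(m):
    """Number of the right child of node m (may not exist)."""
    return 2 * m + 1


def parent(m):
    """Number of the parent of node m; 0 means m is the root."""
    return m // 2  # integer division


def exists(m):
    """True if m is the number of a node that is really in the tree."""
    return 1 <= m <= size
```

For instance, node 2 (C) has children tree[left(2)] and tree[right(2)], which are G and I. Node 4 (G) has a left child (Z, in slot 8) but exists(right(4)) is false, since 9 exceeds the number of nodes.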
D. One special kind of complete binary tree is known as a
HEAP. A heap is a binary tree with the following properties:
1. The STRUCTURE PROPERTY: it is complete
2. The HEAP ORDER PROPERTY: The key at each node is <= the key at either
of its children (if it has any.)
3. Examples
a. Both of the trees in the example above (C.5) are heaps
b. Example: the following is not a heap
                 CAT
               /     \
            EEL       AARDVARK
           /   \       /    \
       ZEBRA RACCOON FOX   SNAKE
Why?
ASK
The heap order property is violated by AARDVARK, because it is
not true that CAT <= AARDVARK
c. Example: the following is not a heap
                 CAT
               /     \
            EEL       FOX
           /   \         \
       ZEBRA RACCOON    SNAKE
Why?
ASK
The heap structure property is violated by SNAKE: FOX has a right
child but no left child, leaving a gap in the bottom level.
4. Note: nothing is said about the relative order of the keys of the
children - only the relationship between the parent and the
child. Thus, both of the following are heaps:
    CAT                 CAT
   /   \      and      /   \
DOG     FOX         FOX     DOG
5. Note that this definition defines what is sometimes called a
"minheap" because the key at the root is the minimum of all the
keys in the tree. It is also possible to define a "maxheap" by
changing the <= requirement in the heap order property to >=.
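As a rough illustration of the definition, the following Python sketch tests the heap order property for a tree stored in an array as described in section C. The function name is_min_heap is an assumption made for this example. Note that an array with every slot from 1 to n filled always represents a complete tree, so only the order property needs checking; a tree like the one in example c cannot be stored this way without leaving a gap.

```python
def is_min_heap(a):
    """Return True if a[1..n] satisfies the (min)heap order property.

    a[0] is unused, matching the slot numbering used in the lecture.
    """
    n = len(a) - 1
    # Every node except the root must be >= its parent (slot m // 2).
    return all(a[m // 2] <= a[m] for m in range(2, n + 1))


fruit = [None, "APPLE", "BANANA", "CHERRY", "DOGWOOD"]
animals = [None, "CAT", "EEL", "AARDVARK", "ZEBRA", "RACCOON", "FOX", "SNAKE"]
```

Here is_min_heap(fruit) is true, while is_min_heap(animals) is false because AARDVARK is smaller than its parent CAT.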
II. Maintaining a heap
-- ------------------
A. We now consider the basic strategy for maintaining a heap. We
need to support two basic operations:
1. Construction: inserting new items into the heap either incrementally,
or en masse (creating a heap from scratch from a mass of data.)
a. Incrementally, each insertion can be done in O(log n) time.
b. En masse, the whole heap can be built in O(n) time, i.e. amortized
O(1) time per item.
2. Removing the item with smallest value from the heap. (Finding
it is easy - it is always the top of the heap - what's a bit
more complicated is replacing it with the next smallest value.)
This can be done in O(log n) time.
3. We do NOT consider an operation for removing a SPECIFIC item
from the heap. As it turns out, such an operation is not needed
for the uses of heaps we will discuss, and would take O(n)
time just to FIND the specific item, since a heap is not intended
as a search structure.
4. Three preliminary remarks:
a. We represent the heap by a data structure consisting of a count
of the number of items currently in the heap (n) and an array
of actual items (in slots [1] .. [n]). We assume that the array
has additional space available for adding new items - so to
add an item we can increment n, which makes what was slot [n+1]
part of the heap, and then adjust the information in the heap
appropriately.
b. Because a heap is a complete binary tree, we know that its height
is floor(log n) + 1 (log base 2), since 2^(h-1) <= n. Hence, any
operation that performs at most one step at each level in the tree
takes time O(log n)
c. The algorithms I'm presenting differ in some details from the
ones in the book, but are essentially the same.
B. Constructing a Heap
1. The strategy for incremental construction is this: to add a new
node to a heap:
a. Declare slot n+1 to be part of the heap. Call this the vacant
slot.
b. Perform the following operation repeatedly:
i. Consider the parent of the vacant slot (slot (vacant slot) / 2).
If the parent does not exist (vacant slot is 1) or the current
contents of the parent slot <= the new item, quit this loop.
ii. Otherwise, move the contents of the parent slot into the vacant
slot and declare the parent slot to be the vacant slot
c. When the loop is done, insert the new entry in the vacant slot.
d. Example: Add 3 to the following heap
            1
        4       2
      7   5   10  9
     8
- Initially, vacant slot is right child of 7.
            1
        4       2
      7   5   10  9
     8 _
- Since 7 > 3, move 7 into the vacant slot and declare its slot
the vacant slot.
            1
        4       2
      _   5   10  9
     8 7
- 4 is the parent of the new vacant slot. Since 4 > 3, move
4 into the vacant slot and declare its slot vacant.
            1
        _       2
      4   5   10  9
     8 7
- 1 is the parent of the new vacant slot. Since 1 <= 3, stop.
- Put 3 into the vacant slot
            1
        3       2
      4   5   10  9
     8 7
e. Clearly, this process is O(h) = O(log n)
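The incremental insertion steps a through c can be sketched in Python as follows. This is one possible rendering rather than the book's code; the name heap_insert is an assumption, and a Python list with slot 0 unused stands in for the array plus count n.

```python
def heap_insert(heap, item):
    """Insert item into a min-heap stored in heap[1..n] (heap[0] unused)."""
    # Step a: declare slot n+1 part of the heap; it is the vacant slot.
    # (The value appended here is just a placeholder.)
    heap.append(item)
    vacant = len(heap) - 1
    # Step b: while a parent exists and is larger than the new item,
    # move the parent's contents down and make its slot the vacant one.
    while vacant > 1 and heap[vacant // 2] > item:
        heap[vacant] = heap[vacant // 2]
        vacant //= 2
    # Step c: put the new item in the vacant slot.
    heap[vacant] = item


heap = [None, 1, 4, 2, 7, 5, 10, 9, 8]
heap_insert(heap, 3)
# heap is now [None, 1, 3, 2, 4, 5, 10, 9, 8, 7], as in the example above
```

The loop visits at most one slot per level, which is where the O(log n) bound comes from.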
2. If we have all the entries available to us at the outset, we can
build the heap more efficiently as follows:
a. Initially just put the entries into the array representation
in any order. The result, viewed as a binary tree, will satisfy
the heap structure property, but not the heap order property.
b. Convert this to a structure satisfying the heap order property -
the algorithm for this is given in section 8.3.6 of the book
(where it is called bottom-up heap construction.)
c. The book gives an analysis that shows that the cost of building the
entire heap this way is O(n), which makes the amortized cost per
entry O(1).
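For reference, here is a Python sketch of one common array-based way to build a heap in O(n) time (usually attributed to Floyd): work backward from the last node that has a child, pushing each item down until the subtree below it is in heap order. It is offered as an illustration only and may differ in detail from the book's presentation of bottom-up heap construction; the names sift_down and build_heap are assumptions.

```python
def sift_down(heap, start, n):
    """Restore heap order in the subtree rooted at slot start.

    Assumes both subtrees of start are already heaps; heap[0] is unused
    and n is the number of items in the heap.
    """
    item = heap[start]
    vacant = start
    while 2 * vacant <= n:                 # vacant slot has a child
        child = 2 * vacant
        if child + 1 <= n and heap[child + 1] < heap[child]:
            child += 1                     # right child is the smaller one
        if item <= heap[child]:
            break
        heap[vacant] = heap[child]         # move the smaller child up
        vacant = child
    heap[vacant] = item


def build_heap(items):
    """Return a min-heap (slot 0 unused) containing all of items."""
    heap = [None] + list(items)            # structure property holds already
    n = len(heap) - 1
    # Slots n//2 + 1 .. n are leaves, so start at the last non-leaf.
    for m in range(n // 2, 0, -1):
        sift_down(heap, m, n)
    return heap
```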
C. Removing the minimum item from a heap (removeMin)
1. The algorithm is similar to that for incremental construction.
Since the minimum item is to be removed from the heap, we
consider its slot (the root) to be vacant. Likewise, since
the size of the heap is being reduced from n to n-1, we must find a
new home for the item currently in slot n (the displaced item).
a. Perform the following process repeatedly:
i. Consider the child or children of the vacant slot (slots
2 * (vacant slot) and 2 * (vacant slot) + 1).
- If neither is part of the heap (2 * (vacant slot) > new heap
size), quit this loop.
- If there are two children, consider the child item with
the smaller value; if there is only one child, consider it. We
call the slot where this item occurs the child slot.
- If the displaced item is <= this child item, quit this loop.
ii. Otherwise, move the child item into the vacant slot, and
consider the child slot to be the new vacant slot.
b. When the loop is done, put the displaced item in the vacant slot.
2. Example: Remove the smallest item from the following heap:
            1
        3       2
      4   5   10  9
     8 7
- Initially, the displaced item is 7. The vacant slot is the
one that contained 1
            _                  Displaced item = 7
        3       2
      4   5   10  9
     8                         (Note that the slot that contained
                               7 is no longer considered part of
                               the heap)
- Since 2 is the smaller child of the vacant slot, and 7 > 2,
move 2 into the vacant slot and make its slot the new vacant
slot.
            2                  Displaced item = 7
        3       _
      4   5   10  9
     8
- Since 9 is the smaller child of the vacant slot, and 7 <= 9,
stop. Put the displaced item - 7 - into the vacant slot
            2
        3       7
      4   5   10  9
     8
3. Clearly, this process is O(h) = O(log n) - Why?
ASK
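The removeMin procedure can be sketched in Python in the same style. Again this is an illustrative rendering under the lecture's slot numbering (slot 0 unused), not the book's code, and the name remove_min is an assumption.

```python
def remove_min(heap):
    """Remove and return the smallest item of a min-heap in heap[1..n]."""
    if len(heap) < 2:
        raise IndexError("remove_min from an empty heap")
    smallest = heap[1]              # the root's slot becomes vacant
    displaced = heap.pop()          # item from slot n; size is now n-1
    n = len(heap) - 1               # new heap size
    if n == 0:
        return smallest             # the heap held only one item
    vacant = 1
    while 2 * vacant <= n:          # at least one child is in the heap
        child = 2 * vacant
        if child + 1 <= n and heap[child + 1] < heap[child]:
            child += 1              # pick the smaller of two children
        if displaced <= heap[child]:
            break
        heap[vacant] = heap[child]  # move the child item up
        vacant = child
    heap[vacant] = displaced
    return smallest


heap = [None, 1, 3, 2, 4, 5, 10, 9, 8, 7]
smallest = remove_min(heap)
# smallest is 1 and heap is [None, 2, 3, 7, 4, 5, 10, 9, 8], as in the example
```

As with insertion, the loop moves down one level per iteration, so the work is proportional to the height of the tree.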
III. Uses for Heaps
--- ---- --- -----
A. One use discussed in the book is to represent a priority queue.
(We assume here that smaller numbers mean higher priority - e.g.
"priority 1" beats "priority 2").
1. A priority queue is often used in conjunction with some kind of
server that provides services on a priority basis - e.g.
a. A priority CPU scheduler in an operating system assigns the CPU to
the process with the smallest priority value.
b. The scheduler associated with a print queue might print the
shortest job (in terms of number of pages) first.
2. The principal operation a priority queue needs to support is to
find the entry with the smallest priority value and remove it from
the queue. (The book calls this removeMin).
3. Note that, with a heap based on priority values, the smallest
value is always found "at the top of the heap". Because we can
remove this entry and replace it with the one having the next
smallest priority value easily (as we have just seen) we can use
a heap as a priority queue.
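As a small illustration, Python's standard heapq module maintains a list as a minheap and can serve as a priority queue. Note that heapq numbers slots from 0 rather than 1, so its index arithmetic differs from the scheme in section I, but the idea is the same. The function name and the job data below are made up for the example.

```python
import heapq


def serve_in_priority_order(jobs):
    """jobs is a list of (priority, name) pairs; smaller priority goes first.

    Returns the job names in the order a priority-based server would
    handle them.
    """
    queue = []
    for priority, name in jobs:
        heapq.heappush(queue, (priority, name))   # O(log n) insertion
    served = []
    while queue:
        priority, name = heapq.heappop(queue)     # removeMin, O(log n)
        served.append(name)
    return served


order = serve_in_priority_order([(3, "backup"), (1, "login"), (2, "print")])
# order is ["login", "print", "backup"]
```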
B. Another use of heaps is in event-driven simulations of some system.
1. Example: simulate the operation of a bank. Events are
a. New customer arrives and gets in line
b. Finish processing a customer transaction
2. The heart of such a simulation is the "event list" which maintains
a list of simulated events in the order in which they occur.
3. The principal operation the event list needs to support is the
ability to find the next event that is scheduled to occur and
remove it from the event list.
Again, a heap based on the scheduled time for events works well
for this.
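A very small sketch of such a simulation is shown below, again using Python's heapq module for the event list. Everything specific here is an assumption made for illustration: a single teller, a fixed service time, and the names simulate_bank, "arrive" and "finish".

```python
import heapq


def simulate_bank(arrival_times, service_time):
    """Simulate a one-teller bank; return the (time, event) pairs in order."""
    # The event list is a heap keyed on the scheduled time of each event.
    events = [(t, "arrive") for t in arrival_times]
    heapq.heapify(events)
    log = []
    waiting = 0            # customers in line, not counting the one served
    teller_busy = False
    while events:
        time, kind = heapq.heappop(events)      # next event to occur
        log.append((time, kind))
        if kind == "arrive":
            if teller_busy:
                waiting += 1                    # get in line
            else:
                teller_busy = True
                heapq.heappush(events, (time + service_time, "finish"))
        else:                                   # a transaction finished
            if waiting > 0:
                waiting -= 1                    # next customer steps up
                heapq.heappush(events, (time + service_time, "finish"))
            else:
                teller_busy = False
    return log
```

New events are scheduled while the simulation runs, which is why a structure supporting both insertion and removeMin in O(log n) time is a good fit.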