CPS222 Lecture: Threaded Binary Trees last revised 1/25/2013
Objectives:
1. To introduce inorder-threading of binary trees, and to show how inthreaded
trees can be traversed easily.
2. To discuss threading schemes based on preorder or postorder traversal
instead of inorder
Materials:
1. Code for recursive and non-recursive versions of inorder traversal
(excerpted from prior lecture) to project
2. Transparency + Handout of a completely in-threaded binary tree
I. Inorder Threading of Binary Trees
- ------- --------- -- ------ -----
A. When we talked about preorder, inorder, and postorder traversal of binary
trees earlier, we saw that a stack is always needed to accomplish the
traversal - either explicitly or implicitly because of recursion.
1. PROJECT code for inorder traversal (recursive and non-recursive)
excerpted from earlier lecture.
Observe that the non-recursive version uses a stack that can grow as
large as the height of the tree; the recursive version uses an
implicit stack of the same size.
2. Since these traversals are used often, we would like to avoid the
space and time overhead of the stack if we can. It turns out that there
is a simple tree representation that lets us do this, and that also
lets us define an iterator for the tree that moves easily from one
node to the next in the appropriate order.
B. Consider inorder traversal. We will define the inorder successor of
a node n as being the next node that will be visited in doing an
inorder traversal of the tree - or some sentinel value (e.g. NULL or
a pointer back to the header node if there is one) if the node is the
last one visited in inorder.
1. Suppose we were able to define a function
/* Return a pointer to the inorder successor of p */
Node * insucc(Node * p);
Suppose, further, that we arranged for there to be a header node for
the tree, whose insucc is the first node in the inorder traversal
order. If so, we could implement inorder traversal of a tree as
follows, without the use of a stack or recursion:
p = insucc(header);
while (p != header)
{ // Do whatever it means to visit the data at this node
p = insucc(p);
}
2. If a node has a non-NULL right child, insucc is easy to define:
Node * c = p -> _rchild;
while (c -> _lchild != NULL)
c = c -> _lchild;
return c;
3. However, if a node has a NULL right child, then its inorder successor
is "above" it in the tree. This is what the stack does for us - note
that, in the non-recursive inorder traversal algorithm, when
p -> _rchild is NULL then we fall through the while loop and pop the
stack again, getting the node that was the parent of p. If its right
child is also NULL we end up popping the stack again, going further
up the tree.
4. We have also noted that a binary tree with n nodes contains n+1 NULL
pointers. It would be nice to do something useful with these. One
thing we could do with a NULL rchild pointer is to use it to point to
the inorder successor of the node. We will call such a pointer a
thread, and a tree containing such pointers a right-inthreaded binary
tree.
5. Of course, we must have some way of tagging the pointers to distinguish
between a regular child pointer and a thread. Since this requires just
one bit, it can generally be done at no additional cost by using a bit
somewhere that is otherwise unused.
a. For example, on some machines a pointer must be even, since words
in memory begin on even address boundaries. Therefore, the
low-order bit of a pointer must be zero. We can differentiate
threads from regular pointers by setting this bit to 1.
b. On most machines, the number of bits used to store an address far
exceeds the number needed to represent the range of addresses
needed for the physical memory installed; hence, the high order
bit is normally 0. We can differentiate threads from regular
pointers by setting this bit to 1.
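The low-order-bit variant of this tagging can be sketched as follows. This is only an illustration under stated assumptions: the Node layout and the helper makethread are hypothetical (the lecture names only isthread and makepointer), and converting a pointer to an integer and back is implementation-defined in C++, although it behaves as expected on common platforms.

```cpp
#include <cassert>
#include <cstdint>

struct Node {
    int key;
    Node* _lchild;
    Node* _rchild;
};

// A Node holds pointers, so its alignment is at least 2 and the low-order
// bit of any real Node address is 0. That bit is free to use as the tag.
static_assert(alignof(Node) >= 2, "low-order bit of a Node* must be free");

const std::uintptr_t THREAD_BIT = 1;

// True if the stored pointer value carries the thread tag.
inline bool isthread(const Node* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & THREAD_BIT) != 0;
}

// Hypothetical helper (not in the lecture): tag a real pointer as a thread.
inline Node* makethread(Node* p) {
    assert(!isthread(p));
    return reinterpret_cast<Node*>(
        reinterpret_cast<std::uintptr_t>(p) | THREAD_BIT);
}

// Clear the tag so the value can be dereferenced like a regular pointer.
inline Node* makepointer(Node* p) {
    return reinterpret_cast<Node*>(
        reinterpret_cast<std::uintptr_t>(p) & ~THREAD_BIT);
}
```

A tagged value must never be dereferenced directly; it always goes through makepointer first.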
C. We can now implement insucc - and hence inorder traversal - as follows:
Node * insucc(Node * p)
/* returns a pointer to the inorder successor of p */
{
if (isthread(p -> _rchild))
return makepointer(p -> _rchild);
else
{ Node * c = p -> _rchild;
while (c -> _lchild != NULL)
c = c -> _lchild;
return c;
}
}
- where isthread tests the extra bit of a pointer to see if it is a
regular pointer or a thread, and makepointer clears the extra bit
so that the thread can be used like a pointer.
1. What is the time efficiency of this algorithm? Clearly, any one
application of insucc can require time proportional to the height
of the tree. But what is of more interest is the average cost per
call when insucc is applied n times in order to visit all the nodes
of the tree. We call this average cost per use, taken over the whole
sequence of calls, the AMORTIZED COST.
2. Note that a tree of n nodes contains n lchild pointers and n rchild
pointers. In the process of traversing the tree, insucc follows each
non-NULL lchild pointer exactly once, and each rchild pointer (normal
or thread) exactly once, so at most 2n pointers are followed in all.
Therefore, the total time for traversing a right-inthreaded tree of
n nodes in inorder is O(n)! This, of course, is optimal - since we
must visit all n nodes.
3. From this, it follows that the amortized cost of one use is O(n/n) =
O(1).
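The counting argument above can be checked on a small example. The sketch below is an assumption-laden illustration, not the lecture's code: it uses a bool flag (rthread) in place of the tagged bit, uses nullptr rather than a header as the end sentinel, and the helpers make, attachLeft, attachRight, leftmost, and buildSample (a six-node tree with keys 1..6) are hypothetical. The global counter steps records every pointer followed.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;   // real child or nullptr (no left threads here)
    Node* rchild = nullptr;   // real child, or a thread if rthread is set
    bool rthread = true;      // a new node has no right child yet
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;        // owns the nodes; growing a deque at the end
                              // leaves existing elements in place
long steps = 0;               // number of pointers followed so far

Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

// Attach a leaf as left child: its inorder successor is its parent.
void attachLeft(Node* parent, Node* n) {
    assert(parent->lchild == nullptr);
    n->rchild = parent;
    parent->lchild = n;
}

// Attach a leaf as right child: it inherits the parent's old thread.
void attachRight(Node* parent, Node* n) {
    assert(parent->rthread);
    n->rchild = parent->rchild;
    parent->rchild = n;
    parent->rthread = false;
}

Node* leftmost(Node* c) {
    while (c->lchild != nullptr) { c = c->lchild; ++steps; }
    return c;
}

Node* insucc(Node* p) {
    ++steps;                              // p->rchild is followed either way
    if (p->rthread) return p->rchild;     // thread (nullptr after last node)
    return leftmost(p->rchild);
}

std::vector<int> inorder(Node* root) {
    std::vector<int> out;
    for (Node* p = leftmost(root); p != nullptr; p = insucc(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* n4 = make(4);
    Node* n2 = make(2);
    Node* n6 = make(6);
    attachLeft(n4, n2);
    attachRight(n4, n6);
    attachLeft(n2, make(1));
    attachRight(n2, make(3));
    attachLeft(n6, make(5));
    return n4;
}
```

After resetting steps to 0 and calling inorder on the sample tree, steps stays within the 2n bound (12 for n = 6), which is what makes the amortized cost of one insucc call O(1).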
D. Note that this trick only made use of the NULL rchild pointers. What
about the NULL lchild pointers? Suppose we define inpred as the
inorder predecessor. By symmetry, it turns out that inpred looks like
insucc with lchild and rchild pointers interchanged. Thus, we can
replace NULL lchild pointers by threads to the inorder predecessor. If we
do so, then we can perform inorder traversal in either direction without
the use of recursion or a stack.
1. Such a tree is called completely inthreaded.
2. A tree which contained only left-threads would allow reverse inorder
traversal only. Such a tree is called left-inthreaded.
3. Note that threading is possible with any kind of binary tree. (We
will deal with threading of a binary search tree for a project, but
a threaded tree does not have to be a binary search tree.)
E. A completely inthreaded binary tree might look like the following.
TRANSPARENCY + HANDOUT
Note that we make use of a header node to simplify some of the algorithms
to follow. The header convention is this:
1. If the tree is empty, then the header's left child is a thread back
to the header. Otherwise, it points to the root of the tree.
2. The header's right child is a pointer (not a thread) to itself.
3. The first node (in inorder) in the tree has an lchild thread back to
the header. (Note that our insertion algorithm will ensure this.)
Likewise, the last node has an rchild thread back to the header.
(Insert will also do this.)
4. Note how this choice causes our insucc algorithm, when applied to the
header, to yield the first node of the tree. Our inpred algorithm
also works correctly. Finally, both algorithms return a pointer to
the header when applied to the first/last node in the tree (as the
case may be.)
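The header convention can be sketched as follows, assuming bool flags (lthread, rthread) stand in for the tagged pointer bit; the Node layout and the helper initHeader are hypothetical. In a completely inthreaded tree no child pointer is NULL, so the loops test the thread flags instead.

```cpp
#include <cassert>

struct Node {
    int key;
    Node* lchild;
    Node* rchild;
    bool lthread;   // true: lchild is a thread to the inorder predecessor
    bool rthread;   // true: rchild is a thread to the inorder successor
};

// Set up the header of an empty tree.
void initHeader(Node& h) {
    h.key = 0;
    h.lchild = &h;  h.lthread = true;   // empty tree: left thread to itself
    h.rchild = &h;  h.rthread = false;  // right child: real pointer to itself
}

Node* insucc(Node* p) {
    assert(p != nullptr);
    if (p->rthread) return p->rchild;
    Node* c = p->rchild;                // for the header, this is the header
    while (!c->lthread) c = c->lchild;  // so we descend from the root
    return c;
}

Node* inpred(Node* p) {
    assert(p != nullptr);
    if (p->lthread) return p->lchild;
    Node* c = p->lchild;
    while (!c->rthread) c = c->rchild;
    return c;
}
```

On an empty tree both functions map the header to itself, so a loop of the form "p = insucc(header); while (p != header)" simply does nothing.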
F. How can we build such a tree? If we always insert new nodes in place
of previously NULL pointers, then the following approach will work:
1. If the new node is the lchild of its parent, then it lies between
its parent's inpred and its parent in inorder traversal. Therefore,
let the lchild of the new node be the original lchild (thread) of
the parent, and let the rchild of the new node be a thread to its
parent.
2. If the new node is the rchild of its parent, then it lies between
its parent and its parent's insucc in inorder traversal. Therefore,
let the rchild of the new node be the original rchild (thread) of
the parent, and let the lchild of the new node be a thread to its
parent.
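The two insertion rules can be sketched as below, again assuming bool thread flags instead of a tagged bit. The names insertLeft, insertRight, make, makeHeader, buildSample, keysForward, and keysBackward are hypothetical, and the six-node sample tree is made up for illustration (it is not the handout tree).

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    bool lthread = true;    // true: lchild is a thread
    bool rthread = true;    // true: rchild is a thread
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;      // owns every node
Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

Node* makeHeader() {
    Node* h = make(0);
    h->lchild = h;                       // empty tree: left thread to itself
    h->rchild = h;
    h->rthread = false;                  // right child: real pointer to itself
    return h;
}

// Rule 1: n takes over par's old left thread and threads right to par.
void insertLeft(Node* par, Node* n) {
    assert(par->lthread);
    n->lchild = par->lchild;
    n->rchild = par;
    par->lchild = n;
    par->lthread = false;
}

// Rule 2: n takes over par's old right thread and threads left to par.
void insertRight(Node* par, Node* n) {
    assert(par->rthread);
    n->rchild = par->rchild;
    n->lchild = par;
    par->rchild = n;
    par->rthread = false;
}

Node* insucc(Node* p) {
    if (p->rthread) return p->rchild;
    Node* c = p->rchild;
    while (!c->lthread) c = c->lchild;
    return c;
}

Node* inpred(Node* p) {
    if (p->lthread) return p->lchild;
    Node* c = p->lchild;
    while (!c->rthread) c = c->rchild;
    return c;
}

std::vector<int> keysForward(Node* header) {
    std::vector<int> out;
    for (Node* p = insucc(header); p != header; p = insucc(p))
        out.push_back(p->key);
    return out;
}

std::vector<int> keysBackward(Node* header) {
    std::vector<int> out;
    for (Node* p = inpred(header); p != header; p = inpred(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* h = makeHeader();
    Node* n4 = make(4);
    Node* n2 = make(2);
    Node* n6 = make(6);
    insertLeft(h, n4);                   // the root is the header's lchild
    insertLeft(n4, n2);
    insertRight(n4, n6);
    insertLeft(n2, make(1));
    insertRight(n2, make(3));
    insertLeft(n6, make(5));
    return h;
}
```

Inserting the root with insertLeft(header, root) gives it threads to the header on both sides, which is exactly what the header convention requires of the first and last nodes.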
G. As a further consideration, note that while threads as we have
implemented them are based on inorder traversal, they can assist the
other traversals as well:
1. Preorder - define the function presucc. Note that:
a. If a node has an lchild, then its lchild is its presucc.
Ex: node 1 in diagram.
b. Otherwise, if it has an rchild, then its rchild is its presucc.
Ex: node 19
c. If it has no children, then it is the last node to be visited in
preorder in the left subtree of some node Q. Let Q be the nearest
such node having a non-empty right child. (If all else fails, the
header qualifies.) Then Q's rchild is the presucc.
Ex: node 17 - Q is 2, presucc is 5
node 27 - Q is 1, presucc is 3
d. From a node P having no actual rchild, this node Q can be found as
follows:
i. Follow P's rchild thread to a node above it. Clearly, P is in
the left subtree of this node. If this node has a non-thread
rchild, then it is node Q.
ii. If this node's rchild is a thread, then repeat the process
as many times as necessary until a node is found having a
non-thread rchild. This is node Q.
iii. Having found this node Q, P's presucc is Q's rchild.
Time complexity for a complete traversal: note that each non-thread
lchild is followed exactly once, and that each rchild is followed
exactly once - therefore, the traversal is O(n), and the amortized cost
of presucc is O(1).
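A possible rendering of presucc on an inorder-threaded tree follows. It assumes bool thread flags rather than a tagged bit, and the header convention described earlier; the scaffolding (make, makeHeader, insertLeft, insertRight, buildSample, preorderKeys) and the sample tree are hypothetical and are not the handout tree.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    bool lthread = true;    // true: lchild is an inorder-predecessor thread
    bool rthread = true;    // true: rchild is an inorder-successor thread
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;      // owns every node
Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

Node* makeHeader() {
    Node* h = make(0);
    h->lchild = h;                       // empty tree: left thread to itself
    h->rchild = h;
    h->rthread = false;                  // right child: real pointer to itself
    return h;
}

void insertLeft(Node* par, Node* n) {    // n replaces par's left thread
    assert(par->lthread);
    n->lchild = par->lchild;
    n->rchild = par;
    par->lchild = n;
    par->lthread = false;
}

void insertRight(Node* par, Node* n) {   // n replaces par's right thread
    assert(par->rthread);
    n->rchild = par->rchild;
    n->lchild = par;
    par->rchild = n;
    par->rthread = false;
}

Node* presucc(Node* p) {
    if (!p->lthread) return p->lchild;   // case a: real left child
    if (!p->rthread) return p->rchild;   // case b: real right child
    Node* q = p->rchild;                 // case c: climb along right threads
    while (q->rthread) q = q->rchild;    // until Q has a real right child
    return q->rchild;                    // for the header, this is the header
}

std::vector<int> preorderKeys(Node* header) {
    std::vector<int> out;
    for (Node* p = presucc(header); p != header; p = presucc(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* h = makeHeader();
    Node* n4 = make(4);
    Node* n2 = make(2);
    Node* n6 = make(6);
    insertLeft(h, n4);
    insertLeft(n4, n2);
    insertRight(n4, n6);
    insertLeft(n2, make(1));
    insertRight(n2, make(3));
    insertLeft(n6, make(5));
    return h;
}
```

Because the header's right child is a real pointer to itself, the climb in case c always stops, and the last node in preorder leads back to the header without a special case.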
2. Reverse preorder - define the function prepred. This is not quite as
easy, since we must always go through the parent of the node. Note:
a. If a node is the lchild of its parent, then its parent is its
prepred. Ex: node 2.
b. If a node is the rchild of its parent and the parent has no lchild,
then the parent is the prepred. Ex: node 25.
c. Otherwise, its prepred is the last node (in preorder) in the left
subtree of its parent. This can be found by going down the left
subtree of the parent as far as possible - going right whenever
possible, otherwise left.
Ex: node 13 - prepred is 38.
d. Thus, we must first define a function parent (which is useful in
its own right and also for postsucc, it turns out.)
For any node P, there exists a nearest ancestor Q, such that
P is in its right subtree. (If all else fails, the header is
such.) We can find this node by following lchild pointers
until we have followed a thread. Then, if P is the rchild of Q,
then Q is its parent - otherwise, we follow lchild pointers in
the right subtree of Q until we hit P.
ex: node 3 - Q = 1 and is its parent
node 12 - Q = 1. Note that we can find the parent
(node 6) by going right from Q, then continuing left.
node 13 - Q = 6 and is its parent
e. Given the parent function, prepred is easily defined as discussed
above. Note that reverse preorder traversal using prepred will
not be O(n) for the whole tree, but rather O(n*h), since parent
potentially involves visiting one node on each level of the tree,
and in subsequent applications of parent the same path can be
retraced. Thus, the amortized cost of prepred is O(h) =
O(log n) if the tree is well balanced. (But O(n) worst case).
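The parent and prepred functions described above might be sketched as follows, assuming bool thread flags and the header convention from earlier. The scaffolding (make, makeHeader, insertLeft, insertRight, buildSample, reversePreorderKeys) and the sample tree are hypothetical, not the handout tree.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    bool lthread = true;    // true: lchild is an inorder-predecessor thread
    bool rthread = true;    // true: rchild is an inorder-successor thread
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;      // owns every node
Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

Node* makeHeader() {
    Node* h = make(0);
    h->lchild = h;
    h->rchild = h;
    h->rthread = false;
    return h;
}

void insertLeft(Node* par, Node* n) {
    assert(par->lthread);
    n->lchild = par->lchild;
    n->rchild = par;
    par->lchild = n;
    par->lthread = false;
}

void insertRight(Node* par, Node* n) {
    assert(par->rthread);
    n->rchild = par->rchild;
    n->lchild = par;
    par->rchild = n;
    par->rthread = false;
}

Node* parent(Node* p) {
    Node* q = p;
    while (!q->lthread) q = q->lchild;   // go left until a thread is found
    q = q->lchild;                       // following it yields Q
    if (q->rchild == p) return q;        // P is the rchild of Q
    Node* c = q->rchild;                 // otherwise go right from Q,
    while (c->lchild != p) c = c->lchild;  // then left until we hit P
    return c;
}

Node* prepred(Node* p) {
    Node* par = parent(p);
    if (par->lthread || par->lchild == p) return par;   // cases a and b
    Node* c = par->lchild;               // case c: last node in preorder
    for (;;) {                           // of the parent's left subtree
        if (!c->rthread) c = c->rchild;
        else if (!c->lthread) c = c->lchild;
        else return c;
    }
}

std::vector<int> reversePreorderKeys(Node* header) {
    std::vector<int> out;
    for (Node* p = prepred(header); p != header; p = prepred(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* h = makeHeader();
    Node* n4 = make(4);
    Node* n2 = make(2);
    Node* n6 = make(6);
    insertLeft(h, n4);
    insertLeft(n4, n2);
    insertRight(n4, n6);
    insertLeft(n2, make(1));
    insertRight(n2, make(3));
    insertLeft(n6, make(5));
    return h;
}
```

In this sketch parent(header) returns the header itself, so prepred(header) falls into case c and yields the last node in preorder. Each call to parent may walk a whole root-to-node path, which is where the O(h) amortized cost comes from.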
3. Postorder traversal - define a function postsucc.
a. By symmetry, this turns out to be similar to prepred, but with
the roles of lchild and rchild interchanged. To find the postsucc,
we first find the parent of the node in question.
b. If the node is the rchild of its parent, or if it is the only
child of its parent, then the parent is the postsucc.
Ex: nodes 3, 24.
c. Otherwise, we find the first node in postorder in the right subtree
of the parent. This can be found by going down the subtree as far
as possible, preferring to go left whenever possible, otherwise
right.
Ex: node 2 - postsucc = 32.
d. As with prepred, postorder traversal using postsucc is O(n*h);
amortized cost of postsucc is O(h) = O(log n) if the tree is well
balanced.
e. A caution on implementation: with the previous algorithms, our
header convention has worked to our advantage to produce desired
results - e.g. we could apply inpred, insucc, prepred, or presucc
to the header and get a correct node, and in each case applying
the function to the last node would lead us back to the header.
With postsucc, some special cases are needed when leaving or
coming back to the header due to our trick of making the header
its own right child. (However, the fact that the header is its
own right child makes it easy to recognize the header.)
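One way to write postsucc, including the header special cases, is sketched below. It assumes bool thread flags and the header convention from earlier; isHeader, firstPostorder, the scaffolding helpers, and the sample tree are all hypothetical. The two special cases shown are one possible choice, not necessarily the ones intended in the lecture.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    bool lthread = true;
    bool rthread = true;
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;      // owns every node
Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

Node* makeHeader() {
    Node* h = make(0);
    h->lchild = h;
    h->rchild = h;
    h->rthread = false;
    return h;
}

void insertLeft(Node* par, Node* n) {
    assert(par->lthread);
    n->lchild = par->lchild;
    n->rchild = par;
    par->lchild = n;
    par->lthread = false;
}

void insertRight(Node* par, Node* n) {
    assert(par->rthread);
    n->rchild = par->rchild;
    n->lchild = par;
    par->rchild = n;
    par->rthread = false;
}

// Only the header is its own (non-thread) right child.
bool isHeader(const Node* n) { return !n->rthread && n->rchild == n; }

Node* parent(Node* p) {
    Node* q = p;
    while (!q->lthread) q = q->lchild;
    q = q->lchild;
    if (q->rchild == p) return q;
    Node* c = q->rchild;
    while (c->lchild != p) c = c->lchild;
    return c;
}

// First node in postorder of the subtree rooted at c: prefer left, else right.
Node* firstPostorder(Node* c) {
    for (;;) {
        if (!c->lthread) c = c->lchild;
        else if (!c->rthread) c = c->rchild;
        else return c;
    }
}

Node* postsucc(Node* p) {
    if (isHeader(p))                     // special case: leaving the header
        return p->lthread ? p : firstPostorder(p->lchild);
    Node* par = parent(p);
    if (isHeader(par)) return par;       // special case: p is the root
    if (par->rchild == p || par->rthread) return par;   // case b
    return firstPostorder(par->rchild);                 // case c
}

std::vector<int> postorderKeys(Node* header) {
    std::vector<int> out;
    for (Node* p = postsucc(header); p != header; p = postsucc(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* h = makeHeader();
    Node* n4 = make(4);
    Node* n2 = make(2);
    Node* n6 = make(6);
    insertLeft(h, n4);
    insertLeft(n4, n2);
    insertRight(n4, n6);
    insertLeft(n2, make(1));
    insertRight(n2, make(3));
    insertLeft(n6, make(5));
    return h;
}
```

Without the isHeader(par) test, the root would be treated as a left child whose parent has a right subtree, and the search would wrongly descend from the header again.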
4. Reverse postorder traversal - define a function postpred.
a. By symmetry, this is analogous to presucc, but with lchild and
rchild roles reversed. Reverse postorder traversal using postpred
is therefore O(n), so the amortized cost of postpred is O(1).
b. As with postsucc, some special cases are needed around the
header.
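A sketch of postpred, mirroring presucc with the roles of lchild and rchild reversed, might look like this. It assumes bool thread flags and the header convention from earlier; isHeader, the scaffolding helpers, and the sample tree are hypothetical, and the header handling is one possible choice.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    bool lthread = true;
    bool rthread = true;
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;      // owns every node
Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

Node* makeHeader() {
    Node* h = make(0);
    h->lchild = h;
    h->rchild = h;
    h->rthread = false;
    return h;
}

void insertLeft(Node* par, Node* n) {
    assert(par->lthread);
    n->lchild = par->lchild;
    n->rchild = par;
    par->lchild = n;
    par->lthread = false;
}

void insertRight(Node* par, Node* n) {
    assert(par->rthread);
    n->rchild = par->rchild;
    n->lchild = par;
    par->rchild = n;
    par->rthread = false;
}

// Only the header is its own (non-thread) right child.
bool isHeader(const Node* n) { return !n->rthread && n->rchild == n; }

Node* postpred(Node* p) {
    if (isHeader(p))                     // special case: the last node in
        return p->lthread ? p : p->lchild;   // postorder is the root
    if (!p->rthread) return p->rchild;   // real right child
    if (!p->lthread) return p->lchild;   // else real left child
    Node* q = p->lchild;                 // else climb along left threads
    while (!isHeader(q) && q->lthread) q = q->lchild;
    return isHeader(q) ? q : q->lchild;  // special case: back at the header
}

std::vector<int> reversePostorderKeys(Node* header) {
    std::vector<int> out;
    for (Node* p = postpred(header); p != header; p = postpred(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* h = makeHeader();
    Node* n4 = make(4);
    Node* n2 = make(2);
    Node* n6 = make(6);
    insertLeft(h, n4);
    insertLeft(n4, n2);
    insertRight(n4, n6);
    insertLeft(n2, make(1));
    insertRight(n2, make(3));
    insertLeft(n6, make(5));
    return h;
}
```

The header needs explicit tests here because its real right pointer to itself and its real left pointer to the root would otherwise be mistaken for ordinary children.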
II. Preorder and postorder threading
-- -------- --- --------- ---------
A. The threading scheme we have discussed has been based on the inorder
traversal of the tree. However, as we have seen, the inorder threads
can also be used to accomplish other traversals (though not necessarily
in O(1) amortized time.)
B. If some other traversal is going to be used regularly instead of inorder,
then an alternate threading scheme might be considered. We could, for
example, base a scheme on pre-order:
1. We might build a threading scheme on the fact that if a node has a
left child, then that child is its preorder successor. If it has no
left child, then its lchild pointer could be made into a thread to
its pre-order successor. (In this case, the rchild pointer is used
as in an unthreaded tree.) Presucc now becomes simply:
if (! isthread(p -> _lchild))
return p -> _lchild;
else
return makepointer(p -> _lchild);
2. Alternately, we could adopt the following scheme for pre-order:
a. If the node has no lchild, then make its lchild pointer a thread
to its pre-order predecessor.
b. If the node has no rchild, then make its rchild pointer a thread
to its pre-order successor.
c. This scheme, like the previous one, makes forward pre-order
traversal fairly easy:
if (! isthread(p -> _lchild))
return p -> _lchild;
else if (! isthread(p -> _rchild))
return p -> _rchild;
else
return makepointer(p -> _rchild);
d. Reverse pre-order is also possible with this scheme, THOUGH WE WOULD
OCCASIONALLY HAVE TO GO TO THE HEADER AND APPLY PRESUCC REPEATEDLY.
(This is because a node's prepred is never below it in the tree.)
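The second pre-order scheme can be tried out with the sketch below. It makes several assumptions: bool flags stand in for the tagged bit, nullptr (rather than a header) marks both ends, and the threads are installed in one pass over an ordinary tree by the hypothetical helper threadPreorder, which is not how a tree would normally be maintained under insertion. The helpers make, collectPreorder, buildSample, and preorderKeys are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Node {
    int key;
    Node* lchild = nullptr;
    Node* rchild = nullptr;
    bool lthread = false;   // true: lchild is a thread to the preorder pred
    bool rthread = false;   // true: rchild is a thread to the preorder succ
    explicit Node(int k) : key(k) {}
};

std::deque<Node> pool;      // owns every node
Node* make(int k) { pool.emplace_back(k); return &pool.back(); }

// Ordinary recursive preorder, used only to set the threads up.
void collectPreorder(Node* n, std::vector<Node*>& out) {
    if (n == nullptr) return;
    out.push_back(n);
    collectPreorder(n->lchild, out);
    collectPreorder(n->rchild, out);
}

// Replace every NULL child pointer by the thread the scheme calls for.
void threadPreorder(Node* root) {
    assert(root != nullptr);
    std::vector<Node*> order;
    collectPreorder(root, order);        // done before any thread is written
    for (std::size_t i = 0; i < order.size(); ++i) {
        Node* n = order[i];
        if (n->lchild == nullptr) {      // rule a: thread to predecessor
            n->lthread = true;
            n->lchild = (i > 0) ? order[i - 1] : nullptr;
        }
        if (n->rchild == nullptr) {      // rule b: thread to successor
            n->rthread = true;
            n->rchild = (i + 1 < order.size()) ? order[i + 1] : nullptr;
        }
    }
}

Node* presucc(Node* p) {
    if (!p->lthread) return p->lchild;   // real left child
    if (!p->rthread) return p->rchild;   // real right child
    return p->rchild;                    // right thread (nullptr at the end)
}

std::vector<int> preorderKeys(Node* root) {
    std::vector<int> out;
    for (Node* p = root; p != nullptr; p = presucc(p))
        out.push_back(p->key);
    return out;
}

// Sample tree: 4 at the root, children 2 and 6, leaves 1, 3, and 5.
Node* buildSample() {
    Node* n4 = make(4);
    n4->lchild = make(2);
    n4->rchild = make(6);
    n4->lchild->lchild = make(1);
    n4->lchild->rchild = make(3);
    n4->rchild->lchild = make(5);
    return n4;
}
```

With flags instead of tagged pointers the last two branches of presucc return the same field; they are kept separate to match the three cases in the text.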
C. We could also base a scheme on post-order. Unfortunately, forward
post-order will always be hard, because a node's post-order successor
is never below it in the tree. However, a scheme to support reverse
post-order would be somewhat easier!