Design and Analysis of Algorithms: Binary Search Trees

12.1 What is a binary search tree?

Binary Search Tree Property

Let x be a node in a binary search tree. If y is a node in the left subtree of x, then y.key ≤ x.key. If y is a node in the right subtree of x, then y.key ≥ x.key.

Inorder Tree Walk

                
                    Inorder-Tree-Walk(x)
                        if x != NIL
                            Inorder-Tree-Walk(x.left)
                            print x.key
                            Inorder-Tree-Walk(x.right)

Compare with:

                
                    Postorder-Tree-Walk(x)
                        if x != NIL
                            Postorder-Tree-Walk(x.left)
                            Postorder-Tree-Walk(x.right)
                            print x.key

12.2 Querying a binary search tree

Why do we pass in a node rather than the tree itself?
Note why CLRS pseudo-code for max and min contains mistake from practical point of view.
Library code versus application code.
Library must protect itself against both malicious and merely inept users. So, need to check for NIL input!

Recursive versus loop: compare the two searches.

All searching functions (search, minimum, maximum, predecessor, successor) run in O(h) time, where h is the height of the tree.

We search by starting with the root node. If it's key is equal to the key for which we are searching, we are done: that is the key we want.
On the other hand, if we have a NIL node at hand, we are also done, but have failed to find the key.
Finally, if neither is true, we check the key at hand against the key sought for.

If k < curr.key, recursively call tree search on curr.left.
Else recursively call tree search on curr.right.

Example:
Let's search for 4 in the tree above.

4 < 8; move left
4 > 3; move right
4 < 6; move left
Voila!

Running time

Worst case: Θ(n)
This occurs when the tree simply makes one long chain.

Average case: Θ(lg n)
That is because the expected height of a randomly built binary search tree is O(lg n).

Minimum and maximum

To find the minimum, we simply walk down the left side of the tree.
To find the maximum, we simply walk down the right side of the tree.

Successor

Let's walk through successor for our tree above.

Imagine we are seeking the successor of 3. The right tree of three is non-empty. So we simply seek the minimum of that tree, which is the leftmost node in the tree, in this case, 4.

On the other hand, take this tree, and start with node 10.

10 has no right child, so its successor must lie further up the tree. But it is the right hand node of its parent, so its parent can't be it. However, at the next move back a generation we move left... and that's the successor!

12.3 Insertion and deletion

Insert

Let's walk through insert for tree 1 above.

We will insert 5. (When I say "Insert x," you should take that to be shorthand for "Insert a node with key x.")

5 < 8, so we move left.
5 > 3, so we move right.
5 < 6, so we move left.
5 > 4, so we move right and insert.

Delete

Deletion is by far the most complicated coding of any of these functions. There are four cases (z is the node to delete):

z has no left child. (This handles two cases!)
Action: Replace z by its right child, which might be NIL.
z has only a left child.
Action: Replace z by its left child.
z has two children.
Action: Find z's successor y, which lies in its right sub-tree.
- If y is z's right child, replace z by y.
- Otherwise, replace y by its own right child, then replace z by y.

Walk through of delete

Let's walk through deleting 8 in the tree above.

8 has two children, so we are in case 4.
The successor of 8 is 9.
9 is not the right child of 8.
So we replace 9 by 10.
Then we replace 8 by 9.

The right half of tree 2 now looks like this:

12.4 Randomly built binary search trees

We insert the keys randomly. (We would have to have all keys at hand at the start, and then do a random shuffle on the set of keys.) The expected height of such a tree is O(lg n).

This handles the situation where we believe we might get handed input that is already sorted, which would create a worst-case scenario.

Run the Python code

In the console below, type or paste:
!git clone https://gist.github.com/gcallah/f0d36f8c107e6c1e888d58aefcb3a5aa cd f0d36f8c107e6c1e888d58aefcb3a5aa from binary_search_trees import *

Python console

Source Code

Java
Ruby
Python

For Further Study

Homework

For the first tree above, write the steps that will take place when we insert 11.
Write recursive versions of TREE-MINIMUM and TREE-MAXIMUM.
Compare the deletion of the node with key == 8 in Tree 2 that takes place following our textbook's algorithm with what the first visualizer linked to above actually does in this situation. Clearly, two different binary trees result. What algorithm is being used by the visualizer? (You could build different trees and see what it does: this is called reverse engineering.) If both algorithms always build valid BSTs, is there some reason we should prefer one algorithm or the other?
Keys are input to a binary search tree (BST) in the following order: 5, 10, 15, 20, 25. Draw the resulting BST.
Keys are input to a binary search tree (BST) in the following order: 50, 40, 35, 20, 15. Draw the resulting BST.
Keys are input in the following order: 10, 7, 15, 2, 25, 33, 1, 8, 9. Draw the resulting BST. Trace the process that occurs when inserting key 18.
Keys are input in the following order: 6, 98, 4, 3, 5, 2, 1, 13, 106, 14, 18, 80. Trace the process by which the successor of 80 will be found.