Tuesday, October 11, 2016

Self-Adjusting Heaps

I am a big fan of self-adjusting data structures. The whole idea behind self-adjusting data structures is that one can use a simple heuristic to adjust the data structure (perform rotations in a tree for example) and get superb bounds on the running time (amortized over a sequence of operations on the data structure). These data structures can be difficult to analyze but it is rather satisfying that simple ideas can lead to practical data structures. 

Pairing heaps are very easy to understand and implement and they perform really well in practice. If you are going to use Dijkstra's algorithm in production code, then you better use pairing heaps to implement it.

We are going to implement a min-heap. Here are the operations that will be used to interface with the data structure:

  • Top: Top element (or minimum).
  • Push: Insert new element.
  • Pop: Remove top element.
  • Join: Merge two heaps together.


We are going to use the left child next sibling tree representation. Each node in the tree stores a key, a pointer to its leftmost child and a pointer to its next sibling (right sibling). An empty heap is just a null pointer. With this representation, adding a new child is easy and fast.

Here is the code for what we have covered so far:

Getting the Top Element

With the above representation, getting the minimum element is easy to implement. We just return the key of the root node.

Merging Heaps Together

This operation is not so hard to implement either and we are going to use it again to implement the rest of the pairing heap interface. Let us imagine that we have two non-empty heaps $A$ and $B$. If the key of the root node of $A$ is smaller than the key of the root node of $B$ we simply add the root node of $B$ as a child of $A$ and vice versa. Note that such an operation moves the entire heap and not only the root node (we are using pointers in the representation).

Inserting a New Element

Again, this is really easy to implement. We can create a new node for the element to be inserted and merge this node with the existing heap by using the merge routine.

Removing the Top Element

So far, all of the operations were easy to implement and they are fast as well; each operations takes $\cal O(1)$ time to execute. Now removing the top element requires some re-adjusting. This is where we use a heuristic to re-adjust the tree. The heuristic we are going to use is called two-pass merge. It is actually convenient to have some figures to explain how this works. So let us imagine that our heap (or tree really) looks like the figure below.

Pairing heap before pop

The two-pass merge heuristic re-adjusts the tree by first removing the root node and then merging the children in pairs. This is a two-pass merge because we have to follow the next pointers in a first pass and then merge on the way back.

Pairing heap after pop

Recursion comes in handy here, we can implement the above idea in a single function.

This operation takes logarithmic time amortized over a sequence of operation.

Wrap Up

We have been working with nodes but it would be nice to have a type for pairing heaps themselves. This is not hard to accomplish. We add a new type for pairing heaps with its own interface. This is a clean way of doing it and in a real world implementation we can hide the node-based implementation and expose the pairing heap interface only.

This is all for this post. The entire code listing can be found here. I hope that it was a good read and as always comments and questions are welcome.