This post is one in a series of stuff formally trained programmers know – the rest of the series can be found in the series index.
Building off of the Binary Search Tree, we get the red/black tree which aims to solve the problem that a BST can become unbalanced.
The short version of a red/black tree is that is a BST with a set of rules that help us keep it performing well.
Thinking back to the BST, when we are doing the inserts and deletes at some point we need to rebalance the tree to ensure we keep that sweet O(log n) performance for searching. When is the right time to do that rebalancing?
A red/black tree extends a BST by adding one bit of information to each node; for those keeping track, our node now has the data, a key and a flag.
The flag is either red/black and there are 7 rules a BST (there are 5 official ones that relate to the colours, the first 5 below, but there are two more for the BST itself):
- Each node is either red or black
- The root node is black
- All leaves are black. These leaves can be the null points off of a node or they can be explicit nodes.
- A red node can only have black children
- Any path from a node to the leaves contains the same number of black nodes
- New nodes added are always Red.
- If the depth of a path is more than twice that of the shorted path we need to do a rotation.
So with these rules, insert & delete get more complex because you need to check these rules and, if the rules do not work you start to correct the issue. What is great with this is that because of the rules you become really efficient at correcting the problems.
So let's look at a really simple BST:
Next, let's make it a red/black tree following our rules above. I am also going to add the leave nodes in this image to help explain it but will take them out the remaining ones.
- that from the root node you need to go through 1 black node to reach a leaf node regardless of path.
- All red nodes only have black children.
- A black node can have black children.
Now we are going to add a node with value 11. It is added in the correct place and as a red node.
However 9 is red and has a red child, so we need to trigger a repaint of it to black. That causes a new issue, that from the root node to the leaves you may go through 1 black node by going left or 2 black nodes by going right, so we need to paint the parent of 9 (i.e. 7) to red. Lastly, since 7 is now red, 6 must be repainted to black. Finally, we have a correct red/black tree again.
Lastly let us add 10, which goes under 11. Immediately we note that the length from 5 to the leaf of 10 is 4 steps, while the shortest path (the left side) is 2. We know we need to do a rotation.
The rotation is easy, we take the child of the longer side (i.e. 7) and make it root and make the original root (i.e. 5) a child of it. Since 7 has two children already (5 and 9), it's original child 6 moves to under 5. Next, we just trigger a repaint, starting with the fact 7 needs to be black and you end up with the following:
This might seem really expensive to do, but since you are just changing pointers and only a few the performance of the insert becomes also O(log n).