Building a B-Tree in JavaScript

How I created an animated BTree using JavaScript (and Vue.JS)

Sebastián Fernández (sebastianfdez)
Published in
18 min readMay 4, 2020

--

Why do this?

Just for fun! A friend and I started developing a non professional web application called Structies, with the purpose of approaching people to the most popular data structures, in the way of being able to visualize their main operations and interact with them in a friendly way. The selected framework to create this site was: VueJS, so all the classes were coded in JavaScript. In this way, the idea was to build all these data structures from zero.

This article will explain in detail how I build the BTree class, starting with the general properties to then describe the main operations of the BTree followed by graphic animations and the complexity analysis of them, to end with a brief explanation of how the animations were developed thanks to Structies.

Logo of project Structies

What is a B-Tree?

One way to define B-Trees, is saying that they are like Binary Search Trees but with more keys and more than two children per node, having as main properties the characteristic of being searching trees and self-balanced. Nowadays, B-Trees are the most popular data structure, presenting a great advantage in the access to data over the time processing it. So in this way, today they are mostly used in hard drives, using a hierarchical index to minimise the number of disk reads and improve the time accessing the information.

Technically they are quite complex in comparison to other structures like heaps, but here, all its operations will be explained in detail supported with visual animations, in order to fully understand those mechanisms.

Below, you will find a picture of a B-Tree of order t=2 containing all the multiples of 10 until 140.

B-Tree of order 2 with all the keys multiple of 10 until 140 inserted. BTree created thanks to Structies

But, why BTrees and JavaScript?

JavaScript is not just used in most front-end applications, some popular backend frameworks like NodeJS or DataBases like PouchDB, are chosen by big tech companies to create powerful servers. So I hope this implementation will help you to learn how BTrees work or for some professional implementation.

Main Characteristics

  • Nodes can have multiple children and keys (even thousands!).
  • Traversing the tree from top to bottom takes O(log N).
  • All operations like search(), insert() and remove() take O(log N).
  • Keeps keys in sorted order for sequential traversing.

By keeping partially full blocks, this structure has quick insertions and deletions in comparison to the amount of data they store, so it works very well in magnetic disks.

Properties

  • Every BTree has a root (BTreeNode).
  • Every BTreeNode has the following attributes:
    - leaf (boolean): if it is a leaf or not.
    -
    values (number[]): list of all values stored in order, so:
    BTree.values[i] <= BTree.values[i+1]
    -
    children (BTreeNode[]): list of sub-tree children.
  • Every immediate pair of values in a node defines the limits of all the values in the subtree children contained in between.
// (Pseudo code)
if (key in BTreeNode.children[i+1]) {
BTreeNode.values[i] < key < BTreeNode.values[i+1]
}
  • In a BTree of order t, every BTreeNode has [t-1, 2t-1] keys and [t, 2t]children.
  • All leaves are equally deep (at the same level).

Now let’s code!

BTreeNode class

In the following fragment, the basic definition of the BTreeNode is presented, with the 4 basic operations to insert and remove values or references to child nodes. All these functions keep the array of references or values as small as possible using the JavaScript method array.splice().

  • BTreeNode.addValue(number)
  • BTreeNode.removeValue(position)
  • BTreeNode.addChildren(node, position)
  • BTreeNode.removeChildren(position)
Basic BTreeNode class

BTree class

The following class declaration covers all the necessary functions to perform the methods search, insert, and delete.

Basic BTree class definition

Searching a value in a BTree

Since BTrees have the property of searching trees, this operation is very similar to searching in binary search trees (BST). To search a value k starting from the root, the goal is to traverse the tree from top to bottom looking for the correct child in every iteration.

In the BST the algorithm goes right if the target k value is greater than the actual node value, and left if it is lower. Keeping the same logic, the BTree continues to the subtree where it should be stored the target k by comparing it with the values stored in the actual node.

/**
* Search a value in the Tree and return the node. O(log N)
* @param {number} value
* @param {BTreeNode} node
* @returns {BTreeNode}
*/
searchValue(node, value) {
if (node.values.includes(value)) {
return node;
}
if (node.leaf) {
// Value was not found
return null;
}
let child = 0;
while (child <= node.n &&
node.values[child] < parseInt(value, 10)) {
child++;
}
return this.searchValue(node.children[child]);
}

Inserting a value in a BTree

Simple case: Insert number 70. Animation created thanks to Structies

In the simplest insertion case, as it is shown by the image above, the tree is explored through its children from the root until a leaf. For each node visited, as the search method, the inserted value k has to find the correct child to continue, making a comparison with the values in the actual node. The tree is traversed until one leaf is reached. If that leaf has enough space, less than 2t — 1 values, k is inserted in the leaf.

// Try to add value {number} in node {BTreeNode}
if (node.leaf) {
node.addValue(value);
return;
}
let temp = node.n;
while (temp >= 1 && value < node.values[temp - 1]) {
temp = temp - 1;
}
// Continue searching in -> node.children[temp]

But it’s not always as easy as that. Things get a little complicated when a full node (2t — 1 values) is visited. In this case, an extra operation has to be done: the split method.

A split is basically giving one value to the parent and splitting the rest of the values and the linked children into two different new nodes at the same level. The mechanism used in this algorithm consists of giving to the parent the value in the position t — 1, in other words, the median. Similarly, the new brother created, receives the values from 0 to t — 1 and the children from 0 to t — 1.

In the image below, performing the insertion of the number k = 130, the node which contains the values [60, 80, 100] is full, so before being visited it will be split into two smaller nodes. The value in position t — 1 80 is transferred to the parent, and the rest of the values are split into the 2 new nodes. Notice that the children will be shared equally into the new nodes.

Insert number 130 doing a split in visited node. Animation created thanks to Structies

Below, the BTree definition with the insertion and split method included.

With this approach, the only case when the height of the tree increases, is when during an insertion, the root is visited and found full. So in that case, the split operation is performed and a new node is created and assigned to the root reference. In the following image, value k = 80 is inserted in a tree where the root is full.

Insert k=80 over a tree with full root. Animation created thanks to Structies

Now, the hardest part …

Removing a value in a BTree

Similar to the insertion, the delete operation has simple cases and some complex ones. The main idea of removing a value k, is recursively going through the height of the tree trying to move the target value k to the bottom, but always being sure that the next node has more than t — 1 values before visiting it, otherwise, do some operations before moving down.

Below, the delete operation will be covered in 3 different sub-cases for every step during the iteration through the height of the tree, providing examples and images in every one of each. The examples presented for this method will be all over a B-Tree of order t = 2.

  • Case A: Target is in a Leaf not minimally filled.

The simplest case is when the target value k is found in a leaf which is not minimally filled, in other words, the leaf has more than t — 1 values.

Deleting number k=80 from a leaf. Animation created thanks to Structies

As the image shows, the value k = 80 was found in a leaf with more than t — 1 values, so in this case, the value k is just removed from the node.

/**
* Delete a value from a node
* @param {BTreeNode} node
* @param {number} value
*/
deleteFromNode(node, value) {
// Check if value is in the actual node
const index = node.values.indexOf(value);
if (index >= 0) {
// Value present in the node
if (node.leaf && node.n > this.order - 1) {
// If the node is a leaf and has more than order-1 values,
// just delete it
node.removeValue(node.values.indexOf(value));
return;
}
}
...
}
  • Case B: Value k is in the actual node.

First of all, as to how it was briefly mentioned before and as to how it will be explained in Case C, the rule to continue down the tree and before visiting the next child, is mandatory that the child has to have at least t values to be visited, otherwise do some operations to satisfy this condition.

According to that, when this case reached means that the actual node contains the target value k, has more than t — 1 values and it is not a leaf, because it would be handled in Case A.

In order to continue moving the value down the tree through one of its direct children, either the left or right child of the target value k, there are 2 possible scenarios:

  1. Both left and right direct child of k have only t — 1 values:

If there is no direct child with enough values to continue the recursion, the value k has to be merged with both direct children. The way to do this is defining a target (right or left) child where the value k and all values of the other child will be inserted. Notice that it will leave the target node with (t — 1) + (t — 1) + 1 -> 2t — 1 nodes, which is the maximum authorized.

Removing node k=60. Animation created thanks to Structies

In the same way, when the merged nodes are not leaves, the children of the origin node are also transferred to the target node. In that case, merging the two immediate brothers, the result will be a node with t + t -> 2t children, which is also the maximum authorized.

...
deleteFromNode(node, value) {
// Check if value is in the actual node
const index = node.values.indexOf(value);
if (index >= 0) {
...
// Check if one children has enough values to transfer
if (node.children[index].n > this.order - 1 ||
node.children[index + 1].n > this.order - 1) {
// One of the immediate children has enough values to transfer
...
}
// No child with enough values to continue. Do a merge
this.mergeNodes(node.children[index + 1], node.children[index]);
return this.deleteFromNode(node.children[index], value);
}
...
}
/**
* Merge 2 nodes into one with the parent median value. O(1)
* @param {BTreeNode} origin
* @param {BTreeNode} target
*/
mergeNodes(origin, target) {
const indexo = origin.parent.children.indexOf(origin);
const indext = target.parent.children.indexOf(target);
target.addValue(target.parent
.removeValue(Math.min(indexo,indext)));
for (let i = origin.n - 1; i >= 0; i--) {
target.addValue(origin.removeValue(i));
}
// Remove reference to origin node
target.parent.deleteChild(indexo);
// Transfer all the children from origin node to target
if (!origin.leaf) {
while (origin.children.length) {
indexo > indext ?
target.addChild(
origin.deleteChild(0), target.children.length) :
target.addChild(
origin.deleteChild(origin.children.length-1), 0);
}
}
}

2. Left or right direct child to value k, has more than t — 1 values:
In this case, find the predecessor/successor k’ of the target value in the adjacent subtree. Then replace k by k' in the node and continue to delete k’ from the respective child recursively.

Removing node k=40. Animation created thanks to Structies

In the previous image, trying to delete the target value k = 40 in the current node, in this case the root of the tree, the successor k' = 50 was found and replaced the target value, then the recursion continued to the right child with a new target value k = 50.

...
deleteFromNode(node, value) {
// Check if value is in the actual node
const index = node.values.indexOf(value);
if (index >= 0) {
...
// Check if one children has enough values to transfer
if (node.children[index].n > this.order - 1 ||
node.children[index + 1].n > this.order - 1
) {
// One of the immediate children has enough values to transfer
if (node.children[index].n > this.order - 1) {
// Left child
// Replace the target value for the higher of left node.
// Then delete that value from the child
const predecessor =
this.getMinMaxFromSubTree(node.children[index], 1);
node.values[index] = predecessor;
return this.deleteFromNode(node.children[index],
predecessor);
}
// Right child
const successor =
this.getMinMaxFromSubTree(node.children[index+1], 0);
node.values[index] = successor;
return this.deleteFromNode(node.children[index+1], successor);
}
// No child with enough values to continue. Do a merge
this.mergeNodes(node.children[index + 1], node.children[index]);
return this.deleteFromNode(node.children[index], value);
}
...
}
/**
* Get the lower or higher value in a sub-tree
* @param {BTreeNode} node
* @param { 0 | 1 } max 1 for find max, 0 for min
* @returns {number}
*/
getMinMaxFromSubTree(node, max) {
while (!node.leaf) {
node = node.children[max ? node.n : 0];
}
return node.values[max ? node.n - 1 : 0];
}
  • Case C: (And last!) Value k is not in the actual node.

When the visited node does not include the target value k, it has to inspect the values to determine the next child node to visit. If the respective child node has more than t — 1 values, continue the operation over that node.

...
deleteFromNode(node, value) {
// Check if value is in the actual node
const index = node.values.indexOf(value);
if (index >= 0) {
...
}
// Value is not present in the node
if (node.leaf) {
// Value is not in the tree
return;
}
// Value is not present in the node, search in the children
let nextNode = 0;
while (nextNode < node.n && node.values[nextNode] < value) {
nextNode++;
}
if (node.children[nextNode].n > this.order - 1) {
// Child node has enough values to continue
return this.deleteFromNode(node.children[nextNode], value);
}
...
}

If the respective child that is supposed to have k, has not enough values to continue, it has to find one of the immediate brothers to transfer it a value. If one of the immediate brothers has more than t — 1 values, the way to do this, is transferring the median value k’ from the parent (the value that separates both nodes) to the target node, and at the same time, the closest value k’’ in the origin node to the parent. The next image illustrates this method clearly.

Removing node k=120. Animation created thanks to Structies

As an example, in the image above, trying to remove k = 120, the node containing values [110, 130] was visited. Since the next node to visit, the one in the middle containing [120], has just t — 1 values, the tree searched for one immediate brother that has an extra value that could be transferred to the target node, in this case, the node at the right. So to remove k = 120, the method transfers value k’’ = 140 from the brother to the parent and value k’ = 130 from the parent to the target node, to then continue moving down to the target node.

When involved nodes in a transfer method have children, the closest child from the origin node is transferred to the target node.

deleteFromNode(node, value) {
...
// Child node has not enough values to continue
if ((nextNode > 0 &&
node.children[nextNode - 1].n > this.order - 1)
|| (nextNode < node.n &&
node.children[nextNode + 1].n > this.order - 1)) {
// One of the immediate children has enough values to transfer
if (nextNode > 0
&& node.children[nextNode - 1].n > this.order - 1) {
this.transferValue(node.children[nextNode - 1],
node.children[nextNode]);
} else {
this.transferValue(node.children[nextNode + 1],
node.children[nextNode]);
}
return this.deleteFromNode(node.children[nextNode], value);
}
...
}
/**
* Transfer one value from the origin to the target through the parent
* @param {BTreeNode} origin
* @param {BTreeNode} target
*/
transferValue(origin, target) {
const indexo = origin.parent.children.indexOf(origin);
const indext = origin.parent.children.indexOf(target);
if (indexo < indext) {
target.addValue(target.parent.removeValue(indexo));
origin.parent.addValue(origin.removeValue(origin.n-1));
if (!origin.leaf) {
target.addChild(
origin.deleteChild(origin.children.length-1), 0);
}
} else {
target.addValue(target.parent.removeValue(indext));
origin.parent.addValue(origin.removeValue(0));
if (!origin.leaf) {
target.addChild(
origin.deleteChild(0), target.children.length);
}
}
}

In the other hand, if the next node to visit has just t — 1 values, and there is no immediate brother with enough values to transfer, similar to the Case B, the node will have to be merged with one of the immediate brothers to continue.

In the following image, in order to remove k = 20, after the root, the next node to visit is the first child at the left. Since that node has just t — 1 values and there is no immediate brother with extra values to transfer, the actual node descends the middle value, in this case k' = 40 to the target node and transfers all the nodes from the selected immediate brother. Then the recursion continues down to the target node.

Removing node k=20 doing a merge. Animation created thanks to Structies
deleteFromNode(node, value) {
...
// No immediate brother with enough values.
// Merge node with immediate one brother
this.mergeNodes( nextNode > 0 ?
node.children[nextNode - 1] : node.children[nextNode + 1],
node.children[nextNode]);
return this.deleteFromNode(node.children[nextNode], value);
}

So how can the tree decrease its height by removing a node?

In the case when a remove operation is made but the root has just one value and both children have only t — 1 values, it will not be possible to satisfy the rule about the minimum quantity of values to move down to the next node. So the only alternative is doing a merge between the two children bringing down the only value of the root, to be sure that there is a node down with enough values to continue. However, this will make that the root reference will be an empty node.

The technique is, before every remove operation, check if this particular case is present before doing anything. If it is the case, as how it was explained before, the tree will shrink in height merging the 2 children with the only value it has, to then change the reference from the old root to the node with the merged values. The following image shows how removing target k = 10, the tree merges the root with both children before continuing down to find the target.

Removing node k=10. Tree reduces it’s height in one. Animation created thanks to Structies

The following code contains the BTree class with the remove operation.

BTree with remove operation.

Analysing the operation’s complexity

All operations in BTree depends only on the height of the tree. The following table shows the complexity of main BTree methods.

  • Space: For a BTree with n values, the space is proportional to the number of values inserted. Using this helper to calculate the size of the b-tree instance after every insertion, a little experiment was done to calculate the memory size of the tree compared it to the number of values, and the result is constant: around 12 bytes in average per value, this means that the space used is linearly proportional to the number of values n.
let count = 0;
const sizes = [];
for (let i = 10; i < 1000; i += 10) {
count += 1;
this.bTree.insert(i);
const size = this.sizeof(this.bTree);
sizes.push(size);
console.log(size / count); // Around 12
}
console.log(sizes); // [20, 28, 36, 52, 60, 72, 80, 92, 108 ... ]
Memory allocation of a Btree instance over number of nodes.
  • Search: In the worst case, when the value is not present in the tree or is stored in one leaf, the search method will do constant c operations per node visited and it will visit as many nodes as the height of the tree log3(N). So the complexity of this method is: O(log3(N)*c) -> O(log N)
  • Insert: This operation, in every case, has to go down by the tree until reaching a leaf, so, like the search method the complexity is O(log N) also. Notice that all the sub-method called by insert() like split() or insertNonFull() have constant time O(1)complexity, because they just do value reasignations and do not iterate nor call other functions with higher complexity.

In the following graphic, the line blue shows the result of the time taken by 400 consecutive insertions in milliseconds, and by its side, the red line represents the accumulated average. The function log3(N)/8was also added to show the similarity in the shape of both curves. It’s interesting to observe how there are some periodical peaks, which is the case is when the insertion does an extra operation like split().

const times = [];
for (let i = 1; i < 400; i += 1) {
const init = performance.now();
this.bTree.insert(i);
const end = performance.now();
times.push(end - init);
}
console.log(times); // [0.074999989010393, 0.0099999886006116, ... ]
  • Delete: The delete operation as the search and insert method has to reach a leaf in any possible case. In the worst case, this method will call the sub method getMinMaxFromSubTree() when the target value is in a non leaf node, which will leave the new target value in a leaf. So this will traverse the tree twice, but it will still stay in the logarithmic complexity O(log N). The other operations transferValue() and mergeNodes() have linear complexity O(1) so doesn’t affect the time complexity.

The following image shows the performance in time of i in 400 consecutive deletions in a BTree with i values. The result was displayed beside the accumulated average and the logarithm curve to visualize the shape similarity between the last two curves.

const times = [];
this.bTree.insert(0);
this.bTree.insert(1);
for (let i = 1; i < 400; i += 1) {
const init = performance.now();
this.bTree.delete(i);
const end = performance.now();
times.push(end - init);
this.bTree.insert(i);
this.bTree.insert(i + 1);
}
console.log(times); // [0.0149999978020787, 0.0200000067707151, ...]

Animating the operations with Vue.JS

D3 + Vue.js

With the help of the D3 Library using the hierarchy and tree classes (Github), Structies was able to visualize a generic tree giving data to D3 with a specific structure. So in order to represent and visualize a particular state of a BTree, a new functionality was created BTree.toJSON(node) in order to create a compatible object that can be understood by D3 that allows creating a graphic representation of the current BTree. Below, there is a codepen that shows how a simple object containing children and values is interpreted by D3 to render a BTree.

Create an animation

In order to make things as simple as possible, the main idea of this animation was just to display every step and every comparison made by the insert and the delete methods. Notice that in the last code fragment, there is an attribute highlighted which turns a node into red, and if two connected nodes are highlighted, the link between them also becomes red. Thanks to this basic notion of highlighting, it is possible to emphasize some nodes and links in every step of an operation.

Structies use the notion of Sequences and Frames to represent the animation of an operation with multiple steps, in this way, a component called Visualizer, is responsible for displaying Trees using D3. This component receives a list of Sequences to display, so every operation, like BTree.insert(k) or BTree.delete(k), will return a Sequence object with multiple Frames inside; which every Frame contains an object Tree, compatible with D3, with some highlighted values.

After every operation, a new empty Sequence is created and added to the list which the Visualizer is subscribed to, and immediately, it will progressively add the Frames one by one to this new Sequence. The visualizer by default is displaying the last frame of the last sequence, so this technique of adding frames asynchronously, will give a sensation of animation.

addSequenceAsync(frames) {
this.isAnimating = true;
const newSequence = new Sequence();
newSequence.addFrame(frames[0]);
this.sequencesList.push(newSequence);
this.currentSequenceNumber = this.sequencesList.length - 1;
this.currentFrame = 0;
// Asynchronously add all frames to the last Sequence
for (let i = 1; i < frames.length; i += 1) {
setTimeout
(() => {
newSequence.addFrame(frames[i]);
if (i === frames.length - 1) {
this.isAnimating = false;
}
this.currentFrame += 1;
}, i * 500);
}
},
Visualizer diagram

And here you go!

I hope this article will help you to understand how BTrees work and how they are able to keep a large amount of information with a logarithmic complexity on access to this information.

Thanks for reading!

References

--

--