B-Tree is a unique kind of self-balancing tree primarily used for searching an element more efficiently. Unlike binary trees, in B-Tree every node can contain more than one piece of data and can have more than two children. It is an extended and generalized shape of the binary search tree and is also known as a height-balanced m-way tree. You can visualize all the things mentioned above in this diagram.
Secondary storage devices or physical storage media like hard disks, etc., have larger capacity but are slower due to low accessing time. There was a need for such sorts of data structures that limit disk access.
But still, that’s not a statement which clears the need for B-Tree. Among all data structures used for searching we are familiar with binary search trees, AVL trees, Red-Black trees, etc… So why B-Tree? Because in all those data structures each node can contain only one key and in order to store a large number of keys, one has to face high access time because the height of such trees becomes very large. However, B-Tree can contain many keys in a single node and can have more than two children. Therefore, for a large number of keys the height of the tree is significantly lower and results in faster disk access.
Various operations can be done on B-Tree but its main usage is a generalized and extended form of Binary Search Tree (BST) for searching an element present in the tree. Let’s learn about how to search an element first then we will learn how to perform insertion and deletion operations on B-Tree.
Let n be the element to be searched, then the following steps are followed to search n in B-Tree-
Initiating from the root node of the tree, we check equality between n and the first key of the node. If they both are equal then we return that node along with its index.
If n.leaf is equal to true that means we are at the last node and there is no more path and the chance remains for n to exist in that tree. So, in that case, we return NULL that is not found.
If the value of n is less than the first value of the node that means the only possibility of n existing is in the left subtree of the tree. So to do so we will search n further recursively in the left part of the tree.
If there are multiple keys in the node and the value of n is more than the first key then compare the value of n with the next key within the node. If n is less than the next key continue the search in the left child of the key, else in the right child of the key that is n lies in between the first and second keys.
Iterate steps one to four until we meet the leaf node.
Algorithm for searching an element-
1
2
3
4
5
6
7
8
9
10
11
12
searchElement(m, x):
i = 1
while i≤ n[m] and x≥ keyi[m]
// n[m] : number of keys in m node
do i = i + 1
if i n[m] and x = keyi[m]
then
return (m, i)
if leaf[m]
then
return NIL
else return searchElement(ci[m], x)
Insertion operation always follows a bottom-up fashion to insert an element in the B-Tree and generally takes two steps to be done. The first is to search for a suitable node in the tree to insert the element and then split the nodes if required. The steps are given below to do an insertion in B-Tree-
If the tree has no node, i.e., empty, then simply insert it and assign it as the root node.
Update the value of the permitted number of keys within the node.
For insertion, search for a suitable node.
If and only if the node is full then follow the steps given below, otherwise skip to step eight.
Continue insertion of elements in ascending order.
If the number of elements exceeds its limit value then split it at the median.
Send the key at the median up and assign right keys to the right child and left keys to the left child of it.
If the capacity of the node isn’t full then insert the node in the tree in ascending order.
Let show an example of inserting the following keys **[**8 9 10 11 15 20 17] in the B-Tree -
Algorithm for inserting an element-
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
insertElement(T, K):
R = root[T]
if N[R] = 2 T - 1
S = AssignNode()
root[T] = S
leaf[S] = false
N[S] = 0
c1[S] = R
splitChild(S, 1, R)
insertNonFull(S, K)
else insertNonFull(R, K)
insertNonFull(X, K):
i = N[X]
if leaf[X]
while i≥ 1 and K < key_i[X]
key_i + 1[X] = key_i[X]
i = i - 1
key_i + 1[X] = K
N[X] = N[X] + 1
else
while i≥ 1 and K < key_i[X]
i = i - 1
i = i + 1
if N[ci[X]] == 2 t - 1
splitChild(X, i, ci[X])
if K < key_i[X]
i = i + 1
insertNonFull(ci[X], K)
splitChild(X, i)
splitChild(X, i, y)
z = AssignNode()
leaf[z] = leaf[y]
N[z] = t - 1
for j = 1 to t - 1
key_j[z] = key_j + t[y]
if not leaf[y]
for j = 1 to t
cj[z] = cj + t[y]
N[y] = t - 1
for j = N[X] + 1 to i + 1
cj + 1[X] = cj[X]
ci + 1[X] = z
for j = N[X] to i
key_j + 1[X] = key_j[X]
key_i[X] = key_t[y]
N[X] = N[X] + 1
Deletion operation on B-Tree generally takes three steps - searching for a node that contains the key to be deleted, deletion of that key, and then if required, balancing of the tree. While deleting, a condition may occur when we try to delete and there is no node present in the tree. That condition is known as underflow of the tree.
There are three type cases of deletion-
CASE 1- The key to be deleted is in the leaf node. One thing to keep in mind is that deletion of the key should not violate the property of the minimal number of keys a node has. In the below example if we delete 32 from the tree then the resulting tree will look like this.
CASE 2- If the key to be deleted lies in between inner nodes then the inner node which is going to be deleted is replaced by its inorder predecessor (largest key in the left subtree of the node). If the value of the minimum number of keys in the left child exceeds its limit, and is replaced by inorder successor (largest key in the right subtree of the right child) in the case of the right child exceeding the value of the minimum number of keys it can hold. Otherwise if the child has the exact minimum number of keys, merge both its right and left children. If we delete 33 in the below case it results-
CASE 3- If the key to be deleted lies in between inner nodes and deletion of it results in less than the minimum required number of keys in the node, we have to shrink the size of the tree. For this to be done first we have to find its inorder successor and predecessor and if both of the children have the least number of keys then we have to merge the children. To understand it more clearly look at the example below of deleting 10 from the tree-
Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
public class BTree {
private int BT;
public class TreeNode {
int n;
int node_key[] = new int[(2 * BT) - 1];
TreeNode child_node[] = new TreeNode[2 * BT];
boolean is_leaf = true;
public int find_one(int ktr) {
for (int itr = 0; itr < this.n; itr++) {
if (this.node_key[itr] == ktr) {
return itr;
}
}
return -1;
};
}
public BTree(int t) {
BT = t;
rootNode = new TreeNode();
rootNode.n = 0;
rootNode.is_leaf = true;
}
private TreeNode rootNode;
private TreeNode searchKey(TreeNode node, int node_key) {
int itr = 0;
if (node == null)
return node;
for (itr = 0; itr < node.n; itr++) {
if (node_key < node.node_key[itr]) {
break;
}
if (node_key == node.node_key[itr]) {
return node;
}
}
if (node.is_leaf) {
return null;
} else {
return searchKey(node.child_node[itr], node_key);
}
}
// Splitting the node
private void split(TreeNode node, int pos, TreeNode y) {
TreeNode z = new TreeNode();
z.is_leaf = y.is_leaf;
z.n = BT - 1;
for (int jtr = 0; jtr < BT - 1; jtr++) {
z.node_key[jtr] = y.node_key[jtr + BT];
}
if (!y.is_leaf) {
for (int jtr = 0; jtr < BT; jtr++) {
z.child_node[jtr] = y.child_node[jtr + BT];
}
}
y.n = BT - 1;
for (int jtr = node.n; jtr >= pos + 1; jtr--) {
node.child_node[jtr + 1] = node.child_node[jtr];
}
node.child_node[pos + 1] = z;
for (int jtr = node.n - 1; jtr >= pos; jtr--) {
node.node_key[jtr + 1] = node.node_key[jtr];
}
node.node_key[pos] = y.node_key[BT - 1];
node.n = node.n + 1;
}
// Inserting a value
public void insert_value(final int node_key) {
TreeNode root = rootNode;
if (root.n == 2 * BT - 1) {
TreeNode nd = new TreeNode();
rootNode = nd;
nd.is_leaf = false;
nd.n = 0;
nd.child_node[0] = root;
nd(nd, 0, root);
insertValue(nd, node_key);
} else {
insertValue(root, node_key);
}
}
// Inserting the node
final private void insertValue(TreeNode node, int ktr) {
if (node.is_leaf) {
int itr = 0;
for (itr = node.n - 1; itr >= 0 && ktr < node.node_key[itr]; itr--) {
node.node_key[itr + 1] = node.node_key[itr];
}
node.node_key[itr + 1] = ktr;
node.n = node.n + 1;
} else {
int itr = 0;
for (itr = node.n - 1; itr >= 0 && ktr < node.node_key[itr]; itr--) {};
itr++;
TreeNode temp = node.child_node[itr];
if (temp.n == 2 * BT - 1) {
nd(node, itr, temp);
if (ktr > node.node_key[itr]) {
itr++;
}
}
insertValue(node.child_node[itr], ktr);
}
}
public void display() {
display(rootNode);
}
// Display
private void display(TreeNode node) {
assert(node == null);
for (int itr = 0; itr < node.n; itr++) {
System.out.print(node.node_key[itr] + " ");
}
if (!node.is_leaf) {
for (int itr = 0; itr < node.n + 1; itr++) {
display(node.child_node[itr]);
}
}
}
// Check if present
public boolean isPresent(int ktr) {
if (this.searchKey(rootNode, ktr) != null) {
return true;
} else {
return false;
}
}
// delete the key
private void deleteKey(TreeNode node, int key) {
int position = node.Find(key);
if (position != -1) {
if (node.is_leaf) {
int itr = 0;
for (itr = 0; itr < node.n && node.key[itr] != key; itr++) {};
for (; itr < node.n; itr++) {
if (itr != (2 * BT) - 2) {
node.key[itr] = node.key[itr + 1];
}
}
node.n--;
return;
}
if (!node.is_leaf) {
TreeNode prd = node.child[position];
int predKey = 0;
if (prd.n >= BT) {
for (;;) {
if (prd.is_leaf) {
System.out.println(prd.n);
predKey = prd.key[prd.n - 1];
break;
} else {
prd = prd.child[prd.n];
}
}
deleteKey(prd, predKey);
node.key[position] = predKey;
return;
}
TreeNode next_node = node.child[position + 1];
if (next_node.n >= BT) {
int nextKey = next_node.key[0];
if (!next_node.is_leaf) {
next_node = next_node.child[0];
for (;;) {
if (next_node.is_leaf) {
nextKey = next_node.key[next_node.n - 1];
break;
} else {
next_node = next_node.child[next_node.n];
}
}
}
deleteKey(next_node, nextKey);
node.key[position] = nextKey;
return;
}
int tmp = prd.n + 1;
prd.key[prd.n++] = node.key[position];
for (int itr = 0, jtr = prd.n; itr < next_node.n; itr++) {
prd.key[jtr++] = next_node.key[itr];
prd.n++;
}
for (int itr = 0; itr < next_node.n + 1; itr++) {
prd.child[tmp++] = next_node.child[itr];
}
node.child[position] = prd;
for (int itr = position; itr < node.n; itr++) {
if (itr != 2 * BT - 2) {
node.key[itr] = node.key[itr + 1];
}
}
for (int itr = position + 1; itr < node.n + 1; itr++) {
if (itr != 2 * BT - 1) {
node.child[itr] = node.child[itr + 1];
}
}
node.n--;
if (node.n == 0) {
if (node == root) {
root = node.child[0];
}
node = node.child[0];
}
deleteKey(prd, key);
return;
}
} else {
for (position = 0; position < node.n; position++) {
if (node.key[position] > key) {
break;
}
}
TreeNode tmp = node.child[position];
if (tmp.n >= BT) {
deleteKey(tmp, key);
return;
}
if (true) {
TreeNode nd = null;
int divider = -1;
if (position != node.n && node.child[position + 1].n >= BT) {
divider = node.key[position];
nd = node.child[position + 1];
node.key[position] = nd.key[0];
tmp.key[tmp.n++] = divider;
tmp.child[tmp.n] = nd.child[0];
for (int itr = 1; itr < nd.n; itr++) {
nd.key[itr - 1] = nd.key[itr];
}
for (int itr = 1; itr <= nd.n; itr++) {
nd.child[itr - 1] = nd.child[itr];
}
nd.n--;
deleteKey(tmp, key);
return;
} else if (position != 0 && node.child[position - 1].n >= BT) {
divider = node.key[position - 1];
nd = node.child[position - 1];
node.key[position - 1] = nd.key[nd.n - 1];
TreeNode child = nd.child[nd.n];
nd.n--;
for (int itr = tmp.n; itr > 0; itr--) {
tmp.key[itr] = tmp.key[itr - 1];
}
tmp.key[0] = divider;
for (int itr = tmp.n + 1; itr > 0; itr--) {
tmp.child[itr] = tmp.child[itr - 1];
}
tmp.child[0] = child;
tmp.n++;
deleteKey(tmp, key);
return;
} else {
TreeNode left = null;
TreeNode right = null;
boolean is_last = false;
if (position != node.n) {
divider = node.key[position];
left = node.child[position];
right = node.child[position + 1];
} else {
divider = node.key[position - 1];
right = node.child[position];
left = node.child[position - 1];
is_last = true;
position--;
}
for (int itr = position; itr < node.n - 1; itr++) {
node.key[itr] = node.key[itr + 1];
}
for (int itr = position + 1; itr < node.n; itr++) {
node.child[itr] = node.child[itr + 1];
}
node.n--;
left.key[left.n++] = divider;
for (int itr = 0, jtr = left.n; itr < right.n + 1; itr++, jtr++) {
if (itr < right.n) {
left.key[jtr] = right.key[itr];
}
left.child[jtr] = right.child[itr];
}
left.n += right.n;
if (node.n == 0) {
if (node == root) {
root = node.child[0];
}
node = node.child[0];
}
deleteKey(left, key);
return;
}
}
}
}
public void deleteKey(int key) {
TreeNode node = searchKey(root, key);
if (node == null) return;
deleteKey(root, key);
}
// Main Function
public static void main(String[] args) {
// BTree of degree 3
BTree tree = new BTree(3);
// Inserting nodes [8,9,10,15,20,17]
tree.insert_value(8);
tree.insert_value(9);
tree.insert_value(10);
tree.insert_value(11);
tree.insert_value(15);
tree.insert_value(20);
tree.insert_value(17);
// To display current tree
tree.display();
// To check if 12 is present in Tree or not
if (tree.isPresent(12)) {
System.out.println("\nKey Found");
} else {
System.out.println("\nKey not Found");
}
// To delete 10 from tree
tree.deleteKey(10);
// To display tree after 10 is deletedthe the
tree.display();
}
}
The term B-Tree can also refer to a particular layout or an extensive class of designs. B-Tree saves key data in its inner nodes and does not save those keys in the records at the leaf nodes.
B+ Tree- In the B+ Tree internal nodes hold copies of the keys and the leaf nodes hold keys and records. Also, a leaf node may contain a pointer to the next leaf node for fastening the sequential access.
B* Tree- The B* Tree balances additional neighboring inner nodes to hold internal nodes more closely packed. In this variant, every non-root node is at least compacted by ⅔ in comparison to B-Tree which is ½. The most strong point of the disadvantage of B-Tree is its operation of inserting a node in the tree using a splitting node mechanism which is very costly. On the other hand B* Trees are generated as such to avoid splitting operations as long as they can.
B*+ Tree- That variant of B-Tree combines the major features of B+ Tree and B* tree.
For every node N, the keys are kept in sorted order.
There is a boolean data N.leaf in each node that indicates whether N is a leaf node or not.
If the order of the tree is x, then each internal node can contain at most x-1 keys.
Except for the root node, every node can have at most x children and at least x/2 children.
All leaves are on the same level, i.e., they all have the same depth and height.
The root node contains a minimum of one key and has at least two children.
If x ≥ 1, then for any x-key B-Tree of height h having the minimum degree t ≥ 2 has height h,
h ≥ logt(n+1)/2
Searches, insertions, and deletions of records in/to/from the database and file systems.
To store blocks/chunks of data in the secondary storage devices.
For multi-level indexing.