[[TOC]]
1. Table Indexes#
A table index is a copy of the table attributes and is mainly used for acceleration.
Table indexes consider two factors:
- Cost of maintenance
- Acceleration effect
2. B+ Tree#
A B+ tree is a self-balancing data structure with a time complexity of O(log n) for operations such as search and insertion. It is optimized for large data blocks.
A B+ tree is an M-ary search tree with the following characteristics:
- Perfectly balanced (all leaf nodes have the same level)
- Except for the root node, each node has M/2-1 <= #keys <= M-1
- All nodes with k keys have k+1 non-empty child nodes.
In a B+ tree, the leaf nodes are where the final keys we need to retrieve are stored. Therefore, there are two ways to find the final value based on the leaf nodes.
- Store a pointer to the location of the value.
- Store the value directly in the leaf node.
B-Tree VS. B+Tree#
The main difference between B-trees and B+ trees is that the keys in B-trees are also stored in non-leaf nodes. Therefore, in terms of storage space utilization, B-trees are undoubtedly more efficient than B+ trees.
This may lead to the following issues:
- B-trees require jumping back and forth between nodes during traversal, while B+ trees can be scanned directly.
- Storing values in nodes in B-trees may result in slower queries.
Visualization#
https://www.cs.usfca.edu/~galles/visualization/BPlusTree.html
INSERT#
Find a leaf node for insertion. Insert in order of size.
If the node has enough space to accommodate the insertion, the insertion stops. If there is not enough space, the following steps are taken:
- Split the node into two nodes at the median value.
- Push the median value up to the parent node. If there is no parent node, create one. Then perform the same operation on the parent node.
DELETE#
Find a leaf node for deletion.
If the leaf node still has at least half of its capacity after deleting the key, the deletion stops.
If the leaf node has only M/2-1 keys left, there are two strategies:
- Try to borrow a value from a neighboring node.
- If borrowing fails, try to merge the two nodes.
Selection Conditions#
B+ trees have advantages and disadvantages in tree-structured queries.
Duplicate Keys#
There are two ways to handle duplicate keys:
- Append a record ID to each key to ensure uniqueness.
- Use an overflow node to store the duplicate information.
Clustered Indexes#
Tables are usually indexed based on the primary key.
However, some databases use clustered indexes, which add a hidden primary key to the table. Other databases usually cannot use this index.
3. B+ Tree Design Choices#
Node Size#
Node size is typically larger for slower storage devices.
- HDD: ~1MB
- SSD: ~10KB
- In-Memory: ~512B
Merge Threshold#
Some DBMSs do not always merge nodes that are half full.
Delaying the merge operation may reduce the cost of rebalancing.
Maintaining smaller nodes and periodically restructuring the tree may be more efficient.
Variable length Keys#
- Use pointers to store values, storing only pointers to the values in the B+ tree.
- Use variable-length nodes.
- Store all values in the B+ tree with a fixed size.
- Use a map-like structure for storage.
Intra-Node Search#
- Linear
- Scan the nodes linearly.
- SIMD techniques can be used to increase the amount of data scanned.
- Binary
Use binary search to find the key. - Interpolation
Determine the position based on the value.
4. Optimizations#
Prefix Compression#
Compress by extracting common prefixes.
Deduplication#
Merge values into a single key for indexes that do not require uniqueness for all values.
Suffix Truncation#
Select a small prefix length in the nodes to differentiate and find values.
Pointer Swizzling#
Fix the page ID of the B+ tree in the buffer pool to avoid repeated disk reads and speed up access. Then replace the page ID pointer with the actual value in memory, bypassing the buffer pool conversion for direct access.
Bulk Insert#
The fastest way to rebuild a B+ tree is to build it from the bottom up.