C hashtable library.
This benchmark inserts 8-byte elements in a hash table, for numeric keys ranging from 1 to 20 million while measuring the insertion time for each element.
This graph reveals the internal structure of the hash table, which expands when a certain number of items is reached. The values for different thresholds are often prime numbers, or powers of 2.
All inserts were under 30 milliseconds, except for the following values:
Elements | Insertion time (ms) | Time change |
---|---|---|
127,800 | 32 | |
255,608 | 60 | × 1.88 |
511,183 | 93 | × 1.55 |
1,022,366 | 194 | × 2.08 |
2,044,732 | 415 | × 2.14 |
4,089,456 | 821 | × 1.98 |
8,178,897 | 1709 | × 2.08 |
16,357,799 | 3621 | × 2.12 |
Here, powers of 2 times a thousand elements. Other algorithms manage to avoid this kind of problematic behavior.
Abstractions can be considered leaky not only when the underlying data structure is exposed in the public API, but also when the behavior of some special cases put a strain on the whole process.
For a great read on the implementation of hash tables in C, head to chapter 18 of Beautiful Code, titled “Python’s Dictionary Implementation: Being All Things to All People”.