A Bloom filter is a probabilistic data structure that indicates if an element is definitely not in a set or may be in it.
It uses a bit vector and multiple hash functions to set bits when adding elements and to test membership.
The false positive rate depends on the number of bits (m), number of elements (n), and number of hash functions (k), and is roughly (1–e^(–kn/m))^k.
The optimal number of hash functions for given m and n is (m/n)·ln(2).
Insertion and membership tests both run in O(k) time and use minimal memory compared to exact structures.
Choosing fast, independent, and uniformly distributed hash functions (e.g., Murmur, xxHash, FNV) greatly improves performance.
Bloom filters are useful when space efficiency is critical and a small false positive rate is acceptable.
Get notified when new stories are published for "🇺🇸 Hacker News English"