probabilistic-packet-filter

Project Bi-Weekly Update: HashSet vs Bloom Filter Insertion Memory Benchmarking

Student: Jonathan Ami
Date: March 21, 2025


Planned Activities:


Progress Update:

Refactor for Insertion-Only Testing

The Ddos benchmarking module was significantly refactored to better isolate and analyze memory usage. The major improvements include:

These changes help ensure that any memory reported by heaptrack is strictly from the data structure itself.


Benchmark Results (heaptrack)

Tests were run using the following commands:

HashSet:

heaptrack cargo run --release -- --test ddos-performance

Bloom Filter:

heaptrack cargo run --release -- --test ddos-performance --bloom

Here are the key results from heaptrack for both implementations:

HashSet implementation:

peak heap memory consumption:     11.2MB
peak RSS (including overhead):    24.0MB
total memory leaked:              416.9kB
calls to allocation functions:    251,395
temporary allocations:            81,548 (32.44%)
total runtime:                    00.114s

Bloom Filter implementation:

peak heap memory consumption:     11.2MB
peak RSS (including overhead):    23.9MB
total memory leaked:              417.0kB
calls to allocation functions:    251,398
temporary allocations:            81,548 (32.44%)
total runtime:                    00.114s

Discussion & Analysis

Despite the expectation that a Bloom filter should use significantly less memory than a HashSet, both implementations showed identical peak memory usage (~11.2MB) and almost identical allocation behavior. The Bloom filter even made slightly more allocations than the HashSet.

This was surprising because:

Possible explanations include:

This contradiction highlights a key real-world insight: theoretical memory savings may be masked by surrounding allocations unless benchmarking is tightly controlled.


Changes in the Repository:


Next Steps: