Project Bi-Weekly Update: Enhanced Memory & Time Benchmarking with HashSet vs BloomFilter
Student: Jonathan Ami
Date: April 4, 2025
proc_benchmark.sh and insertion_time_benchmark.sh
Two new benchmarking tools were introduced:
insertion_time_benchmark.sh: Runs time-based insertion tests from 100k to 5M entries using both data structures, writing results to CSV.
proc_benchmark.sh: Profiles /proc/<pid>/status data after 1-second runs to capture memory usage for Bloom Filter and HashSet, exporting to bloomfilter_mem.csv and hashset_mem.csv.
VmRSS (Resident Memory) showed expected trends as insertions scaled:
N | VmRSS (KB) - HashSet | VmRSS (KB) - BloomFilter |
---|---|---|
100k | 3244 | 2740 |
1M | 20428 | 3680 |
2M | 38976 | 4932 |
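The VmRSS figures above come from /proc/<pid>/status. The following is a minimal sketch of how such a sample could be captured from a shell script; it is not the actual contents of proc_benchmark.sh. The function names (vmrss_kb, append_sample, profile_once) and the CSV column names are assumptions made for illustration, and the approach is Linux-only because it depends on /proc.

```shell
#!/usr/bin/env bash
# Sketch: sample VmRSS (resident set size, reported in kB) for a process.
# All function and column names here are hypothetical, not taken from proc_benchmark.sh.

# vmrss_kb PID -> prints the VmRSS value in kB; returns non-zero if it is unavailable.
vmrss_kb() {
    local pid="$1"
    awk '/^VmRSS:/ { print $2; found = 1 } END { exit !found }' \
        "/proc/${pid}/status" 2>/dev/null
}

# append_sample CSV N PID -> appends "N,VmRSS" to CSV, writing a header row first if the file is empty.
append_sample() {
    local csv="$1" n="$2" pid="$3" rss
    rss="$(vmrss_kb "$pid")" || return 1
    [ -s "$csv" ] || echo "n,vmrss_kb" > "$csv"
    echo "${n},${rss}" >> "$csv"
}

# profile_once CSV N CMD... -> starts CMD in the background, samples it after 1 second, then stops it.
profile_once() {
    local csv="$1" n="$2" pid rc=0
    shift 2
    "$@" &
    pid=$!
    sleep 1
    append_sample "$csv" "$n" "$pid" || rc=$?
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
    return "$rc"
}
```

A call such as `profile_once hashset_mem.csv 100000 ./hashset_bench 100000` would record one row, where ./hashset_bench is a placeholder for whatever binary performs the insertions. The process must still be alive after one second for the sample to succeed.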
Observations:
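One figure that can be derived directly from the table is the marginal memory cost per inserted element between the 100k and 2M runs. The sketch below does that arithmetic, treating each VmRSS value as kB of 1024 bytes; the helper name bytes_per_element is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: marginal VmRSS growth per inserted element, using the table values above.

# bytes_per_element KB_LOW KB_HIGH N_LOW N_HIGH -> prints bytes of VmRSS growth per element.
bytes_per_element() {
    awk -v a="$1" -v b="$2" -v n1="$3" -v n2="$4" \
        'BEGIN { printf "%.2f\n", (b - a) * 1024 / (n2 - n1) }'
}

bytes_per_element 3244 38976 100000 2000000   # HashSet     -> 19.26
bytes_per_element 2740 4932 100000 2000000    # BloomFilter -> 1.18
```

By this measure the HashSet grew by roughly 19 bytes of resident memory per element, while the BloomFilter grew by a little over 1 byte (about 9.5 bits) per element. VmRSS covers the whole process, so these are rough estimates rather than exact data-structure sizes.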
Insertion timing, recorded using insertion_time_benchmark.sh, is summarized below. In the figures as recorded, HashSet insertion was faster than BloomFilter at both sizes:
N | Time (ms) - HashSet | Time (ms) - BloomFilter |
---|---|---|
100k | ~1.47 | ~2.69 |
5M | ~81.10 | ~182.17 |
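A batch timing run of this kind can be sketched as a loop over input sizes that appends one CSV row per run. This is an illustrative sketch, not the contents of insertion_time_benchmark.sh. The function names, the CSV columns, and the intermediate sizes (1M and 2M) are assumptions; only the 100k to 5M range comes from the description above. It also assumes GNU date for nanosecond timestamps (`%N`).

```shell
#!/usr/bin/env bash
# Sketch: time a command for several input sizes and record the results as CSV.
# Names and CSV layout are hypothetical; requires GNU date for %N.

# elapsed_ms CMD... -> runs CMD and prints its wall-clock duration in milliseconds.
elapsed_ms() {
    local start end
    start="$(date +%s%N)"          # nanoseconds since the epoch
    "$@" > /dev/null
    end="$(date +%s%N)"
    awk -v ns="$(( end - start ))" 'BEGIN { printf "%.2f\n", ns / 1e6 }'
}

# run_sweep CSV STRUCTURE CMD -> times "CMD STRUCTURE N" for each N and appends rows to CSV.
run_sweep() {
    local csv="$1" structure="$2" cmd="$3" n ms
    [ -s "$csv" ] || echo "structure,n,time_ms" > "$csv"
    for n in 100000 1000000 2000000 5000000; do
        ms="$(elapsed_ms "$cmd" "$structure" "$n")"
        echo "${structure},${n},${ms}" >> "$csv"
    done
}
```

Timing from the shell includes process start-up, which can dominate at the millisecond scale. Figures as small as those in the table may well have been measured inside the benchmark program itself, in which case the script would only need to collect the program's output.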
Analysis:
benchmark_results/ folder for all generated CSV data.
.sh scripts to handle batch timing and /proc memory capture.