This article compares memory usage between the before_chaincluster and chaincluster branches using three methods: RSS (process resident set size), Valgrind massif (heap profiling), and TxGraph’s internal GetMainMemoryUsage() (based on Cluster::TotalMemoryUsage()).
Trace file: txgraph.trace.4.5days.final (~552 MB, 4.5 days of mempool activity)
Comparison script: bitcoin/bitcoin chaincluster_time_mem_bench branch
txgraph-replay on each branch with -DWITH_TXGRAPH_TRACING=ONtxgraph-replay <trace_file> in the background/proc/<pid>/status VmRSS every second| Branch | Peak RSS (kB) |
|---|---|
| before_chaincluster | 11,412 |
| chaincluster | 10,984 |
Difference: chaincluster uses 428 kB less (~3.7% reduction).
-DCMAKE_BUILD_TYPE=Debug -DWITH_TXGRAPH_TRACING=ONvalgrind --tool=massif --massif-out-file=out.out --stacks=no txgraph-replay <trace_file>ms_print out.out to extract peak heap snapshot| Branch | Peak heap (bytes) | Peak heap (MB) |
|---|---|---|
| before_chaincluster | 7,925,912 | 7.56 |
| chaincluster | 7,605,304 | 7.25 |
Difference: chaincluster uses 320 kB less heap (~4.0% reduction).
txgraph-replay --memory-csv=mem.csv <trace_file>GetMainMemoryUsage() after every 100th COMMIT_STAGING (~17.6k samples)commit_index,usage_bytespython3 contrib/txgraph_tracing/plot_memory_curve.py mem_before.csv mem_chain.csv -o compare.png| Metric | before_chaincluster | chaincluster | Difference |
|---|---|---|---|
| Peak (commit 1,290,200) | 4.61 MB | 4.22 MB | 402 KB less (~8.3%) |
| Final (end of trace) | 1.21 MB | 1.18 MB | 41 KB less (~3.2%) |
| Average over replay | 2.37 MB | 2.22 MB | 155 KB less (~6.5%) |
Peak occurs mid-trace (commit ~1.29M of 1.76M), not at the end.
| Component | GenericClusterImpl (before) | ChainClusterImpl (after) |
|---|---|---|
| Per-tx data | DepGraph::Entry + m_mapping + m_linearization | TxData (GraphIndex + FeeFrac) |
| Per-tx bytes | ~40 | ~20 |
| Ancestor/descendant | Explicit BitSets per tx | Implicit (chain order) |
| Linearization | Explicit vector | Implicit [0,1,…,n−1] |
GenericClusterImpl for an n-tx cluster: ~40n bytes.
ChainClusterImpl for an n-tx chain: ~20n bytes.
Savings per chain cluster of size n: 20n bytes.
Peak state (27,512 transactions, 8,596 clusters):
Theoretical savings (peak): 4,727 × 20 ≈ 94 KB from cluster representation. Measured savings (320–428 KB) are larger; the difference likely comes from DepGraph/chunk overhead and allocator behavior.
| Method | chaincluster reduction |
|---|---|
| RSS | ~3.7% |
| Massif (heap) | ~4.0% |
| GetMainMemoryUsage peak | ~8.3% |
| GetMainMemoryUsage final | ~3.2% |
| GetMainMemoryUsage average | ~6.5% |
All three methods show that the chaincluster branch reduces memory usage. The results align with the theoretical per-cluster savings of 20 bytes per transaction in chain-shaped clusters (size ≥2).