# Micro-Benchmark — Shared Template

For any optimization, test in isolation first. Call the target function directly — not through the full application — to isolate its true impact.

## Role

Micro-benchmarks are a fast pre-screen — validate that an optimization is worth committing before investing in a full `codeflash compare` run. See `e2e-benchmarks.md` for how this fits into the two-phase measurement workflow and for fallback behavior when `codeflash compare` is not available.

## A/B Pattern

```python
# /tmp/micro_bench_.py
import sys

def bench_a():
    """Current approach."""
    # ... original code with real input

def bench_b():
    """Optimized approach."""
    # ... optimized code with same input

if __name__ == "__main__":
    {"a": bench_a, "b": bench_b}[sys.argv[1]]()
```

### Running

Domain agents adapt the runner to their measurement tool:

- **Memory**: `memray run --native --trace-python-allocators -o /tmp/micro_{a,b}.bin /tmp/micro_bench_.py {a,b}`, then `memray stats`
- **CPU / Data Structures**: wrap with `timeit.timeit(fn, number=1000)` inside the script
- **Async**: wrap with `asyncio.run(fn())` and `time.perf_counter()` for wall-clock time
- **Structure**: `timeit.timeit(bench_import, number=10)` with `sys.modules` cache clearing

```bash
$RUNNER /tmp/micro_bench_.py a
$RUNNER /tmp/micro_bench_.py b
```

Adapt commands for the project's specific setup (virtualenv, PYTHONPATH, working directory, etc.).

## Micro-Benchmark-Only Keeps

Keep on the micro-benchmark alone if ALL of the following hold:

1. Micro-benchmark shows a clear, repeatable improvement above the domain threshold
2. Full tests still pass (always run them for correctness)
3. Change is simple and doesn't add complexity
4. Function is confirmed on the hot path / exercised by the target test

Domain thresholds:

- **Memory**: >10 MiB or >10%
- **CPU / Data Structures**: >20% or >2x, hot path confirmed by cProfile
- **Async**: >20% or >2x at representative concurrency

These savings compound: as dominant bottlenecks shrink, previously-buried savings surface.
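The CPU / Data Structures runner note above can be sketched as a self-contained script. This is a minimal sketch, not from the original: the list-vs-set membership workload and the file name `micro_bench_cpu.py` are placeholder assumptions standing in for the real before/after code.

```python
# /tmp/micro_bench_cpu.py -- hypothetical CPU variant of the A/B pattern
import sys
import timeit

DATA = list(range(10_000))   # placeholder "real input"
DATA_SET = set(DATA)         # optimized structure, built once outside the timed loop

def bench_a():
    """Current approach: linear scan over a list."""
    return 9_999 in DATA

def bench_b():
    """Optimized approach: O(1) set lookup."""
    return 9_999 in DATA_SET

if __name__ == "__main__":
    fn = {"a": bench_a, "b": bench_b}[sys.argv[1]]
    # Total seconds for 1000 calls, per the runner notes; compare a vs b.
    print(timeit.timeit(fn, number=1000))
```

Building `DATA_SET` at module level keeps the one-time conversion cost out of the timed region, so the A/B comparison measures only the lookup itself.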
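The async runner note can be sketched the same way, assuming the functions under test are coroutines. The sequential-vs-`gather` workloads below are placeholder assumptions, not code from the original:

```python
# /tmp/micro_bench_async.py -- hypothetical async variant of the A/B pattern
import asyncio
import sys
import time

async def bench_a():
    """Current approach: sequential awaits (placeholder workload)."""
    for _ in range(10):
        await asyncio.sleep(0.01)

async def bench_b():
    """Optimized approach: concurrent awaits via gather (placeholder workload)."""
    await asyncio.gather(*(asyncio.sleep(0.01) for _ in range(10)))

if __name__ == "__main__":
    fn = {"a": bench_a, "b": bench_b}[sys.argv[1]]
    start = time.perf_counter()
    asyncio.run(fn())
    # Wall-clock time, since CPU time misses the waiting that async optimizations remove.
    print(f"{time.perf_counter() - start:.4f}s")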
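The Structure runner note depends on one detail worth spelling out: `timeit` alone would measure a cached no-op after the first import, so each iteration must evict the module from `sys.modules` first. A minimal sketch, using stdlib `json` as a stand-in for the package whose import cost changed:

```python
# /tmp/micro_bench_import.py -- hypothetical Structure variant: import-time A/B
import importlib
import sys
import timeit

MODULE = "json"  # placeholder: substitute the package under test

def bench_import():
    """Pay the full import cost by clearing MODULE (and submodules) from the cache."""
    stale = [n for n in sys.modules if n == MODULE or n.startswith(MODULE + ".")]
    for name in stale:
        del sys.modules[name]
    importlib.import_module(MODULE)

if __name__ == "__main__":
    # number=10 per the runner notes; every call re-executes the module body.
    print(timeit.timeit(bench_import, number=10))
```

Run once against the current layout and once against the restructured one, and compare the two totals.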