Kevin Turcios 3b59d97647 squash
2026-04-13 14:12:17 -05:00


coveragepy Performance Optimization

Upstream performance improvements to coveragepy/coveragepy, the standard Python code coverage measurement tool by Ned Batchelder.

Background

coverage.py instruments Python execution to measure which lines and branches are exercised by tests. It's used by virtually every Python project with CI coverage gates. Performance matters because coverage overhead adds directly to test suite wall time: a suite run under coverage is often 2-5x slower than the same suite run without it.

Profiling reveals optimization surfaces in both the trace loop hot path and the data persistence layer.

Optimization Targets

Data Collection (Phase 1 — highest leverage)

| Target | File | Approach |
|---|---|---|
| numbits encoding/union | numbits.py | Pre-allocate bytearray, replace zip_longest with explicit loop |
| add_lines() / add_arcs() batching | sqldata.py | Batch SQL INSERTs, reduce numbits round-trips |
| should_trace() sys.path check | inorout.py | Hash sys.path instead of full list comparison |
| mapped_file_dict() flush | collector.py | Snapshot strategy instead of retry loop |
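
As a rough illustration of the numbits idea, the sketch below compares a `zip_longest`-based byte-wise union with a version that pre-allocates a `bytearray` and uses an explicit loop. The function names `union_zip_longest` and `union_preallocated` are hypothetical stand-ins, not coverage.py's actual API, and the sketch assumes a numbit is simply a `bytes` bitmap. Whether the second form is faster would need to be measured.

```python
from itertools import zip_longest


def union_zip_longest(a: bytes, b: bytes) -> bytes:
    """Baseline: OR two bitmaps byte by byte, padding the shorter with zeros."""
    return bytes(x | y for x, y in zip_longest(a, b, fillvalue=0))


def union_preallocated(a: bytes, b: bytes) -> bytes:
    """Candidate: copy the longer bitmap once, then OR the shorter one into it."""
    if len(a) < len(b):
        a, b = b, a
    out = bytearray(a)  # single allocation sized to the longer input
    for i, byte in enumerate(b):
        out[i] |= byte
    return bytes(out)


# Both versions should agree on inputs of equal and unequal length.
samples = [(b"", b""), (b"\x01", b""), (b"\x01\x02", b"\x04"), (b"\xf0", b"\x0f\x80")]
results_match = all(union_zip_longest(a, b) == union_preallocated(a, b) for a, b in samples)
```

Any real change would have to preserve the exact output of the existing functions, so an equivalence check like `results_match` is a natural first test before benchmarking.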

Parsing & Analysis (Phase 2)

| Target | File | Approach |
|---|---|---|
| PythonParser.parse_source() | parser.py | Memoize tokenization, bulk newline indexing |
| Analysis set operations | results.py | Defer expensive calculations to lazy properties |
| SQLite query caching | sqldata.py | Cache lines()/arcs() results per context |
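
The "lazy properties" approach could look something like the sketch below, which uses `functools.cached_property` so that a set difference is computed only on first access. `AnalysisSketch` and its attributes are hypothetical and do not mirror the real class in results.py; the `computations` counter exists only to make the deferral visible.

```python
from __future__ import annotations

from functools import cached_property


class AnalysisSketch:
    """Hypothetical stand-in for an analysis object; not coverage.py's class."""

    def __init__(self, statements: set[int], executed: set[int]) -> None:
        self.statements = statements
        self.executed = executed
        self.computations = 0  # counts how often the set difference is computed

    @cached_property
    def missing(self) -> set[int]:
        # Runs on first access only; the result is then stored on the instance.
        self.computations += 1
        return self.statements - self.executed


analysis = AnalysisSketch(statements={1, 2, 3, 4}, executed={1, 3})
before = analysis.computations   # nothing computed at construction time
first = analysis.missing         # triggers the computation
second = analysis.missing        # served from the cached value
```

The trade-off is that a cached value goes stale if `statements` or `executed` change afterwards, so this pattern suits objects that are effectively immutable once built.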

Reporting (Phase 3)

| Target | File | Approach |
|---|---|---|
| HTML report generation | html.py | Pre-compute analysis metadata, batch rendering |
| Path normalization | files.py | Verify cache hit rates, batch path ops |
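
One way to check a cache hit rate for path normalization is sketched below with `functools.lru_cache` and its `cache_info()` counters. The `normalize` function is a hypothetical placeholder built on `os.path`, not the caching mechanism files.py actually uses, so this only shows how such a measurement might be set up.

```python
import os.path
from functools import lru_cache


@lru_cache(maxsize=None)
def normalize(path: str) -> str:
    """Hypothetical normalizer: absolute path with platform case folding."""
    return os.path.normcase(os.path.abspath(path))


# Simulate a report that touches the same file several times.
for p in ["src/a.py", "src/b.py", "src/a.py", "src/a.py"]:
    normalize(p)

info = normalize.cache_info()
hit_rate = info.hits / (info.hits + info.misses)
```

A low hit rate on a real workload would suggest the cache key is too specific (for example, un-normalized input strings), while a high rate would suggest the remaining cost lies elsewhere.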

Results

No optimizations applied yet.

PRs

None yet.

| PR | Branch | Status | Description |
|---|---|---|---|