codeflash-agent/.codeflash/pypa/pip/data/benchmarks.md
Kevin Turcios 3b59d97647 squash
2026-04-13 14:12:17 -05:00

pip End-to-End Performance: main vs codeflash/optimize

Branch: codeflash/optimize (118 commits ahead of main)
Environment: Python 3.15.0a7 | macOS arm64 (Apple Silicon) | ~27 packages installed | HTTP cache warm
Tool: hyperfine (5-10 runs, 2-3 warmup)


Startup

| Benchmark | Main | Optimized | Delta | Speedup |
|---|---|---|---|---|
| pip --version | 138 ms | 20 ms | -118 ms | 7.0x |
| pip --help | 143 ms | 121 ms | -22 ms | 1.18x |
| pip install --help | 207 ms | 208 ms | +1 ms | ~1.0x |

Package Operations

| Benchmark | Main | Optimized | Delta | Speedup |
|---|---|---|---|---|
| pip list | 162 ms | 146 ms | -16 ms | 1.11x |
| pip freeze | 225 ms | 211 ms | -14 ms | 1.07x |
| pip show pip | 162 ms | 148 ms | -14 ms | 1.09x |
| pip check | 191 ms | 174 ms | -17 ms | 1.10x |

Dependency Resolution

Cached HTTP responses, --dry-run --ignore-installed to force full resolution.

| Benchmark | Main | Optimized | Delta | Speedup |
|---|---|---|---|---|
| requests (simple, ~5 deps) | 589 ms | 516 ms | -73 ms | 1.14x |
| flask + django (medium, ~15 deps) | 708 ms | 599 ms | -109 ms | 1.18x |
| flask + django + boto3 + requests (complex, ~30 deps) | 1,493 ms | 826 ms | -667 ms | 1.81x |
| fastapi[standard] (heavy, ~42 deps) | 13,325 ms | 11,664 ms | -1,661 ms | 1.14x |
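
The numbers above come from hyperfine. As a rough stand-in for readers without it, the same warmup-then-measure loop can be sketched in Python. This is not the harness used for this report: `time_command` and `resolve_requests` are hypothetical names, and the result is a plain mean of wall-clock samples.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd: list[str], runs: int = 5, warmup: int = 2) -> float:
    """Return the mean wall-clock time of `cmd` in milliseconds."""
    # Warmup runs populate caches and are not measured.
    for _ in range(warmup):
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.mean(samples)


# The "simple" resolution case; it needs network access or a warm HTTP cache.
resolve_requests = [
    sys.executable, "-m", "pip", "install",
    "--dry-run", "--ignore-installed", "requests",
]
```

Calling `time_command(resolve_requests)` on each branch gives a pair of timings comparable in spirit to one table row, though absolute values will differ by machine and cache state.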

Parsing

| Benchmark | Main | Optimized | Delta | Speedup |
|---|---|---|---|---|
| install -r requirements.txt (21 pinned packages, --no-deps) | 1,344 ms | 740 ms | -604 ms | 1.82x |

Import Time

| Benchmark | Main | Optimized | Delta | Speedup |
|---|---|---|---|---|
| import pip._internal.cli.main | 50 ms | 50 ms | 0 ms | 1.0x |

Note: On Python 3.15 the import chain is already fast (50 ms). The --version fast path bypasses this import entirely, which is why pip --version is 7.0x faster.


Totals

| Benchmark set | Main | Optimized | Speedup |
|---|---|---|---|
| All benchmarks (sum) | 18,737 ms | 15,423 ms | 1.21x (17.7% faster) |
| Excluding fastapi[standard] | 5,412 ms | 3,759 ms | 1.44x (30.5% faster) |

Top Improvements

| Rank | Benchmark | Improvement | Time Saved |
|---|---|---|---|
| 1 | resolve: fastapi[standard] | 12.5% | 1,661 ms |
| 2 | resolve: flask+django+boto3+requests | 44.7% | 667 ms |
| 3 | install -r requirements.txt | 44.9% | 604 ms |
| 4 | pip --version | 85.5% | 118 ms |
| 5 | resolve: flask+django | 15.4% | 109 ms |
| 6 | resolve: requests | 12.4% | 73 ms |
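
The Speedup and Improvement columns are two views of the same pair of timings. A minimal sketch of the arithmetic, using timings copied from the tables above; the helper names `speedup` and `improvement_pct` are hypothetical.

```python
def speedup(main_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline time to optimized time (higher is better)."""
    return main_ms / optimized_ms


def improvement_pct(main_ms: float, optimized_ms: float) -> float:
    """Share of the baseline time that was removed, in percent."""
    return (main_ms - optimized_ms) / main_ms * 100


# (main, optimized) timings in milliseconds, copied from the tables above.
rows = {
    "resolve: fastapi[standard]": (13325, 11664),
    "resolve: flask+django+boto3+requests": (1493, 826),
    "install -r requirements.txt": (1344, 740),
}

for name, (main_ms, opt_ms) in rows.items():
    print(
        f"{name}: {speedup(main_ms, opt_ms):.2f}x, "
        f"{improvement_pct(main_ms, opt_ms):.1f}% faster, "
        f"{main_ms - opt_ms} ms saved"
    )
```

The ranking is by absolute time saved, which is why the fastapi[standard] row leads despite having the smallest percentage among the top three.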

What Was Optimized (118 commits)

1. Startup

  • Ultra-fast --version path in __main__.py that exits before importing pip._internal
  • Fast-path --version in cli/main.py that avoids pip._internal.utils.misc import
  • Deferred base_command.py import chain to command creation time (saves ~22ms on --help)
  • Deferred Configuration module loading
  • Deferred autocompletion imports behind PIP_AUTO_COMPLETE check
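
The --version fast path follows a general pattern: inspect the arguments first, and import the heavy machinery only if the request needs it. A minimal sketch of that pattern, not pip's actual code; `main`, the `example-tool` name, and the use of `argparse` as a stand-in for a large internal package are all assumptions for illustration.

```python
import sys


def main(argv: list[str]) -> int:
    # Fast path: answer --version using only modules that are already loaded.
    if argv == ["--version"]:
        major, minor = sys.version_info[:2]
        print(f"example-tool 1.0 (python {major}.{minor})")
        return 0

    # Slow path: only now pay for the heavy import chain.
    # `argparse` stands in for a large internal package here.
    import argparse

    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument("command", nargs="?", default="help")
    args = parser.parse_args(argv)
    print(f"running {args.command}")
    return 0
```

Calling `main(["--version"])` returns without ever reaching the deferred import, which is the property the startup numbers depend on.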

2. Dependency Resolver -- Architecture

  • Speculative metadata prefetch: background thread downloads PEP 658 metadata for the top candidate while the resolver processes other packages
  • Conditional Criterion rebuild: _remove_information_from_criteria now skips rebuilding unaffected criteria, eliminating ~95% of allocations
  • __slots__ on Criterion: reduces per-instance memory by ~100 bytes
  • Two-level cache for _iter_found_candidates (specifier merge cache + candidate infos cache)
  • Fail-first preference heuristic (candidate_count in resolver preference tuple)
  • ChainMap delta and plain dict in resolvelib state management
  • Parallel index-page prefetch during dependency resolution
  • Thread-safe dist property on candidates for concurrent metadata access
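
The speculative prefetch idea can be sketched with a thread pool: start the download as soon as a candidate looks likely, and collect the result later when the resolver actually asks for it. This is an illustration under stated assumptions, not the branch's implementation: `MetadataPrefetcher` and its `fetch` callable are hypothetical, and `prefetch` and `get` are assumed to be called from a single resolver thread.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class MetadataPrefetcher:
    """Start metadata downloads early; hand back the result when asked."""

    def __init__(self, fetch, max_workers: int = 4) -> None:
        self._fetch = fetch  # hypothetical callable: package name -> metadata
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: dict[str, Future] = {}

    def prefetch(self, name: str) -> None:
        # Submit at most one background fetch per package name.
        if name not in self._pending:
            self._pending[name] = self._pool.submit(self._fetch, name)

    def get(self, name: str):
        # Reuse the speculative fetch if one was started, else start it now.
        self.prefetch(name)
        return self._pending[name].result()

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

If the guess was right, `get` returns a result that is already downloaded or in flight; if the guess was wrong, the only cost is one unused request.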

3. Dependency Resolver -- Micro

  • Cached wheel tag priority dict on TargetPython
  • Pre-extracted requirements tuple on Criterion to avoid per-call generator expressions
  • Cached specifier merge and candidate infos across resolver backtracking
  • Cached Marker.evaluate() results for repeated extra lookups
  • Cached _sort_key results to avoid double evaluation in compute_best_candidate
  • Hoisted operator.methodcaller/attrgetter to module-level constants
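
Two of these micro-optimizations combine naturally: a getter built once at module level, and a sort key that is computed once per candidate and then reused. A minimal sketch; `Candidate`, `CandidateSorter`, and the key layout are hypothetical stand-ins, not pip's types.

```python
from operator import attrgetter
from typing import NamedTuple

# Built once at import time instead of on every call.
_GET_VERSION = attrgetter("version")


class Candidate(NamedTuple):  # hypothetical stand-in type
    name: str
    version: tuple[int, ...]


class CandidateSorter:
    def __init__(self) -> None:
        self._key_cache: dict[Candidate, tuple] = {}
        self.key_computations = 0  # counter for illustration only

    def sort_key(self, candidate: Candidate) -> tuple:
        key = self._key_cache.get(candidate)
        if key is None:
            self.key_computations += 1
            key = (_GET_VERSION(candidate), candidate.name)
            self._key_cache[candidate] = key
        return key

    def best(self, candidates: list[Candidate]) -> Candidate:
        # Both sorted() and max() ask for keys; the second pass hits the cache.
        ordered = sorted(candidates, key=self.sort_key)
        return max(ordered, key=self.sort_key)
```

The cache pays off whenever the same candidates are ranked more than once, which is the situation the double-evaluation bullet describes.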

4. Packaging (vendored pip._vendor.packaging)

  • Replaced _tokenizer dataclass with __slots__ class
  • Deferred Version.__hash__ computation until first call
  • Integer comparison key (_cmp_int) for Version and Specifier -- avoids full _key tuple construction
  • Bisect-based filter_versions for O(log n + k) batch filtering
  • Pre-computed integer bounds on SpecifierSet for fast rejection
  • Cached parsed Version objects in _coerce_version
  • Cached parsed Requirement fields for repeated requirement strings
  • Cached parsed frozenset of Specifiers in SpecifierSet
  • Fast-path tokenizer for simple tokens to bypass regex engine
  • Ultra-fast path in SpecifierSet.contains for prereleases=True
  • Pre-computed is_prerelease/is_postrelease flags at Version init
  • Direct release-tuple prefix comparison in _compare_equal and _compare_compatible
  • Cached Specifier.__str__ and __hash__
  • Pre-computed Link._is_wheel slot to avoid repeated splitext comparison
  • Cached URL scheme on Link to skip urlsplit for is_vcs/is_file
  • Deferred URL path extraction in Link.from_json when filename exists
  • Inlined Link construction in _evaluate_json_page to skip redundant work
  • Direct string extraction replacing parse_wheel_filename in sort path
  • rsplit instead of three rfind calls for wheel tag extraction
  • Cached parse_tag results to eliminate redundant Tag creation
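
The O(log n + k) bound for bisect-based filtering comes from two binary searches over a sorted list followed by one slice. A minimal sketch, assuming versions are plain integer tuples kept in ascending order; this `filter_versions` is a simplified illustration with an inclusive range, and it ignores prereleases, epochs, and the other specifier operators.

```python
from bisect import bisect_left, bisect_right


def filter_versions(sorted_versions, lower=None, upper=None):
    """Return versions v with lower <= v <= upper from a sorted list.

    Two binary searches find the slice boundaries in O(log n);
    copying the k matches costs O(k).
    """
    lo = 0 if lower is None else bisect_left(sorted_versions, lower)
    hi = (
        len(sorted_versions)
        if upper is None
        else bisect_right(sorted_versions, upper)
    )
    return sorted_versions[lo:hi]


versions = [(1, 0), (1, 2), (2, 0), (2, 1), (3, 0)]
in_range = filter_versions(versions, lower=(1, 2), upper=(2, 1))
```

A linear scan would test every version against the bounds; here the number of comparisons grows with the logarithm of the list length, which matters for packages with hundreds of releases.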

6. I/O and Caching

  • Replaced pure-Python msgpack with C-level stdlib JSON for cache serialization (backward compatible)
  • Increased HTTP connection pool and prefetch concurrency
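
One way to switch serializers without invalidating an existing cache is to tag new entries and fall back to the old decoder for everything else. The sketch below illustrates that idea only; the prefix, the payload layout, the base64 encoding of the body, and the `legacy_loads` hook are all assumptions, not the format this branch writes.

```python
import base64
import json

_JSON_PREFIX = b"cc=json,"  # hypothetical format marker


def dumps(headers: dict, body: bytes) -> bytes:
    # JSON cannot hold raw bytes, so the body is base64-encoded.
    payload = {
        "headers": headers,
        "body": base64.b64encode(body).decode("ascii"),
    }
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return _JSON_PREFIX + encoded


def loads(data: bytes, legacy_loads=None):
    if data.startswith(_JSON_PREFIX):
        payload = json.loads(data[len(_JSON_PREFIX):])
        return payload["headers"], base64.b64decode(payload["body"])
    if legacy_loads is not None:
        # Entry written by the old serializer: let it decode its own format.
        return legacy_loads(data)
    return None  # unreadable entry: treat as a cache miss
```

Old entries stay readable until they expire, and new entries are decoded by the C-accelerated `json` module.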

7. Import Deferral

  • Deferred base_command.py import chain to command creation time
  • Deferred all Rich imports to first use
  • Stripped unused Rich modules from import chain
  • Deferred heavy imports in Rich console.py (pretty/pager/scope/screen/export)
  • Deferred Rich imports in progress_bars.py and self_outdated_check.py

8. Micro-optimizations

  • Bypassed InstallationCandidate.__init__ with __new__ + direct slot assignment
  • Removed redundant O(n) subset assertion in BestCandidateResult
  • Replaced min() builtins with inline conditionals in _cmp_int
  • Cached Hashes.__hash__ to avoid repeated sort+join computation
  • Cached Constraint.empty() singleton to avoid 169K redundant allocations
  • Bypassed email.parser for metadata parsing
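
Bypassing `__init__` with `__new__` plus direct slot assignment is worthwhile only when the caller already holds validated, normalized values. A minimal sketch of the pattern; this `Candidate` class and its `from_trusted` constructor are hypothetical and do not mirror InstallationCandidate's real fields.

```python
class Candidate:
    __slots__ = ("name", "version", "link")

    def __init__(self, name: str, version: str, link: str) -> None:
        # Regular constructor: validates and normalizes its inputs.
        if not name:
            raise ValueError("name must be non-empty")
        self.name = name.lower()
        self.version = version.strip()
        self.link = link

    @classmethod
    def from_trusted(cls, name: str, version: str, link: str) -> "Candidate":
        # Hot path: inputs are already normalized, so skip __init__.
        obj = cls.__new__(cls)
        obj.name = name
        obj.version = version
        obj.link = link
        return obj
```

The trade-off is that `from_trusted` silently accepts bad input, so it belongs only on internal paths where the values come from code that has already checked them.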