# Rich Performance Optimization
Upstream performance improvements to [Textualize/rich](https://github.com/Textualize/rich), motivated by pip startup time profiling.
## Background
pip vendors Rich for its progress bars, logging, and error display. Profiling `pip --version` revealed Rich as one of the heaviest imports in the startup chain — `from rich.console import Console` alone took ~79ms on CPython 3.12 (Standard_D2s_v5 VM).
Rather than patching pip's vendored copy, we contributed upstream so everyone benefits.
## Results
### Import Time (hyperfine, 30+ runs, Standard_D2s_v5)
#### CPython 3.12
| Import | master | optimized | Speedup |
|---|---|---|---|
| `Console` | 79.1 ± 0.8ms | 37.5 ± 0.5ms | **2.11x** |
| `RichHandler` | 100.3 ± 3.6ms | 39.6 ± 0.5ms | **2.53x** |
#### CPython 3.13
| Import | master | optimized | Speedup |
|---|---|---|---|
| `Console` | 67.9 ± 0.7ms | 33.6 ± 0.5ms | **2.02x** |
| `RichHandler` | — | 37.5 ± 0.4ms | — |
> On Python 3.13+, `typing` no longer imports `re`, so deferring all `re.compile()` calls eliminates `re` (+ `_sre`, `re._compiler`, `re._parser`, `re._constants`) from the Console import chain entirely.
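This is easy to check empirically: run the import in a fresh interpreter and see whether `re` shows up in `sys.modules` as a side effect. A minimal sketch (the helper name is ours, not part of the repo):

```python
import subprocess
import sys

def imports_module(import_stmt: str, watched: str) -> bool:
    """Run `import_stmt` in a fresh interpreter and report whether
    `watched` ended up in sys.modules as a side effect."""
    code = f"import sys; {import_stmt}; print({watched!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"

# e.g. imports_module("from rich.console import Console", "re")
# should report False on Python 3.13+ with the optimized branch
```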
### Runtime Micro-benchmarks (Python 3.13.13)
| Benchmark | Before | After | Speedup |
|---|---|---|---|
| `Style.__eq__` (identity) | 114ns/call | 62ns/call | **1.84x** |
| `Style.combine` (3 styles) | 579ns/call | 433ns/call | **1.34x** |
| `Segment.simplify` (identity) | 1269ns/call | 931ns/call | **1.36x** |
| `Style.chain` (3 styles) | 959ns/call | 878ns/call | **1.09x** |
| E2E `Console.print` | 173.7µs/call | 171.6µs/call | ~1.01x |
## What We Changed
### PR #12 — Architectural wins ([KRRT7/rich#12](https://github.com/KRRT7/rich/pull/12))
- **Replace `@dataclass` with `__slots__` classes** — `ConsoleOptions` and `ConsoleThreadLocals` used `@dataclass`, which imports `inspect` at module level (~10ms). Replaced with plain classes + `__slots__`. ConsoleOptions memory: 344 → 136 bytes (60% reduction).
- **Lazy-load emoji dictionary** — `_emoji_codes.EMOJI` (3,608 entries) loaded unconditionally via `text.py → emoji.py`. Deferred to first use via module-level `__getattr__`.
- **Defer imports across 12+ modules** — `inspect`, `pretty`, `scope`, `getpass`, `configparser`, `html.escape`, `zlib`, `traceback`, `pathlib` → deferred to the methods that actually use them.
- **`from __future__ import annotations`** — Enabled in key modules to allow moving type-only imports to `TYPE_CHECKING`.
### PR #13 — Import deferral + runtime micro-opts ([KRRT7/rich#13](https://github.com/KRRT7/rich/pull/13))
**Import deferral (7 files):**
- `color.py`: `RE_COLOR` compiled lazily in `Color.parse()` (LRU-cached)
- `text.py`: `_re_whitespace` lazy; inline `import re` in 6 methods
- `markup.py`: `RE_TAGS` via `_compile_tags()`, `RE_HANDLER` and escape regex lazy
- `_emoji_replace.py`: regex default arg → lazy `_EMOJI_SUB` global
- `_wrap.py`: `re_word` → lazy `_re_word`
- `highlighter.py`: `import re` inside `JSONHighlighter.highlight()`
- `default_styles.py`: 3 `rgb(...)` strings → `Color.from_rgb()` to avoid `Color.parse()` regex at import
**Runtime micro-optimizations:**
- `Style.__eq__`/`__ne__`: identity shortcut (`is`) before hash comparison
- `Style.combine`/`chain`: call the LRU-cached `_add` directly instead of going through the `sum()` / `__add__` / `.copy()` path
- `Segment.simplify`: `is` before `==` for style comparison
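The `__eq__` change is the classic identity fast path: styles are heavily cached and reused, so `a is b` succeeds often enough to pay for itself before any field work happens. A simplified sketch, assuming a precomputed hash like Rich's `Style` keeps (real `Style` has far more fields):

```python
class Style:
    __slots__ = ("color", "bold", "_hash")

    def __init__(self, color=None, bold=False):
        self.color = color
        self.bold = bold
        # precomputed once so comparisons and hashing stay cheap
        self._hash = hash((color, bold))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if self is other:
            return True  # identity shortcut: skips the hash comparison
        if not isinstance(other, Style):
            return NotImplemented
        return self._hash == other._hash

    def __ne__(self, other):
        if self is other:
            return False
        if not isinstance(other, Style):
            return NotImplemented
        return self._hash != other._hash
```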
### Upstream PR
- [Textualize/rich#4070](https://github.com/Textualize/rich/pull/4070) — Initial import deferral PR (subset of the above)
## Methodology
### Environment
- **VM**: Azure Standard_D2s_v5 (2 vCPU, 8 GB RAM, non-burstable)
- **OS**: Ubuntu 24.04 LTS
- **Region**: westus2
- **Python**: 3.12 and 3.13 via uv
- **Tooling**: hyperfine (warmup 5, min-runs 30), timeit (best of 7)
Non-burstable VM chosen for consistent CPU performance — no thermal throttling or turbo variability.
### Benchmark harness
All scripts in [`bench/`](bench/):
| Script | Purpose |
|---|---|
| `bench_import.sh` | Overall `import rich` time via hyperfine |
| `bench_module.sh` | Per-module import time (Console, RichHandler, Traceback, etc.) |
| `bench_e2e.sh` | A/B comparison: master vs optimized branch |
| `bench_compare.sh` | Generic branch comparison wrapper |
| `bench_importtime.py` | `python -X importtime` parser → sorted TSV breakdown |
| `bench_runtime.py` | PR #12 runtime benchmarks (ConsoleOptions, emoji_replace) |
| `bench_runtime2.py` | PR #13 runtime benchmarks (`Style.__eq__`, `combine`, `Segment.simplify`) |
| `bench_text.py` | Text hot-path benchmarks (construction, copy, divide, render) |
| `test_all_impls.sh` | Run tests across CPython 3.9–3.14 + PyPy 3.10 |
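For a quick sanity check without the full harness, the core measurement reduces to timing a fresh interpreter per run (hyperfine does the same at the process level, with proper warmup and statistics). A minimal stand-in, not one of the scripts above:

```python
import statistics
import subprocess
import sys
import time

def time_import(stmt: str, runs: int = 5) -> float:
    """Median wall time (seconds) to execute `stmt` in a cold interpreter.

    Includes interpreter startup, just as whole-process hyperfine timings
    do; compare deltas between statements, not absolute values.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", stmt], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# e.g. time_import("from rich.console import Console")
```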
### Raw data
Hyperfine JSON exports in [`data/`](data/).
## Maintainer Engagement
Reached out to Will McGugan (Textualize CEO) via Discord. Conversation in [`discord-transcript.md`](discord-transcript.md).
Key quotes:
- "Seems like a clear win. Feel free to open a PR."
- "I'd say single PR."
## Repo Structure
```
.
├── README.md             # This file
├── cloud-init.yaml       # VM provisioning (one-shot reproducible setup)
├── discord-transcript.md # Will McGugan conversation
├── bench/                # Benchmark scripts (from VM)
│   ├── bench_import.sh
│   ├── bench_module.sh
│   ├── bench_e2e.sh
│   ├── bench_compare.sh
│   ├── bench_importtime.py
│   ├── bench_runtime.py
│   ├── bench_runtime2.py
│   ├── bench_text.py
│   └── test_all_impls.sh
├── data/                 # Raw benchmark data (hyperfine JSON)
│   ├── e2e-3.12/
│   └── runtime/
└── vm-setup.md           # Azure VM provisioning instructions
```