# core-product Status

Last updated: 2026-04-10

## Current state
Stacked PR #1500 updated with the cumulative progression (proper benchmarking: 5 rounds + 1 warmup, median reported, stddev <0.4%). Cumulative gains: 14.6% latency reduction on 10p-scan, 13.3% on 16p-mixed, 2.1 GB memory savings. Next: optimization #4 (direct numpy-to-BMP for tesseract).
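The BMP-over-PNG rationale behind #1503 can be illustrated with a small sketch: BMP writes raw pixels with a small header, while PNG zlib-compresses every page, costing CPU per render. This is an illustration only, not the pdfium-pool code itself; the image here is a stand-in for a rendered page.

```python
from io import BytesIO

from PIL import Image


def encode(img: Image.Image, fmt: str) -> bytes:
    """Serialize a PIL image in the given format; BMP skips compression entirely."""
    buf = BytesIO()
    img.save(buf, format=fmt)
    return buf.getvalue()


page = Image.new("RGB", (200, 200), "white")  # stand-in for a rendered PDF page
bmp = encode(page, "BMP")  # raw pixels + 54-byte header: cheap to write and read
png = encode(page, "PNG")  # zlib-compressed: smaller on disk, but costs CPU per page
```

On scanned-page renders the PNG encode/decode cost dominates, which is where the measured latency win comes from.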
## Target repo
~/Desktop/work/unstructured_org/core-product on branch main (PR branch: perf/cpu-aware-serial-ocr)
## PRs
| PR | Branch | Status | Description |
|---|---|---|---|
| #1503 | perf/bmp-render-format | Draft | Render PDF pages as BMP instead of PNG in pdfium pool |
| #1502 | perf/cpu-aware-serial-ocr | Draft | Cap OCR workers to available CPUs (serial mode on 1-CPU pods) |
| #1500 | codeflash-agent | Draft | Stacked optimizations + benchmark infra (cumulative progression) |
| #1481 | perf/elements-intersect-vertically | Merged | Reduce attribute lookups |
| #1464 | replace-lazyproperty-with-cached-property | Merged | Replace lazyproperty with functools.cached_property |
| #1448 | mem/free-pil-before-table-extraction | Merged | Free page image before table OCR |
| #1441 | mem/numpy-preprocessing-yolox | Merged | Resize-first preprocessing |
| #1400 | async-join-responses | Merged | Fix blocking event loop in CSV merge |
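The idea behind #1502 (cap OCR workers to CPUs actually available to the process) can be sketched as a small helper. The function name is hypothetical; the real code in core-product may differ.

```python
import os


def ocr_worker_count(requested: int) -> int:
    """Cap the OCR worker count to CPUs available to this process.

    Hypothetical helper illustrating PR #1502: on a 1-CPU pod this
    returns 1, which the caller can treat as "run OCR serially".
    """
    try:
        # Respects the process affinity mask (e.g. pinned pods), unlike cpu_count().
        available = len(os.sched_getaffinity(0))
    except AttributeError:  # sched_getaffinity is Linux-only
        available = os.cpu_count() or 1
    return max(1, min(requested, available))
```

Using the affinity mask rather than `os.cpu_count()` matters in containers, where the host may report many cores the pod cannot use.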
## Optimization queue

- CPU-aware serial OCR — PR #1502 open (draft), benchmarked. Rebase after #1501 merges.
- Early memory release — skipped; codebase already well-optimized (context managers, per-page cleanup).
- BMP render format — PR #1503 open (draft), benchmarked. 14.9% latency improvement on 10p-scan.
- Direct numpy-to-BMP for tesseract — encode from numpy without PIL round-trip
- Skip remaining PIL↔numpy conversions in OCR path
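A minimal sketch of the direct numpy-to-BMP idea, for the 8-bit grayscale case: write the BMP headers and palette by hand and append the pixel rows straight from the array, with no PIL object in the path. The function name is hypothetical and this is an illustration of the format, not the planned implementation.

```python
import struct

import numpy as np


def gray_ndarray_to_bmp(arr: np.ndarray) -> bytes:
    """Encode a 2-D uint8 array as an 8-bit grayscale BMP, skipping PIL entirely."""
    h, w = arr.shape
    row_size = (w + 3) & ~3  # BMP rows are padded to 4-byte boundaries
    palette = b"".join(struct.pack("<BBBB", i, i, i, 0) for i in range(256))
    pixel_offset = 14 + 40 + len(palette)
    file_size = pixel_offset + row_size * h
    # 14-byte file header: magic, file size, two reserved shorts, pixel offset
    header = struct.pack("<2sIHHI", b"BM", file_size, 0, 0, pixel_offset)
    # 40-byte BITMAPINFOHEADER: 8 bits/pixel, BI_RGB (uncompressed), 256 palette entries
    info = struct.pack("<IiiHHIIiiII", 40, w, h, 1, 8, 0, row_size * h, 0, 0, 256, 256)
    pad = b"\x00" * (row_size - w)
    # BMP stores rows bottom-up
    rows = b"".join(arr[y].tobytes() + pad for y in range(h - 1, -1, -1))
    return header + info + palette + rows
```

The real optimization would hand these bytes (or a temp file of them) to tesseract, replacing the current numpy→PIL→encode round-trip.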
## Dependencies

- PR #1501 (segfault fix, `patched_convert_pdf_to_image` refactor) must merge before the #1502 rebase. Different functions touched; clean rebase expected.
## VM
- IP: 40.65.91.158
- Size: Standard_E4s_v5
- RG: core-product-BENCH-RG
- State: Running (verified 2026-04-10)
- Git auth: HTTPS with embedded token (set previously). Use `ssh -A` for agent forwarding if the token expires.
- Note: `uv` is at `~/.local/bin/uv` — needs `export PATH=$HOME/.local/bin:$PATH` in non-login shells.
- Note: `pytest-benchmark` installed in `.venv` (not in lockfile).
## Next steps
- Implement "pass file path to tesseract" optimization (skip PIL→numpy→PIL→temp-file round-trip)
- Benchmark on VM, open draft PR
- Rebase #1502 once #1501 merges
## Notes

- `memray tree` opens a TUI — do not run directly over SSH. Use `memray stats`, `memray summary`, or `memray flamegraph --output file.html` instead.
- memray peak is 1.0 GB (10p scan, serial path). 10 GB total allocated = heavy PIL churn per page, not accumulation.
- Benchmarking: use `pedantic(rounds=5, warmup_rounds=1)` — the warmup absorbs ONNX JIT + page cache. Observed stddev <0.4% of median. Guest CPU frequency controls are ineffective on Azure Hyper-V — use statistical methods (more rounds + median) instead of trying to pin frequency.
- Workflow: independent `perf/<name>` branch → open individual draft PR → cherry-pick to `codeflash-agent` → benchmark stacked progression → update #1500 body.
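The rounds-plus-median policy above reduces to a tiny summary helper (pytest-benchmark computes these statistics itself; this sketch just makes the acceptance check, stddev under 0.4% of the median, explicit):

```python
from statistics import median, stdev


def summarize_rounds(timings: list[float]) -> tuple[float, float]:
    """Median latency and relative stddev (as % of median) over benchmark rounds."""
    m = median(timings)
    return m, stdev(timings) / m * 100.0


# Five measured rounds (warmup round already discarded), seconds per run:
med, rel = summarize_rounds([1.000, 1.001, 0.999, 1.002, 0.998])
```

Reporting the median rather than the mean keeps a single noisy round (a Hyper-V scheduling hiccup, for instance) from shifting the headline number.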