mirror of
https://github.com/codeflash-ai/codeflash-agent.git
synced 2026-05-04 18:25:19 +00:00
1.9 KiB
metaflow Status
Last updated: 2026-04-10
Current state
First PR open upstream. Waiting for maintainer feedback.
Target repo
~/Desktop/work/netflix_org/metaflow — fork remote: KRRT7/metaflow
VM
Azure Standard_D2s_v5, IP: 20.112.32.177, RG: metaflow-BENCH-RG (VM deallocated)
- Python 3.12, pip editable install, lz4/xxhash/numpy/hyperfine installed
- Baseline + realistic benchmarks complete in ~/results/
PRs
| PR | Branch | Status | Description |
|---|---|---|---|
| Netflix/metaflow#3090 | perf/lz4-artifact-compression | Open, waiting for review | Replace gzip with lz4 in CAS, 7-18x on realistic data |
| KRRT7/metaflow#1 | perf/lz4-artifact-compression | Draft (mirror) | Same, on fork |
Key results (realistic artifacts)
| Payload | Pickled Size | gzip total | lz4 total | Speedup |
|---|---|---|---|---|
| Small dict (config) | 233B | 0.341ms | 0.218ms | 1.6x |
| Metrics dict (feature stats) | 52KB | 2.278ms | 0.327ms | 7.0x |
| Numpy float64 (embeddings) | 800KB | 29.111ms | 1.557ms | 18.7x |
| Numpy float64 (model weights) | 8MB | 289.234ms | 15.792ms | 18.3x |
| Random bytes (opaque model) | 5MB | 118.315ms | 9.646ms | 12.3x |
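For reference, a minimal sketch of how a row like these could be measured. It assumes "total" means compress plus decompress wall time on the pickled payload, which the table does not state. The `time_roundtrip` helper and the sample dict are hypothetical, not the benchmark scripts from the PR. lz4 is imported only if present, so the sketch runs with the standard library alone.

```python
import gzip
import pickle
import time


def time_roundtrip(compress, decompress, payload, repeats=5):
    """Return the best-of-N wall time in ms for compress + decompress."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        restored = decompress(compress(payload))
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Guard against a codec pair that does not round-trip.
        assert restored == payload
        best = min(best, elapsed_ms)
    return best


# Hypothetical stand-in for a "metrics dict" style artifact.
metrics = {f"feature_{i}": {"mean": i * 0.5, "std": 1.0, "count": i} for i in range(500)}
payload = pickle.dumps(metrics, protocol=pickle.HIGHEST_PROTOCOL)

gzip_ms = time_roundtrip(gzip.compress, gzip.decompress, payload)
print(f"pickled size: {len(payload)} B, gzip total: {gzip_ms:.3f} ms")

try:
    import lz4.frame  # third-party; skipped if not installed
except ImportError:
    lz4_ms = None
else:
    lz4_ms = time_roundtrip(lz4.frame.compress, lz4.frame.decompress, payload)
    print(f"lz4 total: {lz4_ms:.3f} ms, speedup: {gzip_ms / lz4_ms:.1f}x")
```

Numbers from this sketch will not match the table: the payloads, compression levels, and hardware differ from the Azure VM runs stored in ~/results/.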
Open questions on PR
- Hard vs soft dependency for lz4
- Forward compat story (old metaflow can't read cas_version=2)
- Benchmark scripts to be reverted before merge
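The first two questions interact, so a rough sketch of one possible shape may help the discussion. It assumes a blob is stored next to an integer version tag, that version 1 means gzip, and that version 2 means lz4 (only `cas_version=2` appears in the notes above). The names `pack_blob`, `unpack_blob`, and the constants are hypothetical and are not metaflow's API.

```python
import gzip

try:
    import lz4.frame as lz4_frame  # soft dependency: optional third-party package
except ImportError:
    lz4_frame = None

# Hypothetical version tags; only cas_version=2 is mentioned in the PR notes.
CAS_VERSION_GZIP = 1
CAS_VERSION_LZ4 = 2


def pack_blob(raw: bytes, prefer_lz4: bool = True):
    """Compress raw bytes and return (version, blob)."""
    if prefer_lz4 and lz4_frame is not None:
        return CAS_VERSION_LZ4, lz4_frame.compress(raw)
    # Fall back to gzip when lz4 is missing or not requested.
    return CAS_VERSION_GZIP, gzip.compress(raw)


def unpack_blob(version: int, blob: bytes) -> bytes:
    """Decompress a blob according to its version tag."""
    if version == CAS_VERSION_GZIP:
        return gzip.decompress(blob)
    if version == CAS_VERSION_LZ4:
        if lz4_frame is None:
            raise RuntimeError("blob needs lz4, which is not installed")
        return lz4_frame.decompress(blob)
    raise ValueError(f"unknown cas_version: {version}")


version, blob = pack_blob(b"example artifact bytes" * 100)
assert unpack_blob(version, blob) == b"example artifact bytes" * 100
```

With a soft dependency, writers without lz4 keep producing version 1 blobs. The forward-compat gap remains: a reader that predates version 2 has no branch for it and can only fail, ideally with a clear error as in the `ValueError` path.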
Next steps (pending maintainer response)
- If approach accepted: make lz4 optional, revert benchmark scripts, address feedback
- If rejected on dependency grounds: explore zlib.compress directly (no new dep, smaller win)
- Open SHA1 discussion issue (data in data/sha1-proposal.md)
- Multicore polling improvement (low priority, marginal impact)
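As a sketch of the zlib fallback option: `zlib.compress` and `gzip.compress` both emit DEFLATE data, but zlib wraps it in a smaller container (2-byte header and Adler-32 trailer, versus the gzip header and CRC-32 trailer). The level of 3 and the helper names here are assumptions for illustration, not values taken from the PR.

```python
import gzip
import zlib


def compress_zlib(raw: bytes, level: int = 3) -> bytes:
    """Compress with the zlib container; level 3 is an assumed setting."""
    return zlib.compress(raw, level)


def decompress_zlib(blob: bytes) -> bytes:
    """Inverse of compress_zlib."""
    return zlib.decompress(blob)


sample = b"metaflow artifact " * 1000
gzip_blob = gzip.compress(sample, compresslevel=3)
zlib_blob = compress_zlib(sample)

# Both codecs round-trip, but their outputs are not interchangeable.
assert gzip.decompress(gzip_blob) == sample
assert decompress_zlib(zlib_blob) == sample
print(len(sample), len(gzip_blob), len(zlib_blob))
```

Because a gzip reader cannot decode a zlib-wrapped blob, this route would still need a version tag, so it avoids the new dependency but not the forward-compat question.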
Blockers
Waiting on Netflix/metaflow#3090 review.