codeflash-agent/.codeflash/netflix/metaflow/README.md
Kevin Turcios 3b59d97647 squash
2026-04-13 14:12:17 -05:00


metaflow Performance Optimization

Upstream performance improvements to Netflix/metaflow, a human-centric framework for data science and ML workflows.

Background

Metaflow is Netflix's open-source Python framework for building and managing real-world data science projects. It handles workflow orchestration, versioning, and execution across local, cloud, and Kubernetes environments.

Profiling reveals two main optimization surfaces:

  1. Import time (~513ms): Heavy modules (requests, kubernetes, asyncio, yaml) are imported eagerly even when they are not needed. Plugin resolution alone accounts for 65% of import time.
  2. Runtime hot paths: Every artifact is gzip-compressed twice, content is hashed with SHA1 where a faster non-cryptographic hash would suffice, and the multiprocessing utilities wait by sleep-based polling.
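
To make the sleep-based polling point concrete, the sketch below contrasts the two waiting styles. It is an illustration only, not Metaflow's code: `wait_by_polling`, `wait_by_event`, and `run_worker` are hypothetical names, and threads are used for brevity where the real utilities manage processes (`multiprocessing.Event` offers the same `set`/`wait` interface).

```python
import threading
import time


def wait_by_polling(is_done, interval=0.05):
    """Sleep-based wait: check a flag, sleep, repeat."""
    while not is_done():
        # Completion is noticed only after the current sleep ends, so each
        # wait can add up to `interval` seconds of avoidable latency.
        time.sleep(interval)


def wait_by_event(done_event, timeout=None):
    """Event-based wait: returns as soon as the worker signals completion."""
    return done_event.wait(timeout)


def run_worker(done_event, results):
    results.append(sum(range(1000)))  # stand-in for real work
    done_event.set()                  # wake any waiter immediately


done = threading.Event()
results = []
worker = threading.Thread(target=run_worker, args=(done, results))
worker.start()
finished = wait_by_event(done, timeout=5.0)  # True if signalled in time
worker.join()
```

With polling, the choice of `interval` trades CPU wake-ups against latency. With an event, the waiter blocks until it is signalled, so that trade-off disappears.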

Optimization Targets

Import Time (Phase 1 — ~200ms savings estimated)

| Target | Current | Savings | Approach |
|---|---|---|---|
| Defer requests in metadata providers | 128ms | ~108ms | Lazy import inside ServiceMetadataProvider |
| Lazy-load Kubernetes clients | 50ms | ~48ms | Conditional import when K8s decorator used |
| Defer asyncio in subprocess_manager | 91ms | ~41ms | Import inside async functions only |
| Defer YAML/cards infrastructure | 52ms | ~37ms | Move YAML import to card render time |
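
The deferral pattern behind these targets can be sketched as follows. This is a minimal illustration under assumptions: `LazyServiceClient` is a hypothetical class, not Metaflow's `ServiceMetadataProvider`, and the standard library's `urllib.request` stands in for `requests` so the example needs no third-party packages.

```python
class LazyServiceClient:
    """Hypothetical provider that only needs an HTTP library when called."""

    def _http(self):
        # Deferred import: the cost is paid on first use rather than at
        # package import time. Later calls find the module already cached
        # in sys.modules, so the repeated import statement is cheap.
        import urllib.request  # stand-in for `requests`
        return urllib.request

    def build_request(self, url):
        # Builds a request object without sending anything over the network.
        return self._http().Request(url, method="GET")


client = LazyServiceClient()
request = client.build_request("http://localhost:8080/ping")
```

Code paths that never touch the provider never pay for the import, which is the effect the table's "Savings" column estimates.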

Runtime (Phase 2)

| Target | File | Approach |
|---|---|---|
| Double gzip compression | content_addressed_store.py | Single compression, tune level |
| SHA1 content hashing | content_addressed_store.py | Switch to xxHash/BLAKE3 |
| Sleep-based polling | multicore_utils.py | Event-based waiting |
| Extension loading cache | extension_support/__init__.py | Mtime-based cache |
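
As a rough sketch of the first two rows, the toy store below compresses each blob exactly once at a configurable level and takes its hash function as a parameter. `ToyContentAddressedStore` is a hypothetical in-memory class, not the code in content_addressed_store.py. Because xxHash and BLAKE3 require third-party packages, `hashlib.blake2b` is used here purely as a standard-library stand-in for "a different hash".

```python
import gzip
import hashlib


class ToyContentAddressedStore:
    """In-memory illustration of a content-addressed blob store."""

    def __init__(self, hash_fn=hashlib.sha1, compresslevel=3):
        self._hash_fn = hash_fn              # pluggable content hash
        self._compresslevel = compresslevel  # tunable speed/size trade-off
        self._blobs = {}

    def save_blob(self, data: bytes) -> str:
        key = self._hash_fn(data).hexdigest()
        if key not in self._blobs:  # identical content is stored once
            # Compress exactly once; a second gzip pass over already
            # compressed bytes costs CPU time for little or no size gain.
            self._blobs[key] = gzip.compress(
                data, compresslevel=self._compresslevel
            )
        return key

    def load_blob(self, key: str) -> bytes:
        return gzip.decompress(self._blobs[key])


payload = b"artifact bytes " * 1000
sha1_store = ToyContentAddressedStore()
blake_store = ToyContentAddressedStore(hash_fn=hashlib.blake2b)
sha1_key = sha1_store.save_blob(payload)
blake_key = blake_store.save_blob(payload)
```

Note that swapping the hash changes every key, so any real change of this kind would need a plan for blobs already stored under SHA1 keys.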

Results

No optimizations applied yet.

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| import metaflow | 513ms | | |
| metaflow --version CLI | ~1.8s | | |
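
One way to reproduce an import-time measurement like the one above is CPython's `-X importtime` option, which writes per-module timings in microseconds to stderr. The helper below is a sketch, not the script used for the numbers in the table: `parse_importtime` and `slowest_imports` are hypothetical names.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Return (module, self_us, cumulative_us) tuples from -X importtime output."""
    rows = []
    prefix = "import time:"
    for line in stderr_text.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        if len(fields) != 3:
            continue
        self_us, cumulative_us, name = (field.strip() for field in fields)
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # skip the column-header row
        rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest_imports(module, top=10):
    """Import `module` in a fresh interpreter and rank by cumulative time."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(proc.stderr)
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]
```

Assuming metaflow is installed in the current environment, `slowest_imports("metaflow")` would list the modules that dominate the import. Running it in a fresh subprocess matters because modules already loaded in the calling process would otherwise not be timed.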

PRs

None yet.

| PR | Branch | Status | Description |
|---|---|---|---|