mirror of https://github.com/codeflash-ai/codeflash-agent.git synced 2026-05-04 18:25:19 +00:00

Migrate .codeflash/ to {teammember}/{org}/{project}/ format (#15)

Add team member dimension to case study paths so multiple contributors
can track optimization data independently. Derives member from
git config user.name in session-start hooks.

- Move all case studies under .codeflash/krrt7/
- Rename pypa/pip → python/pip (org grouping)
- Update session-start hooks, docs, scripts, and references

2026-04-14 23:04:34 -05:00


Azure VM Setup for Benchmarking

VM Spec

| Setting | Value |
| --- | --- |
| Name | `rich-bench` |
| Resource group | `RICH-BENCH-RG` |
| Region | `westus2` |
| Size | `Standard_D2s_v5` (2 vCPU, 8 GB RAM, non-burstable) |
| OS | Ubuntu 24.04 LTS |
| Image | `Canonical:ubuntu-24_04-lts:server:latest` |

Non-burstable is critical: burstable VMs (B-series) bank CPU credits and are throttled to a baseline once the credits run out, so their CPU performance varies between runs and makes benchmarks unreliable.
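One way to guard against benchmarking on the wrong size is to read the size back from Azure and reject B-series names. The helpers below are a minimal sketch, not part of the repo: `vm_size` and `is_burstable_size` are hypothetical names, and the `Standard_B*` prefix test is a name-based heuristic rather than an official API.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo).

# Print the size of a VM, e.g. Standard_D2s_v5. Requires a logged-in az CLI.
vm_size() {
  local resource_group="$1" vm_name="$2"
  az vm show \
    --resource-group "$resource_group" \
    --name "$vm_name" \
    --query hardwareProfile.vmSize \
    --output tsv
}

# Succeed if the size name looks like a burstable B-series size.
# Heuristic (assumption): B-series size names start with "Standard_B".
is_burstable_size() {
  case "$1" in
    Standard_B*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (commented out so sourcing this file has no side effects):
# size=$(vm_size RICH-BENCH-RG rich-bench)
# if is_burstable_size "$size"; then
#   echo "refusing to benchmark on $size" >&2
# fi
```

Keeping the name check separate from the `az` call means the check can be exercised without Azure credentials.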

Provisioning

# Create resource group
az group create --name RICH-BENCH-RG --location westus2

# Create VM
az vm create \
  --resource-group RICH-BENCH-RG \
  --name rich-bench \
  --image Canonical:ubuntu-24_04-lts:server:latest \
  --size Standard_D2s_v5 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --custom-data cloud-init.yaml
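After provisioning, the VM's public IP is needed for SSH. A possible way to look it up with the same resource group and VM name is sketched here; `vm_public_ip` is a hypothetical helper name, and the deallocate/start commands in the comments are an optional suggestion for idle periods rather than part of the original workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo).

# Print the public IP address(es) of a VM. Requires a logged-in az CLI.
vm_public_ip() {
  local resource_group="$1" vm_name="$2"
  az vm show \
    --resource-group "$resource_group" \
    --name "$vm_name" \
    --show-details \
    --query publicIps \
    --output tsv
}

# Usage (commented out so sourcing this file has no side effects):
# ssh "azureuser@$(vm_public_ip RICH-BENCH-RG rich-bench)"
#
# Optional: release the compute allocation while the VM is idle, then restart it.
# az vm deallocate --resource-group RICH-BENCH-RG --name rich-bench
# az vm start --resource-group RICH-BENCH-RG --name rich-bench
```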

Cloud-init

The full cloud-init is in cloud-init.yaml. It installs:

- System packages: git, build-essential, curl
- uv: `curl -LsSf https://astral.sh/uv/install.sh | sh`
- Python 3.12 + 3.13: `uv python install 3.12 3.13`
- hyperfine: from GitHub releases (latest)
- Rich clone: `git clone https://github.com/Textualize/rich /home/azureuser/rich`
- Venvs: `.venv` (3.12) and `venv313` (3.13) with Rich in editable mode
- Bench scripts: copied to `/home/azureuser/bench/`
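The venv step could be sketched as follows. This is not the contents of cloud-init.yaml, which is not reproduced here; `make_venv` is a hypothetical helper, and the venv locations are taken from the directory layout described in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from cloud-init.yaml).

# Create a venv for one Python version and install a repo into it in editable mode.
make_venv() {
  local py_version="$1" venv_dir="$2" repo_dir="$3"
  uv venv --python "$py_version" "$venv_dir"
  uv pip install --python "$venv_dir/bin/python" --editable "$repo_dir"
}

# Usage (commented out so sourcing this file has no side effects):
# make_venv 3.12 /home/azureuser/rich/.venv /home/azureuser/rich
# make_venv 3.13 /home/azureuser/venv313    /home/azureuser/rich
```

Passing `--python` to `uv pip install` targets a specific venv without activating it, which keeps the two environments from interfering with each other.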

Post-provisioning verification

ssh azureuser@<ip>

# Check tools
python3.12 --version
python3.13 --version
hyperfine --version

# Check Rich
cd ~/rich && git status
~/rich/.venv/bin/python -c "import rich; print(rich.__version__)"

# Run baseline
bash ~/bench/bench_import.sh

# Verify low stddev (should be <2ms for import benchmarks)
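The stddev check can be automated if the benchmark run is exported with hyperfine's `--export-json` option. The sketch below assumes the pretty-printed layout hyperfine writes (one `"key": value` pair per line, times in seconds); `check_stddev` is a hypothetical helper, and whether `bench_import.sh` already exports JSON is not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the bench scripts).

# Fail if any "stddev" in a hyperfine JSON export is at or above a limit in ms.
# Assumes one "key": value pair per line, with times in seconds.
check_stddev() {
  local json_file="$1" limit_ms="${2:-2}"
  awk -v limit="$limit_ms" '
    /"stddev"[ \t]*:/ {
      value = $0
      sub(/^[^:]*:[ \t]*/, "", value)   # drop the key and the colon
      sub(/[ \t,]*$/, "", value)        # drop the trailing comma
      seen++
      # hyperfine writes null when there is only a single run
      if (value == "null" || value * 1000 >= limit) {
        printf "FAIL stddev=%s s (limit %s ms)\n", value, limit
        bad = 1
      }
    }
    END {
      if (!seen) { print "FAIL no stddev field found"; exit 1 }
      if (bad) exit 1
      print "OK"
    }
  ' "$json_file"
}

# Usage (commented out; the benchmarked command is only an illustration):
# hyperfine --warmup 3 --export-json ~/results/import.json \
#   "$HOME/rich/.venv/bin/python -c 'import rich'"
# check_stddev ~/results/import.json 2
```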

Directory layout on VM

/home/azureuser/
├── rich/                  # Rich repo clone (editable install)
│   ├── .venv/             # Python 3.12 venv
│   └── ...
├── venv313/               # Python 3.13 venv
├── bench/
│   ├── bench_import.sh    # Overall import time
│   ├── bench_module.sh    # Per-module imports
│   ├── bench_e2e.sh       # A/B branch comparison
│   ├── bench_compare.sh   # Generic branch comparison
│   ├── bench_importtime.py # -X importtime parser
│   ├── bench_runtime.py   # PR #12 runtime benchmarks
│   ├── bench_runtime2.py  # PR #13 runtime benchmarks
│   ├── bench_text.py      # Text hot-path benchmarks
│   └── test_all_impls.sh  # Multi-version test runner
└── results/               # Benchmark output storage
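A quick way to confirm that provisioning produced this layout is to list whatever is missing. The function below is a sketch under the assumption that the tree above is complete; `missing_paths` is a hypothetical name, not an existing script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the bench scripts).

# Print each expected path that is missing under the given root.
# Returns non-zero if anything is missing.
missing_paths() {
  local root="$1" rel status=0
  for rel in \
    rich/.venv \
    venv313 \
    bench/bench_import.sh \
    bench/bench_module.sh \
    bench/bench_e2e.sh \
    bench/bench_compare.sh \
    bench/bench_importtime.py \
    bench/bench_runtime.py \
    bench/bench_runtime2.py \
    bench/bench_text.py \
    bench/test_all_impls.sh \
    results
  do
    if [ ! -e "$root/$rel" ]; then
      echo "$rel"
      status=1
    fi
  done
  return "$status"
}

# Usage (commented out so sourcing this file has no side effects):
# missing_paths /home/azureuser && echo "layout OK"
```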

Why this setup

- Dedicated VM eliminates background process noise from a developer laptop
- Non-burstable sizing gives a consistent CPU allocation, with no credit-based throttling as on B-series VMs
- Two Python versions because `typing` imports `re` on 3.12 but not 3.13, which affects the `re` deferral benchmarks
- hyperfine handles warmup, min-runs, and statistical reporting (mean ± stddev)
- Editable install allows quick branch switching without reinstall overhead
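The difference between the two Python versions can be checked directly on the VM with `-X importtime`, which logs every imported module to stderr. The helper below is a sketch: `imports_module` is a hypothetical name, and it assumes the usual `import time: self | cumulative | imported package` line format.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the bench scripts).

# Succeed if importing TARGET with the given interpreter also imports MODULE.
# Parses the "import time: self | cumulative | imported package" lines.
imports_module() {
  local py="$1" target="$2" module="$3"
  "$py" -X importtime -c "import $target" 2>&1 |
    awk -F'|' -v m="$module" '
      /^import time:/ {
        name = $3
        gsub(/[ \t\r]/, "", name)   # nested imports are indented
        if (name == m) found = 1
      }
      END { exit !found }
    '
}

# Usage (commented out so sourcing this file has no side effects):
# imports_module ~/rich/.venv/bin/python typing re && echo "3.12: typing imports re"
# imports_module ~/venv313/bin/python typing re || echo "3.13: typing does not import re"
```

Matching the module name exactly avoids false positives from names that merely contain `re`, such as `_sre` or `re._parser`.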
