Migrate .codeflash/ to {teammember}/{org}/{project}/ format (#15)

Add team member dimension to case study paths so multiple contributors
can track optimization data independently. Derives member from
git config user.name in session-start hooks.

- Move all case studies under .codeflash/krrt7/
- Rename pypa/pip → python/pip (org grouping)
- Update session-start hooks, docs, scripts, and references
Kevin Turcios 2026-04-14 23:04:34 -05:00 committed by GitHub
parent 4a65f17bfb
commit cc29a27289
77 changed files with 69 additions and 33 deletions
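
The directory moves described in the commit message could be reproduced roughly as follows. This is a minimal sketch, not the commands used for this commit: `migrate_layout` is a hypothetical helper, and it uses plain `mv` so it also works outside a Git work tree.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): move every top-level
# directory in <root> under <root>/<member>/.
migrate_layout() {
    local root="$1" member="$2" entry name
    mkdir -p "$root/$member"
    for entry in "$root"/*/; do
        entry="${entry%/}"              # drop the trailing slash left by the glob
        if [ ! -d "$entry" ]; then      # the glob matched nothing
            continue
        fi
        name=$(basename "$entry")
        if [ "$name" = "$member" ]; then
            continue                    # never move the member directory into itself
        fi
        mv "$entry" "$root/$member/$name"
    done
}
```

Under these assumptions, `migrate_layout .codeflash krrt7` followed by `mv .codeflash/krrt7/pypa .codeflash/krrt7/python` would produce the layout the commit describes. Inside a repository, `git mv` in place of `mv` would also stage the moves.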

@@ -47,7 +47,7 @@ fi
STATE="${STATE}\nProject conventions to preserve:\n"
STATE="${STATE}- Monorepo: packages/ (UV workspace), plugin/ (self-contained, multi-language: plugin/languages/python/, plugin/languages/javascript/)\n"
STATE="${STATE}- Build: make build-plugin, prek run --all-files (lint), uv run pytest packages/ -v (test)\n"
-STATE="${STATE}- Optimization projects in .codeflash/{org}/{project}/ with status.md, bench/, data/results.tsv\n"
+STATE="${STATE}- Optimization projects in .codeflash/{teammember}/{org}/{project}/ with status.md, bench/, data/results.tsv\n"
STATE="${STATE}- Target repos in ~/Desktop/work/{org}_org/{project}\n"
STATE="${STATE}- VM benchmarks via ssh -A, record to data/results.tsv, update status.md\n"
STATE="${STATE}- Atomic commits, one purpose per commit, verify before committing\n"

@@ -1,22 +1,26 @@
#!/usr/bin/env bash
-# SessionStart hook: Scaffold .codeflash/{org}/{project}/ if it doesn't exist.
-# Infers org/project from git remote origin. File generation is delegated to
-# scripts/scaffold.sh — the single source of truth for project scaffolding.
+# SessionStart hook: Scaffold .codeflash/{teammember}/{org}/{project}/ if it doesn't exist.
+# Infers org/project from git remote origin. Team member is derived from
+# git user name. File generation is delegated to scripts/scaffold.sh.
cd "$CLAUDE_PROJECT_DIR" 2>/dev/null || exit 0
CF_DIR="$CLAUDE_PROJECT_DIR/.codeflash"
SCAFFOLD="$CLAUDE_PROJECT_DIR/scripts/scaffold.sh"
+# Derive team member from git user (lowercase, no spaces)
+MEMBER=$(git config user.name 2>/dev/null | tr '[:upper:]' '[:lower:]' | tr ' ' '-')
+[ -z "$MEMBER" ] && MEMBER="unknown"
# Parse git remote origin
REMOTE=$(git remote get-url origin 2>/dev/null)
if [ -z "$REMOTE" ]; then
if [ -d "$CF_DIR" ]; then
exit 0
fi
-cat <<'EOF'
+cat <<EOF
{
-"systemMessage": "No .codeflash/ directory found and no git remote origin to infer org/project. Ask the user for the organization and project name, then run: bash scripts/scaffold.sh <org> <project> .codeflash/<org>/<project>"
+"systemMessage": "No .codeflash/ directory found and no git remote origin to infer org/project. Ask the user for the organization and project name, then run: bash scripts/scaffold.sh <org> <project> .codeflash/$MEMBER/<org>/<project>"
}
EOF
exit 0
@@ -51,15 +55,15 @@ if [ -z "$ORG" ] || [ -z "$PROJECT" ]; then
if [ -d "$CF_DIR" ]; then
exit 0
fi
-cat <<'EOF'
+cat <<EOF
{
-"systemMessage": "No .codeflash/ directory found. Could not parse org/project from git remote. Ask the user for the organization and project name, then run: bash scripts/scaffold.sh <org> <project> .codeflash/<org>/<project>"
+"systemMessage": "No .codeflash/ directory found. Could not parse org/project from git remote. Ask the user for the organization and project name, then run: bash scripts/scaffold.sh <org> <project> .codeflash/$MEMBER/<org>/<project>"
}
EOF
exit 0
fi
-PROJECT_DIR="$CF_DIR/$ORG/$PROJECT"
+PROJECT_DIR="$CF_DIR/$MEMBER/$ORG/$PROJECT"
# Skip bootstrap when working on the agent repo itself
if [ "$ORG" = "codeflash-ai" ] && [ "$PROJECT" = "codeflash-agent" ]; then
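
The `MEMBER` derivation in this hook lowercases the Git user name and replaces spaces with hyphens. A name containing `/` or other path-unsafe characters would still end up in the directory path. A minimal sketch of a stricter slug follows, assuming that behaviour is wanted; `member_slug` is a hypothetical helper, not part of this commit.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): turn a Git user name into
# a directory-safe slug, falling back to "unknown" when nothing is left.
member_slug() {
    local name="$1" slug
    slug=$(printf '%s' "$name" \
        | tr '[:upper:]' '[:lower:]' \
        | tr -c 'a-z0-9._-' '-' \
        | sed 's/--*/-/g; s/^-//; s/-$//')   # squeeze and trim hyphens
    if [ -z "$slug" ]; then
        slug="unknown"
    fi
    printf '%s\n' "$slug"
}
```

In the hook, this could replace the two `tr` calls with `MEMBER=$(member_slug "$(git config user.name 2>/dev/null)")`. For names made only of letters and single spaces, the result matches what the committed pipeline produces.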

@@ -2,7 +2,7 @@
## Location
-Active optimization data lives in `.codeflash/{org}/{project}/` on main. Summaries are built into `case-studies/{org}/{project}/`.
+Active optimization data lives in `.codeflash/{teammember}/{org}/{project}/` on main. Summaries are built into `case-studies/{org}/{project}/`.
## Status tracking

@@ -0,0 +1,27 @@
+# Optimization PR Stack Handoff
+## PR Stack (order of merge)
+| PR | Title | Status |
+|----|-------|--------|
+| #231 | perf: Use `executemany` for SQLite batch inserts | Merged |
+| #229 | perf: Batch term collection in indexing pipeline | Merged |
+| #230 | perf: Batch SQLite INSERTs for indexing pipeline | Awaiting re-review |
+| #232 | perf: Batch metadata query | Blocked on #230 |
+## What was done in #230
+- Addressed all 10 review comments from @bmerkle
+- Deleted `text_range_from_location` (was identical to `text_range_from_message_chunk` from messageutils)
+- Deleted `add_entity_to_index`, `add_action_to_index`, `add_topic_to_index` (duplicates of `add_entity`, `add_action`, `add_topic`)
+- Updated all callers (tests) to use the unified functions directly
+- Fixed pre-existing bug: `inverse_actions` were silently skipped in both the batch path (`add_metadata_to_index_from_list`) and the async iterator path (`add_metadata_to_index`)
+- Moved inline imports to top-level in `sqlite/propindex.py` per AGENTS.md guidelines
+- Summary comment posted on #230, heads-up comment posted on #232
+## What's next
+1. Wait for @bmerkle / @gvanrossum to re-review #230
+2. Once #230 merges, rebase #232 (`perf/batch-metadata-query`) on `main` and resolve conflicts
+   - #232 will have conflicts in `semrefindex.py` due to our function renames and deduplication
+3. CI checks on #230 were still queued at last check — monitor for results

@@ -4,15 +4,16 @@ Monorepo for the Codeflash optimization platform: Python packages, Claude Code p
## Case Studies
-Active case study data lives in `.codeflash/{org}/{project}/` (status, bench scripts, raw data, VM infra). Summaries are built out of `.codeflash/` into `case-studies/{org}/{project}/`.
+Active case study data lives in `.codeflash/{teammember}/{org}/{project}/` (status, bench scripts, raw data, VM infra). Summaries are built out of `.codeflash/` into `case-studies/{org}/{project}/`.
-Active case studies in `.codeflash/`:
+Active case studies in `.codeflash/krrt7/`:
- `microsoft/typeagent`
- `unstructured/core-product`
- `netflix/metaflow`
- `coveragepy/coveragepy`
- `textualize/rich`
-- `pypa/pip`
+- `python/pip`
- `odoo`
### Directory conventions
@@ -28,8 +29,8 @@ Target repos live in `~/Desktop/work/{org}_org/{project}`:
2. **Run tests locally** to verify nothing breaks
3. **Commit and push** to the fork
4. **Benchmark on the VM** via `ssh -A azureuser@<ip> "cd ~/<project> && git fetch origin && ..."`
-5. **Record results** in `.codeflash/{org}/{project}/data/results.tsv`
-6. **Update status.md** in `.codeflash/{org}/{project}/`
+5. **Record results** in `.codeflash/{teammember}/{org}/{project}/data/results.tsv`
+6. **Update status.md** in `.codeflash/{teammember}/{org}/{project}/`
7. **Open a PR** on the fork with VM benchmark numbers
### VM access

@@ -176,11 +176,11 @@ plugin/ # Claude Code plugin (self-contained, multi-langu
languages/python/ # Python domain agents, skills, references
languages/javascript/ # JavaScript domain agents, skills, references
-.codeflash/ # active optimization data (org-grouped)
-textualize/rich/ # 2x Rich import speedup
-pypa/pip/ # 7x pip --version, 1.81x resolver
-microsoft/typeagent/ # Structured RAG optimization
-<org>/<project>/ # new optimization targets
+.codeflash/ # active optimization data (teammember/org/project)
+krrt7/textualize/rich/ # 2x Rich import speedup
+krrt7/python/pip/ # 7x pip --version, 1.81x resolver
+krrt7/microsoft/typeagent/ # Structured RAG optimization
+<member>/<org>/<project>/ # new optimization targets
case-studies/ # summaries built from .codeflash/
scripts/ # scaffold scripts
@@ -216,7 +216,7 @@ make bootstrap ORG=unstructured PROJECTS="unstructured unstructured-inference co
This creates:
```
-.codeflash/<org>/<project>/
+.codeflash/<member>/<org>/<project>/
├── README.md # results, what changed, methodology (from template)
├── bench/ # add your benchmark scripts here
├── data/ # save raw benchmark data here
@@ -243,7 +243,7 @@ The cloud-init template includes examples for Python, Rust, Go, Node.js, and Jav
Each project gets a `vm-manage.sh` for the benchmark VM:
```bash
-cd .codeflash/<org>/<project>
+cd .codeflash/<member>/<org>/<project>
bash infra/vm-manage.sh create # provision VM with cloud-init
bash infra/vm-manage.sh bench main # run benchmarks on a branch
bash infra/vm-manage.sh ssh # SSH into VM
@@ -254,5 +254,5 @@ bash infra/vm-manage.sh destroy # delete everything
### Examples
Use the existing projects as templates:
-- [Rich](.codeflash/textualize/rich/) — focused scope, 2 PRs, import + runtime micro-opts
-- [pip](.codeflash/pypa/pip/) — large scope, 122 commits across 8 categories
+- [Rich](.codeflash/krrt7/textualize/rich/) — focused scope, 2 PRs, import + runtime micro-opts
+- [pip](.codeflash/krrt7/python/pip/) — large scope, 122 commits across 8 categories

@@ -3,7 +3,7 @@
#
# Two modes:
# Dogfood: Plugin loaded from codeflash-agent/dist/ (--plugin-dir).
-# Redirects .codeflash/ → agent repo's .codeflash/{org}/{project}/ via symlink.
+# Redirects .codeflash/ → agent repo's .codeflash/{teammember}/{org}/{project}/ via symlink.
# Normal: Plugin installed normally.
# Creates .codeflash/ in the project directory.
@@ -27,6 +27,10 @@ PROJECT=$(echo "$PATH_PART" | cut -d'/' -f2 | tr '[:upper:]' '[:lower:]')
[ -z "$ORG" ] || [ -z "$PROJECT" ] && exit 0
+# Derive team member from git user (lowercase, no spaces)
+MEMBER=$(git config user.name 2>/dev/null | tr '[:upper:]' '[:lower:]' | tr ' ' '-')
+[ -z "$MEMBER" ] && MEMBER="unknown"
# Detect dogfood mode: CLAUDE_PLUGIN_ROOT's parent has .codeflash/ (the agent repo)
AGENT_REPO=""
if [ -n "$CLAUDE_PLUGIN_ROOT" ]; then
@@ -38,7 +42,7 @@ fi
if [ -n "$AGENT_REPO" ]; then
# --- Dogfood mode ---
-TARGET="$AGENT_REPO/.codeflash/$ORG/$PROJECT"
+TARGET="$AGENT_REPO/.codeflash/$MEMBER/$ORG/$PROJECT"
mkdir -p "$TARGET"
if [ ! -e "$CLAUDE_PROJECT_DIR/.codeflash" ]; then
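
The hunk ends before the body of this check is visible. As a rough illustration of the symlink redirect that the file header describes, here is a minimal sketch. It is an assumption about how such a redirect could work, not the committed code, and `link_codeflash` is a hypothetical helper.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): point <project_dir>/.codeflash
# at <target>, creating the target first and leaving any existing
# .codeflash entry (file, directory, or symlink) untouched.
link_codeflash() {
    local target="$1" project_dir="$2"
    mkdir -p "$target"
    if [ ! -e "$project_dir/.codeflash" ] && [ ! -L "$project_dir/.codeflash" ]; then
        ln -s "$target" "$project_dir/.codeflash"
    fi
}
```

The extra `-L` test is a deliberate choice in this sketch: `-e` follows symlinks, so a dangling `.codeflash` link would otherwise look absent and `ln -s` would fail on it.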

@@ -192,7 +192,7 @@ Optimizer resumes: Reads compacted summary, can't find branch name
```
Teammate: Check available context in order:
-Option A: Read .codeflash/{org}/{project}/status.md
+Option A: Read .codeflash/{teammember}/{org}/{project}/status.md
Should say "Current branch: perf/batch-size"
Should say "Latest: 40% throughput gain, ready for benchmark"
@@ -217,7 +217,7 @@ git branch -v
git log -1 --format=fuller
# Confirm results exist
-ls -la .codeflash/{org}/{project}/data/
+ls -la .codeflash/{teammember}/{org}/{project}/data/
# Read what lead said in last interaction
# (Look for approval, next steps in task description)
@@ -245,7 +245,7 @@ Now that context restored, proceed normally
- **Configure PreCompact hook** to snapshot before compaction
- **Configure SessionStart hook** to inject status after compaction
- **Keep MEMORY.md updated** with current branch and findings as you work
-- **Update .codeflash/{org}/{project}/status.md** regularly, not just at handoff
+- **Update .codeflash/{teammember}/{org}/{project}/status.md** regularly, not just at handoff
---
@@ -350,8 +350,8 @@ Optimizer: Compacted summary says "bottleneck found" but no numbers
**Step 1: Check what was actually lost** (2 min)
```bash
# Do the files still exist?
-ls -la .codeflash/{org}/{project}/data/profile.json
-cat .codeflash/{org}/{project}/data/profile.json | head
+ls -la .codeflash/{teammember}/{org}/{project}/data/profile.json
+cat .codeflash/{teammember}/{org}/{project}/data/profile.json | head
# Is it in teammate MEMORY.md?
grep -A 5 "O(n²)" MEMORY.md

@@ -180,7 +180,7 @@ Result: Wrong approach, wastes benchmarking time/VM cost, leads to dead-end bran
| 4 teammates | 60k | 60-120k | 120-180k | Multi-project or multi-layer work |
| 5 teammates | 75k | 80-150k | 155-225k | Large exploration, competing hypotheses |
-**Cost tracking rule**: If team approach > lead-only, switch back to single session. Use `.codeflash/{org}/{project}/metrics.tsv` to track.
+**Cost tracking rule**: If team approach > lead-only, switch back to single session. Use `.codeflash/{teammember}/{org}/{project}/metrics.tsv` to track.
## Decision Tree

@@ -7,7 +7,7 @@
#
# Called by:
# - make bootstrap ORG=roboflow PROJECTS="supervision"
-# - .claude/hooks/session-start.sh (auto-scaffolds .codeflash/{org}/{project}/)
+# - .claude/hooks/session-start.sh (auto-scaffolds .codeflash/{teammember}/{org}/{project}/)
set -euo pipefail