codeflash-agent/plugin/ARCHITECTURE.md

# Plugin Architecture & Execution Order
## Lifecycle
1. **SessionStart hook** — initializes Codex session state
2. **User triggers** `/codeflash-optimize start` (skill)
3. **Language router** (`codeflash`) — detects project language, delegates to language-specific router
4. **Language-specific router** (e.g., `codeflash-python`) — detects domain, asks user questions, launches setup
5. **Setup agent** (e.g., `codeflash-setup`) — detects env, installs deps/profilers, writes `.codeflash/setup.md`
6. **Router validates** setup, runs test suite, researches deps via context7
7. **Router creates team** and dispatches optimizer agent
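
The language-detection step in the lifecycle above can be pictured with a small sketch. The marker files, the `detect_language` and `route` helpers, and the `codeflash-javascript` name are assumptions for illustration only; the actual router's detection rules are not described in this document.

```python
from pathlib import Path

# Hypothetical marker files per language (not taken from the plugin itself).
LANGUAGE_MARKERS = {
    "python": ("pyproject.toml", "setup.py", "requirements.txt"),
    "javascript": ("package.json",),
}


def detect_language(project_root):
    """Return the first language whose marker file exists, else None.

    Languages are checked in dict order, so a project containing both
    pyproject.toml and package.json is treated as Python in this sketch.
    """
    root = Path(project_root)
    for language, markers in LANGUAGE_MARKERS.items():
        if any((root / name).is_file() for name in markers):
            return language
    return None


def route(project_root):
    """Map the detected language to a language-specific router name."""
    language = detect_language(project_root)
    if language is None:
        raise ValueError(f"no supported language detected in {project_root}")
    return f"codeflash-{language}"
```

For a project containing `pyproject.toml`, `route` returns `codeflash-python`, the language-specific router named in step 4.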
## Optimization Loop
8. **Optimizer** (`codeflash-deep` or domain-specific: `-cpu`, `-memory`, `-async`, `-structure`) — profiles all dimensions, ranks targets
9. **Researcher** (`codeflash-researcher`) — launched alongside to analyze targets in parallel, sends findings back to optimizer
10. **Experiment cycle**: profile → reason → implement → test → benchmark → keep/discard → commit → re-profile → repeat
11. **Plateau detection** (3+ consecutive discards) → optimizer sends `[complete]`
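
As a rough sketch of the plateau rule above, the loop below counts consecutive discards and stops once the streak reaches three. The `run_experiments` helper and the `"exhausted"` status are hypothetical; each string in `outcomes` stands in for the result of one full profile-to-benchmark cycle.

```python
PLATEAU_THRESHOLD = 3  # consecutive discards that end the session


def run_experiments(outcomes, threshold=PLATEAU_THRESHOLD):
    """Consume "keep"/"discard" outcomes until a plateau is reached.

    Returns (kept_count, status), where status is "[complete]" when the
    plateau rule fired and "exhausted" when the outcomes ran out first.
    """
    kept = 0
    consecutive_discards = 0
    for outcome in outcomes:
        if outcome == "keep":
            kept += 1
            consecutive_discards = 0  # a KEEP resets the discard streak
        else:
            consecutive_discards += 1
        if consecutive_discards >= threshold:
            return kept, "[complete]"
    return kept, "exhausted"
```

Resetting the counter on every keep means the session only ends after a sustained run of failed experiments, so a single discard between two successful ones does not count toward the plateau.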
## Review Gate
12. **Review agent** (`codeflash-review`) — 6-pass deep review (comprehension → correctness → safety → benchmark verification → quality → disclosure)
13. Writes `.codeflash/review-report.md` with verdict (APPROVE/REQUEST CHANGES/BLOCK)
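
One way to picture how six passes could collapse into a single verdict is sketched below. The aggregation rule (the most severe per-pass verdict wins) and the `overall_verdict` helper are assumptions; this document does not say how `codeflash-review` combines its passes.

```python
from enum import IntEnum

# Pass names in the order listed for the review agent.
PASSES = (
    "comprehension",
    "correctness",
    "safety",
    "benchmark verification",
    "quality",
    "disclosure",
)


class Verdict(IntEnum):
    APPROVE = 0
    REQUEST_CHANGES = 1
    BLOCK = 2


def overall_verdict(pass_verdicts):
    """Assumed rule: the overall verdict is the most severe per-pass verdict."""
    missing = [name for name in PASSES if name not in pass_verdicts]
    if missing:
        raise ValueError(f"passes not run: {missing}")
    return max(pass_verdicts[name] for name in PASSES)
```

Under this assumed rule, a single `BLOCK` from any pass blocks the whole review, and the report can only say `APPROVE` when all six passes approve.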
## Cleanup
14. Router shuts down teammates, deletes team
15. Preserves `learnings.md`, `results.tsv`, `changelog.md`; deletes temp files
16. **SessionEnd hook** — finalizes Codex session
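
The preserve-and-delete step above might look roughly like the following. Which files count as temporary is not specified here, so the `*.tmp` pattern and the `cleanup_state_dir` helper are purely illustrative; only the three preserved file names come from the list above.

```python
from pathlib import Path

# Files the cleanup step keeps across sessions.
PRESERVED = {"learnings.md", "results.tsv", "changelog.md"}


def cleanup_state_dir(state_dir, temp_patterns=("*.tmp",)):
    """Delete files matching temp_patterns, never touching preserved files.

    Returns the names of the files that were removed.
    """
    root = Path(state_dir)
    removed = []
    for pattern in temp_patterns:
        for path in sorted(root.glob(pattern)):
            if path.is_file() and path.name not in PRESERVED:
                path.unlink()
                removed.append(path.name)
    return removed
```

Checking against `PRESERVED` even when a pattern matches keeps an overly broad pattern such as `*` from deleting the cross-session files.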
## Hooks
Hooks are defined in `plugin/hooks/hooks.json` and fire at session boundaries:
| Hook | When | What |
|------|------|------|
| **SessionStart** | New Claude session begins | Initializes Codex session state, records metadata |
| **SessionEnd** | Session ends | Cleans up Codex jobs, saves final state |
| **Stop** | User clicks Stop (900s timeout) | Optionally runs Codex adversarial review gate before allowing termination |
## Agents
### Language-agnostic (`plugin/agents/`)
| Agent | Role | Triggered by |
|-------|------|-------------|
| `codeflash` | Language router — detects language, delegates to language-specific router | `/codeflash-optimize` skill, user request |
| `codeflash-researcher` | Read-only research teammate | Domain agents, after baseline profiling |
| `codeflash-review` | Independent 6-pass deep review | `/codex-review`, post-optimization gate |
### Python-specific (`plugin/languages/python/agents/`)
| Agent | Role | Triggered by |
|-------|------|-------------|
| `codeflash-python` | Python domain router/team lead — orchestrates Python sessions | Language router after detecting Python |
| `codeflash-setup` | Environment detection & preparation | Python router, before first optimization |
| `codeflash-scan` | Quick cross-domain diagnosis | `/codeflash-optimize scan` or router recon |
| `codeflash-deep` | Primary optimizer (all dimensions) | Python router (default unless single-domain requested) |
| `codeflash-cpu` | CPU/runtime specialist | Python router or deep agent dispatch |
| `codeflash-memory` | Memory specialist | Python router or deep agent dispatch |
| `codeflash-async` | Async/concurrency specialist | Python router or deep agent dispatch |
| `codeflash-structure` | Import-time/module structure specialist | Python router or deep agent dispatch |
| `codeflash-ci` | CI mode agent for GitHub webhooks | CI service |
| `codeflash-pr-prep` | PR preparation agent | Post-session |
## Commands (`plugin/commands/`)
Users can invoke these at any time:
| Command | Purpose |
|---------|---------|
| `/codex-review` | Manual adversarial review via Codex companion |
| `/codex-setup` | Check/install Codex CLI, configure review gate |
| `/codex-status` | Check active and recent Codex jobs |
## Skills (`plugin/languages/python/skills/`)
| Skill | Purpose |
|-------|---------|
| `codeflash-optimize` | Entry point: `start\|resume\|status\|scan\|review` |
| `memray-profiling` | Advanced memory profiling utilities (used by codeflash-memory) |
## References
### Language-agnostic (`plugin/references/shared/`)
Methodology, templates, and frameworks that apply to any language:
| File | Purpose |
|------|---------|
| `agent-base-protocol.md` | Shared operational rules (experiment discipline, commit rules, stuck recovery) |
| `experiment-loop-base.md` | Shared experiment loop framework (keep/discard tree, guard, plateau) |
| `pre-submit-review.md` | Shared pre-submit checklist (resource ownership, concurrency, correctness) |
| `e2e-benchmarks.md` | Two-phase measurement concept (micro-benchmark → E2E) |
| `micro-benchmark.md` | A/B pre-screen pattern |
| `pr-body-templates.md` | Generic PR body structure and writing guidelines |
| `pr-preparation.md` | PR workflow (inventory, folding, conventions) |
| `adversarial-review.md` | Codex adversarial review methodology |
| `changelog-template.md` | Changelog generation structure |
| `handoff-template.md` | HANDOFF.md template |
| `learnings-template.md` | Cross-session learnings template |
### Python-specific (`plugin/languages/python/references/`)
Python implementations of shared protocols, plus domain-specific deep-dive docs:
| File/Dir | Purpose |
|----------|---------|
| `agent-base-protocol.md` | Python profilers (cProfile, tracemalloc, memray), test runners, package managers |
| `e2e-benchmarks.md` | `codeflash compare` usage, pytest-benchmark, fallback tools |
| `micro-benchmark.md` | Python A/B template (timeit, memray, asyncio), domain thresholds |
| `pre-submit-review.md` | Python checks (asyncio, .pyc, os.environ, monkey-patching) |
| `pr-body-templates.md` | Python PR variants (codeflash compare output, memray memory table) |
| `unified-profiling-script.py` | CPU+memory+GC profiling script for deep agent |
| `library-replacement.md` | Library boundary breaking guide |
| `async/` | Async domain: asyncio patterns, blocking detection, concurrency |
| `data-structures/` | CPU domain: containers, algorithms, bytecode, stdlib |
| `memory/` | Memory domain: tracemalloc, memray, leak detection, framework leaks |
| `structure/` | Structure domain: import time, module decomposition, circular deps |
## State Files
Created during execution in `.codeflash/`:
| File | Created by | Purpose |
|------|-----------|---------|
| `setup.md` | codeflash-setup | Environment summary |
| `scan-report.md` | codeflash-scan | Ranked targets + domain recommendations |
| `results.tsv` | optimizer agents | Experiment log (baseline, speedup, keep/discard) |
| `HANDOFF.md` | optimizer agents | Session state for resume |
| `conventions.md` | router | Binding constraints from maintainer feedback |
| `learnings.md` | router | Cross-session discoveries |
| `review-report.md` | codeflash-review | 6-pass review findings + verdict |
| `changelog.md` | router | PR-ready optimization summary |
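
To make the experiment log in the table above more concrete, here is a minimal sketch of appending one row to a tab-separated log. The column layout, the `append_result` helper, and the `min_speedup` threshold are assumptions; the table only says that `results.tsv` records baseline, speedup, and keep/discard.

```python
import csv
import io

# Assumed column order; the real results.tsv layout is not documented here.
FIELDS = ["target", "baseline_s", "candidate_s", "speedup", "decision"]


def append_result(stream, target, baseline_s, candidate_s, min_speedup=1.05):
    """Write one experiment row to a TSV stream and return the decision."""
    speedup = baseline_s / candidate_s
    decision = "keep" if speedup >= min_speedup else "discard"
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(
        [target, f"{baseline_s:.4f}", f"{candidate_s:.4f}", f"{speedup:.2f}", decision]
    )
    return decision


# Usage: log to an in-memory buffer; a real session would open the file in append mode.
log = io.StringIO()
append_result(log, "parse_rows", baseline_s=2.0, candidate_s=1.0)
```

Keeping the log as plain TSV means a resumed session can re-read it with `csv.reader` to see which experiments were already tried.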
## Ordering Guarantees
**Sequential:**
1. SessionStart hook fires before any agent acts
2. Language detection before domain routing
3. Setup agent completes before domain agents start
4. Baseline profiling before any optimization experiment
5. Re-profiling after every KEEP to update rankings
6. Review gate runs after optimizer `[complete]`, before cleanup
7. SessionEnd hook fires as session terminates
**Parallel allowed:**
- Researcher analyzes targets #2-5 while optimizer works on target #1
- Multiple domain agents can run in separate worktrees
- Deep agent can dispatch domain agents while continuing its own profiling
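
The first parallel case above (the researcher looking ahead at targets #2-5 while the optimizer handles #1) can be sketched with a thread pool. The `research`, `optimize`, and `run_round` functions are placeholders; the real agents are separate teammates, not threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor


def research(target):
    """Placeholder for read-only analysis of one target."""
    return f"findings for {target}"


def optimize(target):
    """Placeholder for one optimization attempt on the top-ranked target."""
    return f"optimized {target}"


def run_round(ranked_targets):
    """Optimize target #1 while targets #2-5 are researched in the background."""
    head, lookahead = ranked_targets[0], ranked_targets[1:5]
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {t: pool.submit(research, t) for t in lookahead}
        result = optimize(head)  # runs in the calling thread, alongside research
        findings = {t: f.result() for t, f in futures.items()}
    return result, findings
```

Because the research step is read-only, it can overlap with the optimizer's edits without conflicting writes. That is the property that makes this pairing safe to run in parallel.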
## Assembly
`make build-plugin` merges `plugin/` (the base, excluding `languages/`) with `plugin/languages/python/` (the overlay) into `dist/`. Set `LANG=javascript` to build the JavaScript variant instead. Agent files use `${CLAUDE_PLUGIN_ROOT}` for references, because paths differ between the source tree and the assembled output.
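
The base-plus-overlay merge can be approximated in a few lines. This is a sketch of the idea, not the contents of the Makefile: the `build_plugin` function and its collision behaviour (overlay files overwrite base files with the same relative path) are assumptions.

```python
import shutil
from pathlib import Path


def build_plugin(plugin_dir, dist_dir, lang="python"):
    """Copy the base plugin tree, then layer one language overlay on top."""
    plugin_dir, dist_dir = Path(plugin_dir), Path(dist_dir)
    overlay = plugin_dir / "languages" / lang
    if not overlay.is_dir():
        raise FileNotFoundError(overlay)
    if dist_dir.exists():
        shutil.rmtree(dist_dir)  # start from a clean output directory
    # Base layer: everything under plugin/ except entries named "languages".
    shutil.copytree(plugin_dir, dist_dir, ignore=shutil.ignore_patterns("languages"))
    # Overlay layer: merged into the same tree (requires Python 3.8+).
    shutil.copytree(overlay, dist_dir, dirs_exist_ok=True)
    return dist_dir
```

After such a merge, `plugin/agents/codeflash.md` and `plugin/languages/python/agents/codeflash-python.md` would sit side by side under `dist/agents/`, which is why references inside agent files cannot rely on source-tree paths.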