Plugin Architecture & Execution Order
Lifecycle
- SessionStart hook: initializes Codex session state
- User triggers /codeflash-optimize start (skill)
- Language router (codeflash): detects project language, delegates to language-specific router
- Language-specific router (e.g., codeflash-python): detects domain, asks user questions, launches setup
- Setup agent (e.g., codeflash-setup): detects env, installs deps/profilers, writes .codeflash/setup.md
- Router validates setup, runs test suite, researches deps via context7
- Router creates team and dispatches optimizer agent
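The language-routing step can be pictured with a small sketch. This is illustrative only: the marker files, the `detect_language` and `route` helpers, and the `codeflash-javascript` router name are assumptions, since the document names only `codeflash-python` and does not describe how detection works.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker files; the plugin's real detection rules are not documented here.
MARKERS = {
    "python": ("pyproject.toml", "setup.py", "requirements.txt"),
    "javascript": ("package.json",),
}


def detect_language(project_root) -> Optional[str]:
    """Return the first language whose marker file exists in the project root."""
    root = Path(project_root)
    for lang, files in MARKERS.items():
        if any((root / name).is_file() for name in files):
            return lang
    return None


def route(project_root) -> Optional[str]:
    """Map the detected language to a language-specific router name (assumed naming)."""
    lang = detect_language(project_root)
    return f"codeflash-{lang}" if lang else None
```

A project with no recognized marker yields `None`, which a router could treat as a prompt to ask the user rather than guess.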
Optimization Loop
- Optimizer (codeflash-deep or domain-specific: -cpu, -memory, -async, -structure): profiles all dimensions, ranks targets
- Researcher (codeflash-researcher): launched alongside to analyze targets in parallel, sends findings back to optimizer
- Experiment cycle: profile → reason → implement → test → benchmark → keep/discard → commit → re-profile → repeat
- Plateau detection (3+ consecutive discards) → optimizer sends [complete]
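The keep/discard decision and the plateau rule can be sketched as follows. This is a minimal model, not the agents' actual logic: the `Experiment` record, the `run_loop` helper, the `min_speedup` threshold, and the `"exhausted"` status are assumptions. Only the three-consecutive-discards rule and the `[complete]` signal come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

PLATEAU_DISCARDS = 3  # from the text: 3+ consecutive discards ends the loop


@dataclass
class Experiment:
    """One candidate change; `speedup` is the measured ratio against the current best."""
    name: str
    tests_pass: bool
    speedup: float


def run_loop(candidates: Iterable[Experiment],
             min_speedup: float = 1.05) -> Tuple[List[str], str]:
    """Return (kept experiment names, final status).

    `min_speedup` is a hypothetical keep threshold. Status is "[complete]"
    when the plateau rule fires, otherwise "exhausted".
    """
    kept: List[str] = []
    consecutive_discards = 0
    for exp in candidates:
        if exp.tests_pass and exp.speedup >= min_speedup:
            kept.append(exp.name)      # KEEP: would be committed, then re-profiled
            consecutive_discards = 0   # a keep resets the plateau counter
        else:
            consecutive_discards += 1  # DISCARD: change is reverted
            if consecutive_discards >= PLATEAU_DISCARDS:
                return kept, "[complete]"
    return kept, "exhausted"
```

Resetting the counter on every keep is what makes the discards "consecutive": a single successful experiment gives the optimizer a fresh budget of attempts.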
Review Gate
- Review agent (codeflash-review): 6-pass deep review (comprehension → correctness → safety → benchmark verification → quality → disclosure)
- Writes .codeflash/review-report.md with verdict (APPROVE/REQUEST CHANGES/BLOCK)
Cleanup
- Router shuts down teammates, deletes team
- Preserves learnings.md, results.tsv, changelog.md; deletes temp files
- SessionEnd hook: finalizes Codex session
Hooks
Hooks are defined in plugin/hooks/hooks.json and fire at session boundaries:
| Hook | When | What |
|---|---|---|
| SessionStart | New Claude session begins | Initializes Codex session state, records metadata |
| SessionEnd | Session ends | Cleans up Codex jobs, saves final state |
| Stop | User clicks Stop (900s timeout) | Optionally runs Codex adversarial review gate before allowing termination |
Agents
Language-agnostic (plugin/agents/)
| Agent | Role | Triggered by |
|---|---|---|
| codeflash | Language router: detects language, delegates to language-specific router | /codeflash-optimize skill, user request |
| codeflash-researcher | Read-only research teammate | Domain agents, after baseline profiling |
| codeflash-review | Independent 6-pass deep review | /codex-review, post-optimization gate |
Python-specific (plugin/languages/python/agents/)
| Agent | Role | Triggered by |
|---|---|---|
| codeflash-python | Python domain router/team lead: orchestrates Python sessions | Language router after detecting Python |
| codeflash-setup | Environment detection & preparation | Python router, before first optimization |
| codeflash-scan | Quick cross-domain diagnosis | /codeflash-optimize scan or router recon |
| codeflash-deep | Primary optimizer (all dimensions) | Python router (default unless single-domain requested) |
| codeflash-cpu | CPU/runtime specialist | Python router or deep agent dispatch |
| codeflash-memory | Memory specialist | Python router or deep agent dispatch |
| codeflash-async | Async/concurrency specialist | Python router or deep agent dispatch |
| codeflash-structure | Import-time/module structure specialist | Python router or deep agent dispatch |
| codeflash-ci | CI mode agent for GitHub webhooks | CI service |
| codeflash-pr-prep | PR preparation agent | Post-session |
Commands (plugin/commands/)
User-invocable anytime:
| Command | Purpose |
|---|---|
| /codex-review | Manual adversarial review via Codex companion |
| /codex-setup | Check/install Codex CLI, configure review gate |
| /codex-status | Check active and recent Codex jobs |
Skills (plugin/languages/python/skills/)
| Skill | Purpose |
|---|---|
| codeflash-optimize | Entry point: start\|resume\|status\|scan\|review |
| memray-profiling | Advanced memory profiling utilities (used by codeflash-memory) |
References
Language-agnostic (plugin/references/shared/)
Methodology, templates, and frameworks that apply to any language:
| File | Purpose |
|---|---|
| agent-base-protocol.md | Shared operational rules (experiment discipline, commit rules, stuck recovery) |
| experiment-loop-base.md | Shared experiment loop framework (keep/discard tree, guard, plateau) |
| pre-submit-review.md | Shared pre-submit checklist (resource ownership, concurrency, correctness) |
| e2e-benchmarks.md | Two-phase measurement concept (micro-benchmark → E2E) |
| micro-benchmark.md | A/B pre-screen pattern |
| pr-body-templates.md | Generic PR body structure and writing guidelines |
| pr-preparation.md | PR workflow (inventory, folding, conventions) |
| adversarial-review.md | Codex adversarial review methodology |
| changelog-template.md | Changelog generation structure |
| handoff-template.md | HANDOFF.md template |
| learnings-template.md | Cross-session learnings template |
Python-specific (plugin/languages/python/references/)
Python implementations of shared protocols, plus domain-specific deep-dive docs:
| File/Dir | Purpose |
|---|---|
| agent-base-protocol.md | Python profilers (cProfile, tracemalloc, memray), test runners, package managers |
| e2e-benchmarks.md | codeflash compare usage, pytest-benchmark, fallback tools |
| micro-benchmark.md | Python A/B template (timeit, memray, asyncio), domain thresholds |
| pre-submit-review.md | Python checks (asyncio, .pyc, os.environ, monkey-patching) |
| pr-body-templates.md | Python PR variants (codeflash compare output, memray memory table) |
| unified-profiling-script.py | CPU+memory+GC profiling script for deep agent |
| library-replacement.md | Library boundary breaking guide |
| async/ | Async domain: asyncio patterns, blocking detection, concurrency |
| data-structures/ | CPU domain: containers, algorithms, bytecode, stdlib |
| memory/ | Memory domain: tracemalloc, memray, leak detection, framework leaks |
| structure/ | Structure domain: import time, module decomposition, circular deps |
State Files
Created during execution in .codeflash/:
| File | Created by | Purpose |
|---|---|---|
| setup.md | codeflash-setup | Environment summary |
| scan-report.md | codeflash-scan | Ranked targets + domain recommendations |
| results.tsv | optimizer agents | Experiment log (baseline, speedup, keep/discard) |
| HANDOFF.md | optimizer agents | Session state for resume |
| conventions.md | router | Binding constraints from maintainer feedback |
| learnings.md | router | Cross-session discoveries |
| review-report.md | codeflash-review | 6-pass review findings + verdict |
| changelog.md | router | PR-ready optimization summary |
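As an illustration of how an experiment log like results.tsv could be written and read, here is a sketch using Python's standard `csv` module. The column names in `FIELDS` and the `append_result` and `kept_targets` helpers are hypothetical. The table above states only that the log records baseline, speedup, and keep/discard, so the real file layout may differ.

```python
import csv
from pathlib import Path
from typing import List

# Assumed column names; the real results.tsv schema is not specified in this document.
FIELDS = ["target", "baseline_s", "speedup", "decision"]


def append_result(path, target: str, baseline_s: float,
                  speedup: float, decision: str) -> None:
    """Append one experiment row, writing the header if the file is new."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    with path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, delimiter="\t")
        if is_new:
            writer.writeheader()
        writer.writerow({
            "target": target,
            "baseline_s": baseline_s,
            "speedup": speedup,
            "decision": decision,
        })


def kept_targets(path) -> List[str]:
    """Return the targets whose experiments were kept, in log order."""
    with Path(path).open(newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        return [row["target"] for row in reader if row["decision"] == "keep"]
```

An append-only log of this shape fits the resume flow: a later session can replay the file to see which targets were already tried and which changes survived.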
Ordering Guarantees
Sequential:
- SessionStart hook fires before any agent acts
- Language detection before domain routing
- Setup agent completes before domain agents start
- Baseline profiling before any optimization experiment
- Re-profiling after every KEEP to update rankings
- Review gate runs after optimizer [complete], before cleanup
- SessionEnd hook fires as session terminates
Parallel allowed:
- Researcher analyzes targets #2-5 while optimizer works on target #1
- Multiple domain agents can run in separate worktrees
- Deep agent can dispatch domain agents while continuing its own profiling
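One way to reason about the sequential guarantees is to check a recorded event trace against them. The sketch below assumes a hypothetical list of event names (`SEQUENTIAL_ORDER`) and a hypothetical `ordering_violations` helper; the plugin is not documented as emitting such a trace. It covers only the one-time milestones, not the re-profile-after-every-KEEP rule, and it ignores events that may legitimately interleave, such as researcher findings.

```python
from typing import Dict, List, Tuple

# Hypothetical event names, listed in the order the sequential guarantees require.
SEQUENTIAL_ORDER = [
    "session_start",
    "language_detection",
    "domain_routing",
    "setup_complete",
    "baseline_profile",
    "first_experiment",
    "optimizer_complete",
    "review_gate",
    "cleanup",
    "session_end",
]


def ordering_violations(events: List[str]) -> List[Tuple[str, str]]:
    """Return (earlier, later) milestone pairs that appear out of order.

    Only the first occurrence of each milestone is considered. Events not
    in SEQUENTIAL_ORDER (for example parallel researcher output) are skipped.
    """
    first_seen: Dict[str, int] = {}
    for idx, name in enumerate(events):
        first_seen.setdefault(name, idx)
    present = [name for name in SEQUENTIAL_ORDER if name in first_seen]
    return [
        (a, b)
        for a, b in zip(present, present[1:])
        if first_seen[a] > first_seen[b]
    ]
```

Comparing adjacent milestones is enough here because the required order is a single chain: if every neighboring pair is in order, the whole trace is.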
Assembly
make build-plugin merges plugin/ (base, excluding languages/) + plugin/languages/python/ (overlay) into dist/. Set LANG=javascript to build for JS instead. Agent files use ${CLAUDE_PLUGIN_ROOT} for references — paths differ between source and assembled output.
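The base-plus-overlay merge can be illustrated in Python. This is a sketch of the idea only, not the Makefile's actual recipe: the `build_plugin` function and its `lang` parameter are hypothetical, and it assumes overlay files replace base files of the same name. `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def build_plugin(plugin_dir, dist_dir, lang: str = "python") -> Path:
    """Copy plugin/ (minus languages/) into dist/, then overlay languages/<lang>/."""
    plugin_dir, dist_dir = Path(plugin_dir), Path(dist_dir)
    if dist_dir.exists():
        shutil.rmtree(dist_dir)  # start from a clean output directory

    def skip_languages(src, names):
        # Ignore only the top-level languages/ tree, not nested dirs of the same name.
        return {"languages"} if Path(src) == plugin_dir else set()

    # Base layer: language-agnostic agents, commands, hooks, references.
    shutil.copytree(plugin_dir, dist_dir, ignore=skip_languages)

    # Overlay: language-specific files merge into the same directory layout.
    shutil.copytree(plugin_dir / "languages" / lang, dist_dir, dirs_exist_ok=True)
    return dist_dir
```

After a merge like this, a Python-specific agent sits next to the language-agnostic ones in the output, which is why references that are valid in the source tree need a root variable to resolve in the assembled tree.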