codeflash-admin/codeflash-agent

mirror of https://github.com/codeflash-ai/codeflash-agent.git synced 2026-05-04 18:25:19 +00:00

Kevin Turcios 3b59d97647 squash

2026-04-13 14:12:17 -05:00

Plugin Architecture & Execution Order

Lifecycle

SessionStart hook — initializes Codex session state
User triggers /codeflash-optimize start (skill)
Language router (codeflash) — detects project language, delegates to language-specific router
Language-specific router (e.g., codeflash-python) — detects domain, asks user questions, launches setup
Setup agent (e.g., codeflash-setup) — detects env, installs deps/profilers, writes .codeflash/setup.md
Router validates setup, runs test suite, researches deps via context7
Router creates team and dispatches optimizer agent

Optimization Loop

Optimizer (codeflash-deep or domain-specific: -cpu, -memory, -async, -structure) — profiles all dimensions, ranks targets
Researcher (codeflash-researcher) — launched alongside to analyze targets in parallel, sends findings back to optimizer
Experiment cycle: profile → reason → implement → test → benchmark → keep/discard → commit → re-profile → repeat
Plateau detection (3+ consecutive discards) → optimizer sends [complete]
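
The plateau rule above can be sketched in a few lines. This is a minimal illustration, not the optimizer's actual implementation: the function names and the `"keep"`/`"discard"` strings are assumptions, and only the threshold of three consecutive discards comes from the text.

```python
from typing import Iterable, List

KEEP = "keep"        # hypothetical label for a kept experiment
DISCARD = "discard"  # hypothetical label for a discarded experiment


def plateau_reached(decisions: List[str], threshold: int = 3) -> bool:
    """Return True when the last `threshold` decisions are all discards."""
    if len(decisions) < threshold:
        return False
    return all(d == DISCARD for d in decisions[-threshold:])


def run_until_plateau(outcomes: Iterable[str], threshold: int = 3) -> List[str]:
    """Consume experiment outcomes in order and stop once a plateau is hit."""
    seen: List[str] = []
    for outcome in outcomes:
        seen.append(outcome)
        if plateau_reached(seen, threshold):
            break  # the point at which the optimizer would send [complete]
    return seen
```

A KEEP resets the streak, so `["discard", "discard", "keep", "discard"]` does not count as a plateau.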

Review Gate

Review agent (codeflash-review) — 6-pass deep review (comprehension → correctness → safety → benchmark verification → quality → disclosure)
Writes .codeflash/review-report.md with verdict (APPROVE/REQUEST CHANGES/BLOCK)

Cleanup

Router shuts down teammates, deletes team
Preserves learnings.md, results.tsv, changelog.md; deletes temp files
SessionEnd hook — finalizes Codex session
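
The file-preservation step could look roughly like the sketch below. The three preserved file names come from the list above. How temp files are identified is not stated, so the `*.tmp` glob and the function name `cleanup_state_dir` are purely hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

# Files the cleanup step keeps, per the list above.
PRESERVED = {"learnings.md", "results.tsv", "changelog.md"}


def cleanup_state_dir(state_dir: Path, temp_globs: Iterable[str] = ("*.tmp",)) -> List[str]:
    """Delete files matching the (assumed) temp patterns, never a preserved file."""
    removed: List[str] = []
    for pattern in temp_globs:
        for path in sorted(Path(state_dir).glob(pattern)):
            if path.is_file() and path.name not in PRESERVED:
                path.unlink()
                removed.append(path.name)
    return removed
```

Matching on explicit patterns instead of deleting everything outside the preserved set leaves other state files, such as `review-report.md`, untouched.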

Hooks

Hooks are defined in plugin/hooks/hooks.json and fire at session boundaries:

| Hook | When | What |
| --- | --- | --- |
| SessionStart | New Claude session begins | Initializes Codex session state, records metadata |
| SessionEnd | Session ends | Cleans up Codex jobs, saves final state |
| Stop | User clicks Stop (900s timeout) | Optionally runs Codex adversarial review gate before allowing termination |

Agents

Language-agnostic (`plugin/agents/`)

| Agent | Role | Triggered by |
| --- | --- | --- |
| `codeflash` | Language router: detects language, delegates to language-specific router | `/codeflash-optimize` skill, user request |
| `codeflash-researcher` | Read-only research teammate | Domain agents, after baseline profiling |
| `codeflash-review` | Independent 6-pass deep review | `/codex-review`, post-optimization gate |

Python-specific (`plugin/languages/python/agents/`)

| Agent | Role | Triggered by |
| --- | --- | --- |
| `codeflash-python` | Python domain router/team lead; orchestrates Python sessions | Language router after detecting Python |
| `codeflash-setup` | Environment detection & preparation | Python router, before first optimization |
| `codeflash-scan` | Quick cross-domain diagnosis | `/codeflash-optimize scan` or router recon |
| `codeflash-deep` | Primary optimizer (all dimensions) | Python router (default unless single-domain requested) |
| `codeflash-cpu` | CPU/runtime specialist | Python router or deep agent dispatch |
| `codeflash-memory` | Memory specialist | Python router or deep agent dispatch |
| `codeflash-async` | Async/concurrency specialist | Python router or deep agent dispatch |
| `codeflash-structure` | Import-time/module structure specialist | Python router or deep agent dispatch |
| `codeflash-ci` | CI mode agent for GitHub webhooks | CI service |
| `codeflash-pr-prep` | PR preparation agent | Post-session |

Commands (`plugin/commands/`)

User-invocable anytime:

| Command | Purpose |
| --- | --- |
| `/codex-review` | Manual adversarial review via Codex companion |
| `/codex-setup` | Check/install Codex CLI, configure review gate |
| `/codex-status` | Check active and recent Codex jobs |

Skills (`plugin/languages/python/skills/`)

| Skill | Purpose |
| --- | --- |
| `codeflash-optimize` | Entry point: `start\|resume\|status\|scan\|review` |
| `memray-profiling` | Advanced memory profiling utilities (used by codeflash-memory) |

References

Language-agnostic (`plugin/references/shared/`)

Methodology, templates, and frameworks that apply to any language:

| File | Purpose |
| --- | --- |
| `agent-base-protocol.md` | Shared operational rules (experiment discipline, commit rules, stuck recovery) |
| `experiment-loop-base.md` | Shared experiment loop framework (keep/discard tree, guard, plateau) |
| `pre-submit-review.md` | Shared pre-submit checklist (resource ownership, concurrency, correctness) |
| `e2e-benchmarks.md` | Two-phase measurement concept (micro-benchmark → E2E) |
| `micro-benchmark.md` | A/B pre-screen pattern |
| `pr-body-templates.md` | Generic PR body structure and writing guidelines |
| `pr-preparation.md` | PR workflow (inventory, folding, conventions) |
| `adversarial-review.md` | Codex adversarial review methodology |
| `changelog-template.md` | Changelog generation structure |
| `handoff-template.md` | HANDOFF.md template |
| `learnings-template.md` | Cross-session learnings template |

Python-specific (`plugin/languages/python/references/`)

Python implementations of shared protocols, plus domain-specific deep-dive docs:

| File/Dir | Purpose |
| --- | --- |
| `agent-base-protocol.md` | Python profilers (cProfile, tracemalloc, memray), test runners, package managers |
| `e2e-benchmarks.md` | `codeflash compare` usage, pytest-benchmark, fallback tools |
| `micro-benchmark.md` | Python A/B template (timeit, memray, asyncio), domain thresholds |
| `pre-submit-review.md` | Python checks (asyncio, .pyc, os.environ, monkey-patching) |
| `pr-body-templates.md` | Python PR variants (codeflash compare output, memray memory table) |
| `unified-profiling-script.py` | CPU+memory+GC profiling script for deep agent |
| `library-replacement.md` | Library boundary breaking guide |
| `async/` | Async domain: asyncio patterns, blocking detection, concurrency |
| `data-structures/` | CPU domain: containers, algorithms, bytecode, stdlib |
| `memory/` | Memory domain: tracemalloc, memray, leak detection, framework leaks |
| `structure/` | Structure domain: import time, module decomposition, circular deps |

State Files

Created during execution in .codeflash/:

| File | Created by | Purpose |
| --- | --- | --- |
| `setup.md` | codeflash-setup | Environment summary |
| `scan-report.md` | codeflash-scan | Ranked targets + domain recommendations |
| `results.tsv` | optimizer agents | Experiment log (baseline, speedup, keep/discard) |
| `HANDOFF.md` | optimizer agents | Session state for resume |
| `conventions.md` | router | Binding constraints from maintainer feedback |
| `learnings.md` | router | Cross-session discoveries |
| `review-report.md` | codeflash-review | 6-pass review findings + verdict |
| `changelog.md` | router | PR-ready optimization summary |
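
As an illustration of how an experiment log like `results.tsv` might be read back, here is a small sketch. The actual column layout is not documented here, so the header names `target`, `baseline`, `speedup`, and `decision` are assumptions, as is the function name.

```python
import csv
import io
from typing import Dict, List, Tuple


def summarize_results(tsv_text: str) -> Dict[str, object]:
    """Summarize a tab-separated experiment log (hypothetical column names)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept: List[Tuple[str, float]] = []
    discarded = 0
    for row in reader:
        if row["decision"] == "keep":
            kept.append((row["target"], float(row["speedup"])))
        else:
            discarded += 1
    return {"kept": kept, "discarded": discarded}


# Made-up sample data, only to show the expected shape of the input.
sample = (
    "target\tbaseline\tspeedup\tdecision\n"
    "parse\t1.20\t1.5\tkeep\n"
    "load\t0.80\t0.98\tdiscard\n"
)
print(summarize_results(sample))
```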

Ordering Guarantees

Sequential:

SessionStart hook fires before any agent acts
Language detection before domain routing
Setup agent completes before domain agents start
Baseline profiling before any optimization experiment
Re-profiling after every KEEP to update rankings
Review gate runs after optimizer [complete], before cleanup
SessionEnd hook fires as session terminates

Parallel allowed:

Researcher analyzes targets #2-5 while optimizer works on target #1
Multiple domain agents can run in separate worktrees
Deep agent can dispatch domain agents while continuing its own profiling
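
One way to reason about the sequential guarantees is as a monotonic ordering over stages. The sketch below checks an event log against such an ordering. The stage identifiers are invented for illustration and are not names used by the plugin. Repeated stages, such as successive experiments, are allowed.

```python
from typing import List, Optional

# Hypothetical stage names, listed in the order the sequential guarantees imply.
SEQUENCE = [
    "session_start",
    "language_detection",
    "setup",
    "baseline_profiling",
    "experiment",
    "review",
    "cleanup",
    "session_end",
]


def first_out_of_order(events: List[str]) -> Optional[str]:
    """Return the first event that occurs after a later stage, else None."""
    rank = {name: i for i, name in enumerate(SEQUENCE)}
    highest = -1
    for event in events:
        if event not in rank:
            continue  # unknown events (e.g. parallel research) are not ordered
        if rank[event] < highest:
            return event
        highest = rank[event]
    return None
```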

Assembly

make build-plugin merges plugin/ (the base, excluding languages/) with plugin/languages/python/ (the overlay) into dist/. Set LANG=javascript to build for JS instead. Agent files use ${CLAUDE_PLUGIN_ROOT} for references because paths differ between the source tree and the assembled output.
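
The base-plus-overlay merge can be pictured with the following sketch. It is not the Makefile's actual logic, which is not shown here. The function name `build_plugin` and its parameters are hypothetical; only the directory layout (`plugin/`, `plugin/languages/<language>/`, `dist/`) is taken from the description above.

```python
import shutil
from pathlib import Path


def build_plugin(plugin_dir: Path, dist_dir: Path, language: str = "python") -> None:
    """Copy the base plugin tree, then overlay one language on top of it."""
    plugin_dir = Path(plugin_dir)
    dist_dir = Path(dist_dir)

    def skip_languages(directory, names):
        # Exclude only the top-level languages/ directory from the base copy.
        if Path(directory) == plugin_dir:
            return {"languages"} & set(names)
        return set()

    if dist_dir.exists():
        shutil.rmtree(dist_dir)  # start from a clean output directory
    shutil.copytree(plugin_dir, dist_dir, ignore=skip_languages)
    # Overlay files land on top of the base; same-named files are replaced.
    shutil.copytree(plugin_dir / "languages" / language, dist_dir, dirs_exist_ok=True)
```

Because the overlay is flattened into the output root, a path such as `plugin/languages/python/agents/` ends up at `dist/agents/`, which is one reason a root variable is more robust than hard-coded relative paths.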
