## Phase 2: Claude-Native PR Review (future)

Replace the Codex CLI dependency with native Claude Code agents:

  1. Create codeflash-pr-review agent — adapts the Codex adversarial review prompt for Claude, with an attack-surface taxonomy and structured JSON output. Focused on general PR review (not optimization-specific like the existing codeflash-review agent).
  2. Create /codeflash-pr-review command — handles scope selection (working tree, branch, or PR number), gathers git context, and launches the agent. Replaces the codex-companion.mjs logic with native git commands; see the git-context sketch after this list.
  3. Add a review output schema at agents/references/shared/review-output.schema.json; a hypothetical shape is sketched after this list.
  4. Create stop-review-gate hook — uses the stop-review-gate prompt concept, still powered by Codex CLI (OpenAI models are the stronger reviewers; Claude is better at implementing the fixes).
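
As a rough sketch of the git-context gathering the /codeflash-pr-review command could do natively (the scope shape and helper names here are assumptions, not the final design):

```js
// gather-context.mjs — hypothetical sketch of scope-based git context.
import { execFileSync } from "node:child_process";

const run = (cmd, args) => execFileSync(cmd, args, { encoding: "utf8" });

export function gatherDiff(scope) {
  switch (scope.kind) {
    case "working-tree": // staged + unstaged changes vs HEAD
      return run("git", ["diff", "HEAD"]);
    case "branch": // everything since divergence from the base branch
      return run("git", ["diff", `${scope.base ?? "main"}...HEAD`]);
    case "pr": // pull the PR diff via the GitHub CLI
      return run("gh", ["pr", "diff", String(scope.number)]);
    default:
      throw new Error(`unknown scope: ${scope.kind}`);
  }
}
```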
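
And a minimal sketch of what review-output.schema.json might specify (every field name below is an assumption, not the final schema):

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Codeflash PR review output",
  "type": "object",
  "required": ["findings"],
  "properties": {
    "findings": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["severity", "attack_surface", "file", "description"],
        "properties": {
          "severity": { "enum": ["critical", "high", "medium", "low"] },
          "attack_surface": { "type": "string" },
          "file": { "type": "string" },
          "line": { "type": "integer" },
          "description": { "type": "string" },
          "suggested_fix": { "type": "string" }
        }
      }
    }
  }
}
```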

## Phase 3: Game-Theoretic Strategy Selection (future)

Formalize the game-theoretic patterns already implicit in the plugin. Currently, agents discover payoffs empirically through the experiment loop; this phase adds reasoning about expected value before a strategy is tried.

  1. Payoff matrix from history — parse results.tsv across sessions to build a strategy × target-pattern payoff matrix, e.g. "container swap on dict-heavy hot path → 85% chance of ≥10% speedup". Agents consult this before choosing their first move instead of always following a fixed rotation order; see the payoff-matrix sketch after this list.
  2. Strategy selection with priors — domain agents use accumulated payoff data to rank strategies by expected value for the current target's profile signature, falling back to the default rotation when no history matches.
  3. Cross-domain coalition scoring — the deep agent scores interaction pairs (memory→CPU, structure→memory, etc.) by their historical compounding rates from the interaction column in results.tsv, prioritizing targets where the coalition payoff is highest; see the coalition-scoring sketch below.
  4. Adaptive exploration budget — allocate the experiment budget per strategy in proportion to its historical success rate, with a minimum exploration floor (e.g., 20%) for untried strategies to avoid premature convergence; see the budget sketch below.
  5. Feedback loop closure — after each session, auto-update learnings.md with strategy outcomes keyed by target profile signature, so future sessions start with better priors; see the learnings sketch below.
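
The payoff-matrix and prior-based selection steps (items 1 and 2) could share one small module. A minimal sketch, assuming results.tsv carries strategy, pattern, and speedup_pct columns and that a ≥10% speedup counts as a win (both assumptions, taken from the example above):

```js
// payoffs.mjs — hypothetical sketch; column names are assumptions.
import { readFileSync } from "node:fs";

// Build a strategy × target-pattern payoff matrix from session history.
export function buildPayoffMatrix(path = "results.tsv") {
  const [header, ...rows] = readFileSync(path, "utf8")
    .trim()
    .split("\n")
    .map((line) => line.split("\t"));
  const col = Object.fromEntries(header.map((name, i) => [name, i]));
  const matrix = new Map(); // "strategy|pattern" -> { tries, wins }
  for (const row of rows) {
    const key = `${row[col.strategy]}|${row[col.pattern]}`;
    const cell = matrix.get(key) ?? { tries: 0, wins: 0 };
    cell.tries += 1;
    if (parseFloat(row[col.speedup_pct]) >= 10) cell.wins += 1; // assumed win threshold
    matrix.set(key, cell);
  }
  return matrix;
}

// Rank strategies for a target pattern by empirical win rate,
// falling back to the default rotation when no history matches.
export function rankStrategies(matrix, pattern, rotation) {
  const scored = rotation.map((strategy) => {
    const cell = matrix.get(`${strategy}|${pattern}`);
    return { strategy, winRate: cell ? cell.wins / cell.tries : null };
  });
  return scored.some((s) => s.winRate !== null)
    ? scored.sort((a, b) => (b.winRate ?? -1) - (a.winRate ?? -1))
    : scored; // no priors: keep the default rotation order
}
```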
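For the coalition scoring in item 3, a sketch over the same parsed rows (the interaction column's pair labels and a compounded column are assumptions):

```js
// coalition.mjs — hypothetical sketch; expects rows/col as parsed in the
// payoff-matrix sketch above.
export function coalitionScores(rows, col) {
  const pairs = new Map(); // "memory→CPU"-style label -> { tries, compounded }
  for (const row of rows) {
    const pair = row[col.interaction];
    if (!pair) continue;
    const cell = pairs.get(pair) ?? { tries: 0, compounded: 0 };
    cell.tries += 1;
    if (row[col.compounded] === "yes") cell.compounded += 1;
    pairs.set(pair, cell);
  }
  return [...pairs.entries()]
    .map(([pair, c]) => ({ pair, rate: c.compounded / c.tries }))
    .sort((a, b) => b.rate - a.rate); // highest coalition payoff first
}
```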
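The adaptive budget in item 4 is a proportional split with a floor. A sketch, assuming per-strategy { strategy, tries, wins } stats (the shape is an assumption):

```js
// budget.mjs — hypothetical sketch of adaptive experiment allocation.
export function allocateBudget(total, stats, floor = 0.2) {
  const untried = stats.filter((s) => s.tries === 0);
  const tried = stats.filter((s) => s.tries > 0);
  // Reserve the exploration floor only when untried strategies exist.
  const explore = untried.length ? Math.round(total * floor) : 0;
  const exploit = total - explore;
  const mass = tried.reduce((sum, s) => sum + s.wins / s.tries, 0) || 1;
  const plan = new Map();
  for (const s of tried)
    plan.set(s.strategy, Math.round((exploit * (s.wins / s.tries)) / mass));
  for (const s of untried)
    plan.set(s.strategy, Math.floor(explore / untried.length));
  return plan; // rounding can leave a small remainder to redistribute
}
```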
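Closing the loop in item 5 could be as small as appending a few lines per session; the learnings.md entry format below is an assumption:

```js
// learnings.mjs — hypothetical sketch of the post-session update.
import { appendFileSync } from "node:fs";

export function recordOutcomes(signature, outcomes, path = "learnings.md") {
  const lines = outcomes.map(
    (o) =>
      `- [${signature}] ${o.strategy}: ${o.speedupPct >= 10 ? "win" : "miss"} (${o.speedupPct}%)`
  );
  appendFileSync(path, `\n## Session ${new Date().toISOString()}\n${lines.join("\n")}\n`);
}
```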