codeflash-agent/plugin/ROADMAP.md
Kevin Turcios 3b59d97647 squash
2026-04-13 14:12:17 -05:00

## Phase 2: Claude-Native PR Review (future)
Replace Codex CLI dependency with native Claude Code agents:
1. **Create `codeflash-pr-review` agent** — adapts the Codex adversarial review prompt for Claude, with an attack-surface taxonomy and structured JSON output. Focused on general PR review (not optimization-specific like the existing `codeflash-review` agent).
2. **Create `/codeflash-pr-review` command** — handles scope selection (working tree / branch / PR number), gathers git context, and launches the agent. Replaces the `codex-companion.mjs` logic with native git commands.
3. **Add review output schema** to `agents/references/shared/review-output.schema.json`.
4. **Create stop-review-gate hook** — uses the stop-review-gate prompt concept, still powered by Codex CLI (OpenAI models are better reviewers, Claude is better at implementing the fixes).
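The scope-selection step in item 2 could reduce to a small mapping from scope to a plain git invocation. A minimal sketch, assuming three scope labels and a `main` base branch (the function names, scope strings, and `pr-123` ref format are all hypothetical, not the plugin's actual API):

```python
import subprocess

def git_diff_args(scope, ref=None):
    """Map a review scope to the git invocation that produces its diff.

    Hypothetical scope labels for the three modes described above:
      "working-tree" -> uncommitted changes vs HEAD
      "branch"       -> current branch vs a base ref (default "main")
      "pr"           -> a locally fetched PR head vs the base branch
    """
    if scope == "working-tree":
        return ["git", "diff", "HEAD"]
    if scope == "branch":
        # Triple-dot: only changes since this branch diverged from base.
        return ["git", "diff", f"{ref or 'main'}...HEAD"]
    if scope == "pr":
        if ref is None:
            raise ValueError("PR scope needs a fetched ref, e.g. 'pr-123'")
        return ["git", "diff", f"main...{ref}"]
    raise ValueError(f"unknown scope: {scope}")

def gather_context(scope, ref=None):
    """Run the diff and return its text for the agent prompt."""
    return subprocess.run(git_diff_args(scope, ref),
                          capture_output=True, text=True, check=True).stdout
```

The triple-dot form keeps the diff scoped to the branch's own commits, which matches what a PR reviewer should see.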

## Phase 3: Game-Theoretic Strategy Selection (future)
Formalize the game-theory patterns already implicit in the plugin. Agents currently discover payoffs empirically through the experiment loop; this phase adds reasoning about expected value *before* a strategy is tried.
1. **Payoff matrix from history** — parse `results.tsv` across sessions to build a strategy × target-pattern payoff matrix. E.g., "container swap on dict-heavy hot path → 85% chance of ≥10% speedup". Agents consult this before choosing their first move instead of always following a fixed rotation order.
2. **Strategy selection with priors** — domain agents use accumulated payoff data to rank strategies by expected value for the current target's profile signature, falling back to the default rotation when no history matches.
3. **Cross-domain coalition scoring** — deep agent scores interaction pairs (memory→CPU, structure→memory, etc.) by historical compounding rates from the interaction column in `results.tsv`. Prioritizes targets where coalition payoff is highest.
4. **Adaptive exploration budget** — allocate experiment budget per strategy proportional to historical success rate, with a minimum exploration floor (e.g., 20%) for untried strategies to avoid premature convergence.
5. **Feedback loop closure** — after each session, auto-update `learnings.md` with strategy outcomes keyed by target profile signature, so future sessions start with better priors.
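Items 1 and 2 above could be sketched roughly as follows. The `results.tsv` column names (`strategy`, `pattern`, `speedup_pct`) and the ≥10% success threshold are assumptions for illustration, not the plugin's actual schema:

```python
import csv
from collections import defaultdict

SUCCESS_THRESHOLD = 10.0  # assumed: a "win" is a speedup of >= 10%

def load_results(tsv_path):
    """Yield one dict per row of a session's results.tsv."""
    with open(tsv_path, newline="") as f:
        yield from csv.DictReader(f, delimiter="\t")

def build_payoff_matrix(rows):
    """Aggregate per-(strategy, pattern) win rates across sessions.

    Returns {(strategy, pattern): (win_rate, trials)} so consumers can
    discount estimates backed by few trials.
    """
    tally = defaultdict(lambda: [0, 0])  # key -> [wins, trials]
    for row in rows:
        key = (row["strategy"], row["pattern"])
        tally[key][1] += 1
        if float(row["speedup_pct"]) >= SUCCESS_THRESHOLD:
            tally[key][0] += 1
    return {k: (wins / trials, trials) for k, (wins, trials) in tally.items()}

def rank_strategies(matrix, pattern, default_rotation):
    """Order strategies by historical win rate for this target pattern,
    falling back to the default rotation when no history matches."""
    scored = [(rate, s) for (s, p), (rate, _) in matrix.items() if p == pattern]
    if not scored:
        return list(default_rotation)
    ranked = [s for _, s in sorted(scored, reverse=True)]
    # Append untried strategies from the rotation so nothing is dropped.
    ranked += [s for s in default_rotation if s not in ranked]
    return ranked
```

Returning the trial count alongside the rate matters: an 85% estimate from 2 trials should not outrank a 70% estimate from 40.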
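Item 3's coalition scoring could reduce to a grouped mean over the interaction column. The column names (`interaction`, `compound_pct`) and the "memory->CPU" pair-label format are assumptions for illustration:

```python
from collections import defaultdict

def coalition_scores(rows):
    """Mean compounding rate per interaction pair, e.g. "memory->CPU".

    Assumed columns: interaction (pair label, empty when strategies were
    applied alone) and compound_pct (combined speedup when stacked).
    """
    sums = defaultdict(lambda: [0.0, 0])  # pair -> [total, count]
    for row in rows:
        pair = row.get("interaction")
        if pair:  # skip rows with no cross-domain interaction
            sums[pair][0] += float(row["compound_pct"])
            sums[pair][1] += 1
    return {pair: total / n for pair, (total, n) in sums.items()}
```

The deep agent would then sort candidate targets by the score of the pair each one enables.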
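Item 4's budget split might look like this sketch, with the 20% exploration floor from above (the function name and the `None`-means-untried convention are illustrative assumptions):

```python
def allocate_budget(total_experiments, success_rates, floor_frac=0.20):
    """Split an experiment budget across strategies.

    `success_rates` maps strategy -> historical win rate, with None for
    untried strategies. Untried strategies share a reserved exploration
    floor (default 20% of the budget); the remainder is split among
    tried strategies in proportion to their success rate.
    """
    untried = [s for s, r in success_rates.items() if r is None]
    tried = {s: r for s, r in success_rates.items() if r is not None}

    explore = round(total_experiments * floor_frac) if untried else 0
    exploit = total_experiments - explore

    alloc = {}
    total_rate = sum(tried.values())
    for s, r in tried.items():
        # Proportional share; fall back to an even split if no wins yet.
        alloc[s] = (round(exploit * r / total_rate) if total_rate
                    else exploit // max(len(tried), 1))
    for i, s in enumerate(untried):
        # Spread the exploration floor evenly over untried strategies.
        alloc[s] = explore // len(untried) + (1 if i < explore % len(untried) else 0)
    return alloc
```

The floor guarantees untried strategies keep getting sampled, which is what prevents the premature convergence the item warns about.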