Replace Codex CLI dependency with native Claude Code agents:
1. **Create `codeflash-pr-review` agent** — adapts the Codex adversarial review prompt for Claude, with an attack-surface taxonomy and structured JSON output. Focused on general PR review, not optimization-specific like the existing `codeflash-review` agent.
2. **Add review output schema** at `agents/references/shared/review-output.schema.json`.
3. **Create stop-review-gate hook** — builds on the stop-review-gate prompt concept, still powered by the Codex CLI (OpenAI models are stronger reviewers; Claude is better at implementing the fixes).
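A rough sketch of what `review-output.schema.json` might contain; every field name and enum value here is illustrative, not the actual schema:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "ReviewOutput",
  "type": "object",
  "required": ["verdict", "findings"],
  "properties": {
    "verdict": { "type": "string", "enum": ["approve", "request_changes"] },
    "findings": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["severity", "file", "description"],
        "properties": {
          "severity": { "type": "string", "enum": ["critical", "major", "minor", "nit"] },
          "file": { "type": "string" },
          "line": { "type": "integer" },
          "description": { "type": "string" }
        }
      }
    }
  }
}
```

Whatever the final shape, a `required` list plus severity enum keeps the hook's parsing trivial and lets the gate fail closed on malformed agent output.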
Formalize the game-theory patterns already implicit in the plugin. Currently, agents discover payoffs empirically through the experiment loop; this phase adds reasoning about expected value *before* trying strategies.
1. **Payoff matrix from history** — parse `results.tsv` across sessions to build a strategy × target-pattern payoff matrix, e.g. "container swap on a dict-heavy hot path → 85% chance of ≥10% speedup". Agents consult this before choosing their first move instead of always following a fixed rotation order.
2. **Strategy selection with priors** — domain agents use accumulated payoff data to rank strategies by expected value for the current target's profile signature, falling back to the default rotation when no history matches.
3. **Cross-domain coalition scoring** — the deep agent scores interaction pairs (memory→CPU, structure→memory, etc.) by historical compounding rates from the interaction column in `results.tsv`, prioritizing targets where the coalition payoff is highest.
4. **Adaptive exploration budget** — allocate experiment budget per strategy proportional to historical success rate, with a minimum exploration floor (e.g. 20%) for untried strategies to avoid premature convergence.
5. **Feedback loop closure** — after each session, auto-update `learnings.md` with strategy outcomes keyed by target profile signature, so future sessions start with better priors.
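The payoff matrix (item 1) can be sketched minimally as follows, assuming `results.tsv` carries `strategy`, `target_pattern`, and `speedup` columns (the real column layout may differ):

```python
import csv
from collections import defaultdict
from io import StringIO

def build_payoff_matrix(tsv_text, min_speedup=0.10):
    """Collapse results.tsv rows into a (strategy, target_pattern) -> win-rate
    map, where a "win" is a speedup at or above min_speedup."""
    counts = defaultdict(lambda: [0, 0])  # key -> [wins, trials]
    for row in csv.DictReader(StringIO(tsv_text), delimiter="\t"):
        key = (row["strategy"], row["target_pattern"])
        counts[key][1] += 1
        if float(row["speedup"]) >= min_speedup:
            counts[key][0] += 1
    return {key: wins / trials for key, (wins, trials) in counts.items()}

sample = (
    "strategy\ttarget_pattern\tspeedup\n"
    "container_swap\tdict_heavy\t0.22\n"
    "container_swap\tdict_heavy\t0.04\n"
    "vectorize\tloop_heavy\t0.35\n"
)
payoffs = build_payoff_matrix(sample)
# payoffs[("container_swap", "dict_heavy")] → 0.5 (1 win out of 2 trials)
```

A real version would also carry the trial count per cell, so agents can distinguish a confident 85% (from 20 sessions) from a lucky 100% (from 1).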
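Strategy selection with priors (item 2) then reduces to ranking by historical win rate with a rotation fallback; the payoff dict maps `(strategy, pattern)` to a win rate, and all strategy names below are illustrative:

```python
# Illustrative rotation order, not the plugin's actual default.
DEFAULT_ROTATION = ["container_swap", "vectorize", "cache_intermediate"]

def rank_strategies(payoffs, pattern, rotation=DEFAULT_ROTATION):
    """Rank strategies by historical win rate for this target pattern,
    falling back to the fixed rotation when no history matches."""
    scored = {s: rate for (s, pat), rate in payoffs.items() if pat == pattern}
    if not scored:
        return list(rotation)
    # Known strategies by payoff descending, then untried ones in rotation order.
    ranked = sorted(scored, key=scored.get, reverse=True)
    return ranked + [s for s in rotation if s not in scored]

order = rank_strategies(
    {("vectorize", "loop_heavy"): 1.0, ("container_swap", "loop_heavy"): 0.3},
    "loop_heavy",
)
# → ['vectorize', 'container_swap', 'cache_intermediate']
```

Appending untried rotation strategies after the scored ones keeps exploration alive even when history exists.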
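Coalition scoring (item 3) might be as simple as averaging the compounding rate per interaction pair; the `interaction` and `compound_rate` column names are assumptions about the `results.tsv` layout:

```python
import csv
import io
from collections import defaultdict

def coalition_scores(tsv_text):
    """Average compounding rate per cross-domain interaction pair, best first."""
    sums = defaultdict(lambda: [0.0, 0])  # pair -> [rate_total, count]
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        pair = row.get("interaction", "").strip()
        if pair:  # skip single-domain experiments with no interaction recorded
            sums[pair][0] += float(row["compound_rate"])
            sums[pair][1] += 1
    return sorted(((total / n, pair) for pair, (total, n) in sums.items()),
                  reverse=True)

sample = (
    "interaction\tcompound_rate\n"
    "memory->cpu\t0.18\n"
    "memory->cpu\t0.10\n"
    "structure->memory\t0.25\n"
)
ranked = coalition_scores(sample)  # highest-compounding pair first
```

The deep agent would walk `ranked` top-down when choosing which coalition targets to attempt first.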
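The adaptive exploration budget (item 4) is a proportional split with a reserved floor; a possible sketch, with the 20% floor from the text as the default:

```python
def allocate_budget(total, success_rates, floor=0.20):
    """Split an experiment budget proportional to historical success rate.
    Strategies with no history (rate None) share a reserved exploration
    floor so they are never starved."""
    untried = [s for s, r in success_rates.items() if r is None]
    tried = {s: r for s, r in success_rates.items() if r is not None}
    reserve = total * floor if untried else 0.0
    alloc = {s: reserve / len(untried) for s in untried}
    weight = sum(tried.values()) or 1.0  # avoid division by zero
    for s, r in tried.items():
        alloc[s] = (total - reserve) * r / weight
    return alloc

# Illustrative: 10 experiments, two strategies with history, one untried.
alloc = allocate_budget(10, {"container_swap": 0.8, "vectorize": 0.2,
                             "lazy_eval": None})
```

With these numbers the untried strategy gets the 20% floor (2 experiments) and the rest is split 4:1 by success rate.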
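Closing the feedback loop (item 5) mostly needs an append-friendly entry format keyed by profile signature; this markdown layout is a guess, not the plugin's actual `learnings.md` convention:

```python
from datetime import date

def format_learnings_entry(profile_sig, outcomes, day=None):
    """Render one learnings.md section for a session's strategy outcomes,
    keyed by the target's profile signature."""
    day = day or date.today().isoformat()
    lines = [f"## {profile_sig} ({day})"]
    for strategy, speedup in sorted(outcomes.items()):
        lines.append(f"- {strategy}: {speedup:+.0%}")
    return "\n".join(lines) + "\n"

entry = format_learnings_entry(
    "dict_heavy|hot_loop",
    {"container_swap": 0.12, "vectorize": -0.03},
    day="2025-01-01",
)
```

A session-end hook would append `entry` to `learnings.md`, so the next session's agents can grep for a matching profile signature before choosing a strategy.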