codeflash-agent/agents
Kevin Turcios 37efa524d7 feat: improve skill, eval system, and tessl config
- Optimize codeflash-optimize SKILL.md (review score 17% → 98%, eval 87% → 100%)
  - Fix frontmatter (allowed-tools format, argument-hint under metadata)
  - Lead description with concrete actions, explicit agent launch parameters
- Add multi-run variance detection to eval system (--runs N flag)
  - score.py aggregate command: min/max/avg/stddev per criterion, flaky detection
  - check-regression.sh defaults to 3 runs for reliable regression detection
- Add per-criterion regression tracking to baseline-scores.json (v3)
  - Reports exactly which criteria regressed, not just total score drops
- Rename evals/ → codeflash-evals/ to avoid tessl directory conflicts
- Switch tessl to managed mode, gitignore vendored tiles and symlinks
2026-03-27 11:30:17 -05:00
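The multi-run aggregation and per-criterion regression check described in the commit message could look roughly like the sketch below. This is not the actual score.py or check-regression.sh logic, which is not shown here. The function names, the input shape (one dict of criterion name to score per run), the flakiness rule (any spread between min and max above a threshold), and the tolerance parameter are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def aggregate(runs, flaky_threshold=0.0):
    """Summarize per-criterion scores across several eval runs.

    `runs` is assumed to be a list of dicts mapping a criterion name to a
    numeric score. The real score.py input format may differ.
    """
    criteria = sorted({name for run in runs for name in run})
    summary = {}
    for name in criteria:
        scores = [run[name] for run in runs if name in run]
        spread = max(scores) - min(scores)
        summary[name] = {
            "min": min(scores),
            "max": max(scores),
            "avg": mean(scores),
            "stddev": pstdev(scores),  # population standard deviation
            # Hypothetical rule: a criterion is flaky if its scores vary
            # across runs by more than the threshold.
            "flaky": spread > flaky_threshold,
        }
    return summary


def find_regressions(baseline, summary, tolerance=0.0):
    """List the criteria whose average dropped below the baseline score.

    `baseline` is assumed to map criterion name to its recorded score,
    loosely mirroring the per-criterion idea behind baseline-scores.json.
    """
    return sorted(
        name
        for name, base in baseline.items()
        if name in summary and summary[name]["avg"] < base - tolerance
    )


if __name__ == "__main__":
    # Three runs with two made-up criteria.
    runs = [
        {"launches_agent": 10, "reports_results": 5},
        {"launches_agent": 10, "reports_results": 3},
        {"launches_agent": 10, "reports_results": 4},
    ]
    summary = aggregate(runs)
    baseline = {"launches_agent": 10, "reports_results": 5}
    print(summary)
    print(find_regressions(baseline, summary))
```

Comparing averages over several runs, rather than a single total, is what lets a check name the specific criteria that regressed and separate real drops from run-to-run noise.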
references              merge: resolve conflicts with main (guard, git history, stuck recovery)  2026-03-27 10:15:10 -05:00
codeflash-async.md      merge: resolve conflicts with main (guard, git history, stuck recovery)  2026-03-27 10:15:10 -05:00
codeflash-cpu.md        merge: resolve conflicts with main (guard, git history, stuck recovery)  2026-03-27 10:15:10 -05:00
codeflash-memory.md     merge: resolve conflicts with main (guard, git history, stuck recovery)  2026-03-27 10:15:10 -05:00
codeflash-setup.md      feat: improve skill, eval system, and tessl config                       2026-03-27 11:30:17 -05:00
codeflash-structure.md  merge: resolve conflicts with main (guard, git history, stuck recovery)  2026-03-27 10:15:10 -05:00
codeflash.md            merge: resolve conflicts with main (guard, git history, stuck recovery)  2026-03-27 10:15:10 -05:00