CLAUDE.md
Project Overview
CodeFlash is an AI-powered Python code optimizer that automatically improves code performance while maintaining correctness. It uses LLMs to generate optimization candidates, verifies correctness through test execution, and benchmarks performance improvements.
Optimization Pipeline
Discovery → Ranking → Context Extraction → Test Gen + Optimization → Baseline → Candidate Evaluation → PR
- Discovery (`discovery/`): Find optimizable functions across the codebase
- Ranking (`benchmarking/function_ranker.py`): Rank functions by addressable time using trace data
- Context (`context/`): Extract code dependencies (read-writable code + read-only imports)
- Optimization (`optimization/`, `api/`): Generate candidates via the AI service, run in parallel with test generation
- Verification (`verification/`): Run candidates against tests, compare outputs via a custom pytest plugin
- Benchmarking (`benchmarking/`): Measure performance, select the best candidate by speedup
- Result (`result/`, `github/`): Create a PR with the winning optimization
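At its core, candidate evaluation is a verify-then-benchmark loop: only candidates that produce the same outputs as the original survive, and the fastest survivor wins. The snippet below is a minimal, self-contained sketch of that idea, with hand-written stand-ins for the AI-generated candidates and plain `timeit` measurements; the real pipeline uses generated tests, a custom pytest plugin, and trace-derived workloads.

```python
# Minimal sketch of verify-then-benchmark; candidate names and workloads are
# illustrative, not part of CodeFlash's actual API.
import timeit


def original(xs):
    # Baseline implementation that the candidates try to beat.
    total = 0
    for x in xs:
        total += x * x
    return total


# Stand-ins for AI-generated optimization candidates.
candidates = {
    "genexpr": lambda xs: sum(x * x for x in xs),
    "broken": lambda xs: sum(xs),  # wrong result: should fail verification
}

test_inputs = [list(range(100)), [], [3, 1, 4, 1, 5]]


def is_correct(fn):
    # Verification: candidate output must match the original on every test input.
    return all(fn(inp) == original(inp) for inp in test_inputs)


workload = list(range(10_000))
baseline_s = min(timeit.repeat(lambda: original(workload), number=100, repeat=5))

best_name, best_s = None, baseline_s
for name, fn in candidates.items():
    if not is_correct(fn):
        continue  # discard behavior-changing candidates
    t = min(timeit.repeat(lambda: fn(workload), number=100, repeat=5))
    if t < best_s:
        best_name, best_s = name, t

print(f"baseline: {baseline_s:.4f}s, best candidate: {best_name} ({best_s:.4f}s)")
```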
Domain Glossary
- Optimization candidate: A generated code variant that might be faster (`OptimizedCandidate`)
- Function context: All code needed for optimization, split into read-writable (modifiable) and read-only (reference)
- Addressable time: Time a function spends that could be optimized (own time + callee time / call count)
- Candidate forest: DAG of candidates in which refinements and repairs build on previous candidates
- Replay test: Test generated from recorded benchmark data to reproduce real workloads
- Tracer: Profiling system that records function call trees and timings (`tracing/`, `tracer.py`)
- Worktree mode: Git worktree-based parallel optimization (`--worktree` flag)
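To make the addressable-time definition concrete, here is a hedged sketch of ranking functions by that metric. The `TraceStats` fields and helper names are hypothetical, and the formula is a literal reading of the glossary line above; the actual ranker lives in `benchmarking/function_ranker.py` and consumes tracer output.

```python
# Hypothetical data model for trace statistics; not CodeFlash's real format.
from dataclasses import dataclass


@dataclass
class TraceStats:
    qualified_name: str   # e.g. "mymodule.MyClass.method"
    own_time_ns: int      # time spent in the function body itself
    callee_time_ns: int   # time spent in functions it calls
    call_count: int       # number of recorded calls


def addressable_time_ns(stats: TraceStats) -> float:
    # Literal reading of the glossary formula: own time plus per-call callee time.
    return stats.own_time_ns + stats.callee_time_ns / stats.call_count


def rank_functions(trace: list) -> list:
    # Highest addressable time first: these are the most promising targets.
    return sorted(trace, key=addressable_time_ns, reverse=True)


trace = [
    TraceStats("pkg.parse", own_time_ns=900_000, callee_time_ns=100_000, call_count=10),
    TraceStats("pkg.render", own_time_ns=200_000, callee_time_ns=4_000_000, call_count=4),
]
for s in rank_functions(trace):
    print(s.qualified_name, addressable_time_ns(s))
```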
Agent Rules
Follow the instructions in @.tessl/RULES.md.