mirror of
https://github.com/codeflash-ai/codeflash-internal.git
synced 2026-05-04 18:25:18 +00:00
Context Extraction
How code context is extracted and prepared for LLM optimization prompts.
Context Types
Single-File Context (SingleOptimizerContext)
Used when the function to optimize lives in a single file:
- Extracts the function source code
- Collects helper functions and class definitions
- Formats as system prompt + user prompt
Multi-File Context (MultiOptimizerContext)
Used when the function spans or depends on multiple files:
- Collects code from multiple source files
- Manages file-path-annotated code blocks
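As a rough illustration of "file-path-annotated code blocks", the sketch below renders several files into the `python:path/to/file.py` fence style described under Code Formatting. The helper name `format_annotated_blocks` and the path-to-source dict input are hypothetical, not taken from the codeflash codebase.

```python
FENCE = "`" * 3  # built programmatically so the example can live inside a fence


def format_annotated_blocks(sources: dict[str, str], language: str = "python") -> str:
    """Render each file as a fenced block whose info string is `language:path`.

    `sources` maps a file path to that file's code (hypothetical input shape).
    """
    blocks = []
    for path, code in sources.items():
        # Trim trailing whitespace so every block ends with exactly one newline.
        blocks.append(f"{FENCE}{language}:{path}\n{code.rstrip()}\n{FENCE}")
    # Separate blocks with a blank line, in the order the files were given.
    return "\n\n".join(blocks)
```

Calling `format_annotated_blocks({"pkg/core.py": core_src, "pkg/utils.py": utils_src})` would yield one annotated block per file, ready to embed in a prompt.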
BaseOptimizerContext (optimizer_context.py)
Abstract base class for all context types:
Factory Method
get_dynamic_context() — dispatches to SingleOptimizerContext or MultiOptimizerContext based on the input.
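The dispatch can be pictured with a small factory sketch. The document does not say which property of the input drives the choice, so the rule used here (one source file selects the single-file context, more than one selects the multi-file context) is an assumption, and `BaseContext`, `SingleContext`, and `MultiContext` are simplified stand-ins for the real classes.

```python
from abc import ABC, abstractmethod


class BaseContext(ABC):
    """Hypothetical stand-in for BaseOptimizerContext."""

    def __init__(self, sources: dict[str, str]) -> None:
        self.sources = sources  # maps file path -> source code

    @abstractmethod
    def describe(self) -> str:
        """Return a short human-readable summary of this context."""

    @staticmethod
    def get_dynamic_context(sources: dict[str, str]) -> "BaseContext":
        # Assumed rule: one file -> single-file context, otherwise multi-file.
        if not sources:
            raise ValueError("at least one source file is required")
        if len(sources) == 1:
            return SingleContext(sources)
        return MultiContext(sources)


class SingleContext(BaseContext):
    def describe(self) -> str:
        (path,) = self.sources  # exactly one key by construction
        return f"single-file context for {path}"


class MultiContext(BaseContext):
    def describe(self) -> str:
        return f"multi-file context over {len(self.sources)} files"
```

Callers only touch the factory and the base-class interface, so adding another context type would not change call sites.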
Prompt Construction
- get_system_prompt(python_version_str): builds the system prompt with the language version
- get_user_prompt(dependency_code, line_profiler_results): builds the user prompt with code and optional profiler data
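A minimal sketch of how these two builders might fit together. The prompt wording is invented for illustration, and the extra `source_code` parameter is an assumption made so the functions can stand alone (in the real classes the code presumably lives on the context object).

```python
from typing import Optional


def get_system_prompt(python_version_str: str) -> str:
    # Hypothetical wording; the real prompt text is not shown in this document.
    return (
        "You are an expert Python performance engineer. "
        f"Target Python {python_version_str}. "
        "Preserve the function's behavior exactly."
    )


def get_user_prompt(
    source_code: str,  # assumption: passed explicitly in this standalone sketch
    dependency_code: str,
    line_profiler_results: Optional[str] = None,
) -> str:
    parts = ["Optimize this code:", source_code]
    if dependency_code:
        parts += ["Read-only dependencies:", dependency_code]
    if line_profiler_results:  # profiler data is optional
        parts += ["Line profiler results:", line_profiler_results]
    return "\n\n".join(parts)
```

Omitting `line_profiler_results` simply drops that section, which matches the "optional profiler data" behavior described above.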
LLM Response Parsing
- extract_code_and_explanation_from_llm_res(content): parses markdown code blocks from LLM output and extracts the code and explanation text
- parse_and_generate_candidate_schema(): converts extracted code into OptimizeResponseItemSchema
- is_valid_code(): validates that the extracted code is syntactically correct
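For Python candidates, a syntax check of the kind is_valid_code() performs can be approximated with the standard library's `ast.parse`. This is a sketch under that assumption: the helper name `is_valid_python` is hypothetical, and the real method may use a different parser or handle other languages.

```python
import ast


def is_valid_python(code: str) -> bool:
    """Return True if `code` parses as Python source (syntax check only)."""
    if not code.strip():
        return False  # assumption: an empty candidate counts as invalid
    try:
        ast.parse(code)
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code; ValueError covers inputs such as
        # null bytes on older Python versions.
        return False
    return True
```

A check like this rejects truncated or malformed LLM output early, but it says nothing about whether the candidate is correct or faster.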
Code Formatting
LLM responses use markdown code blocks with file path annotations:
\`\`\`python:path/to/file.py
# optimized code here
\`\`\`
The context extraction system both generates this format (for prompts) and parses it (from responses).
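The parsing direction can be sketched with a regular expression over the annotated fences. This is a simplified illustration, not the actual extract_code_and_explanation_from_llm_res implementation: the function name, the return shape, and the choice to treat all non-code text as the explanation are assumptions.

```python
import re

FENCE = "`" * 3  # built programmatically so the example can live inside a fence

# Matches an opening fence with an optional language and optional ":path",
# then lazily captures the body up to the next closing fence.
BLOCK_RE = re.compile(
    FENCE + r"(?P<lang>[\w+-]*)(?::(?P<path>[^\n]*))?\n(?P<code>.*?)\n?" + FENCE,
    re.DOTALL,
)


def extract_code_and_explanation(content: str) -> tuple[dict[str, str], str]:
    """Split LLM output into {path: code} plus the remaining explanation text."""
    code_by_path: dict[str, str] = {}
    for match in BLOCK_RE.finditer(content):
        # Blocks without a path annotation are stored under the empty string.
        path = (match.group("path") or "").strip()
        code_by_path[path] = match.group("code")
    # Whatever is left after removing the code blocks is the explanation.
    explanation = BLOCK_RE.sub("", content).strip()
    return code_by_path, explanation
```

A sketch like this does not handle nested fences or bodies that themselves contain triple backticks, which a production parser would need to address.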