# Context Extraction
How code context is extracted and prepared for LLM optimization prompts.
## Context Types
### Single-File Context (`SingleOptimizerContext`)
Used when the function to optimize lives in a single file:
- Extracts the function source code
- Collects helper functions and class definitions
- Formats as system prompt + user prompt
### Multi-File Context (`MultiOptimizerContext`)
Used when the function spans or depends on multiple files:
- Collects code from multiple source files
- Manages file-path-annotated code blocks
## `BaseOptimizerContext` (`optimizer_context.py`)
Abstract base class for all context types:
### Factory Method
`get_dynamic_context()` — dispatches to `SingleOptimizerContext` or `MultiOptimizerContext` based on the input.
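A minimal sketch of the dispatch, assuming the deciding input is the set of source files the target function touches (the class names come from this page; the exact dispatch criterion and signature are assumptions):

```python
from dataclasses import dataclass


@dataclass
class SingleOptimizerContext:
    files: list


@dataclass
class MultiOptimizerContext:
    files: list


def get_dynamic_context(files):
    # Hypothetical dispatch: a single source file gets the
    # single-file context, anything else the multi-file one.
    if len(files) == 1:
        return SingleOptimizerContext(files)
    return MultiOptimizerContext(files)
```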
### Prompt Construction
- `get_system_prompt(python_version_str)` — builds system prompt with language version
- `get_user_prompt(dependency_code, line_profiler_results)` — builds user prompt with code and optional profiler data
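A rough sketch of how the user prompt might be assembled, assuming a plain string template (the prompt wording and the `function_source` parameter are hypothetical; only the `dependency_code` and `line_profiler_results` inputs come from this page):

````python
from typing import Optional


def get_user_prompt(
    function_source: str,
    dependency_code: str,
    line_profiler_results: Optional[str] = None,
) -> str:
    # Assemble the target function, its dependency code, and
    # optional line-profiler output into one prompt string.
    parts = [
        "Optimize the following function:",
        f"```python\n{function_source}\n```",
    ]
    if dependency_code:
        parts.append(f"Dependencies:\n```python\n{dependency_code}\n```")
    if line_profiler_results:
        parts.append(f"Line profiler results:\n{line_profiler_results}")
    return "\n\n".join(parts)
````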
### LLM Response Parsing
- `extract_code_and_explanation_from_llm_res(content)` — parses markdown code blocks from LLM output, extracts code and explanation text
- `parse_and_generate_candidate_schema()` — converts extracted code into `OptimizeResponseItemSchema`
- `is_valid_code()` — validates the extracted code is syntactically correct
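The parsing steps above can be sketched with a regex over fenced blocks plus an `ast.parse` syntax check; the regex and function bodies here are illustrative assumptions, not the actual implementation:

````python
import ast
import re

# Matches a fenced block with an optional language tag and an
# optional `:path` annotation (e.g. ```python:src/app.py).
CODE_BLOCK_RE = re.compile(r"```(?:\w+)?(?::\S+)?\n(.*?)```", re.DOTALL)


def extract_code_and_explanation(content: str):
    # First fenced block is the code; everything outside the
    # fences is treated as explanation text.
    match = CODE_BLOCK_RE.search(content)
    if not match:
        return "", content.strip()
    code = match.group(1)
    explanation = (content[: match.start()] + content[match.end():]).strip()
    return code, explanation


def is_valid_code(code: str) -> bool:
    # Syntactic validity check via the Python parser.
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False
````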
## Code Formatting
LLM responses use markdown code blocks with file path annotations:
````
```python:path/to/file.py
# optimized code here
```
````
The context extraction system both generates this format (for prompts) and parses it (from responses).
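A minimal sketch of that round trip, assuming a regex over the `python:path` info string (function names and the regex are illustrative, not the actual implementation):

````python
import re

# Captures the path after `python:` and the block body.
ANNOTATED_BLOCK_RE = re.compile(r"```python:(\S+)\n(.*?)```", re.DOTALL)


def format_annotated_block(path: str, code: str) -> str:
    # Generate a file-path-annotated block for a prompt.
    return f"```python:{path}\n{code}\n```"


def parse_annotated_blocks(response: str) -> dict:
    # Map each annotated file path to its code block.
    return {
        path: code.rstrip("\n")
        for path, code in ANNOTATED_BLOCK_RE.findall(response)
    }
````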