# Task: Implement an LLM Response Parser for Optimization Candidates

## Context

The codeflash-internal optimization pipeline receives LLM responses as markdown text containing code blocks annotated with file paths. The context extraction system in `optimizer_context.py` parses these responses to extract optimized code candidates. An annotated block puts the language and the file path, separated by a colon, on its opening fence:

````
```python:path/to/file.py
# optimized code here
```
````

After extraction, candidates go through postprocessing: AST-based deduplication (using `ast.parse()` + `ast.dump()`) and equality checking against the original code.
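
To illustrate the postprocessing step, the sketch below shows why `ast.parse()` + `ast.dump()` can serve as a normalization key: layout and comments disappear from the dump, while a real change to the code does not. The helper name `ast_key` is hypothetical and is not taken from `optimizer_context.py`.

```python
import ast


def ast_key(code: str) -> str:
    """Return a normalized string for `code` (hypothetical helper name)."""
    # ast.dump() omits line/column attributes by default, so whitespace
    # and comments do not affect the result.
    return ast.dump(ast.parse(code))


a = "def f(x):\n    return x + 1\n"
b = "def f(x):  # same function, different layout\n\n    return x+1\n"
c = "def f(x):\n    return x + 2\n"

print(ast_key(a) == ast_key(b))  # True
print(ast_key(a) == ast_key(c))  # False
```

Note that docstrings are part of the AST, so two candidates that differ only in a docstring would still count as distinct under this key.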

## Task

1. Write a function `extract_code_blocks(llm_response: str) -> list[dict]` that:
   - Parses markdown code blocks from an LLM response string
   - Handles the file-path-annotated format: `` ```python:path/to/file.py ``
   - Returns a list of dicts, each with keys: `"code"` (str), `"file_path"` (str or None), `"language"` (str)
   - Handles both annotated (with file path) and plain code blocks

2. Write a function `deduplicate_candidates(candidates: list[str], original_code: str) -> list[str]` that:
   - Removes duplicate candidates using AST-based comparison (`ast.parse()` + `ast.dump()`)
   - Filters out candidates that are identical to the original code (equality check)
   - Returns only unique, non-original candidates
   - Handles `SyntaxError` gracefully: keep candidates that fail to parse, since they might use syntax that the running interpreter's `ast` module does not support

3. Write a function `validate_python_code(code: str) -> bool` that:
   - Checks if the code is syntactically valid Python
   - Returns `True` if `ast.parse()` succeeds, `False` otherwise
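
One possible shape for the three functions is sketched below. It is not a reference solution, and it makes several assumptions that the task does not state: fences open and close with exactly three backticks at the start of a line, a plain block with no language tag reports `"language"` as an empty string, the comparison against the original code is done on the AST key, and candidates that fail to parse fall back to exact-text comparison. The names `_FENCE_RE` and `_ast_key` are hypothetical.

```python
import ast
import re

# Opening fence with optional "language:path" info, the body, then a closing fence.
_FENCE_RE = re.compile(
    r"^`{3}([^\s:`]*)(?::([^\n`]+))?[ \t]*\n(.*?)^`{3}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_code_blocks(llm_response: str) -> list[dict]:
    blocks = []
    for match in _FENCE_RE.finditer(llm_response):
        language, file_path, code = match.groups()
        blocks.append({
            "code": code,
            "file_path": file_path.strip() if file_path else None,
            "language": language,  # "" when the fence has no tag (assumption)
        })
    return blocks


def validate_python_code(code: str) -> bool:
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


def _ast_key(code: str) -> "str | None":
    """Normalized AST dump, or None if the code does not parse."""
    try:
        return ast.dump(ast.parse(code))
    except SyntaxError:
        return None


def deduplicate_candidates(candidates: list[str], original_code: str) -> list[str]:
    original_key = _ast_key(original_code)
    seen_keys = set()
    seen_raw = set()
    unique = []
    for candidate in candidates:
        key = _ast_key(candidate)
        if key is None:
            # Unparseable candidates are kept; exact-text comparison is the
            # only fallback available for them (assumption).
            if candidate == original_code or candidate in seen_raw:
                continue
            seen_raw.add(candidate)
        else:
            if key == original_key or key in seen_keys:
                continue
            seen_keys.add(key)
        unique.append(candidate)
    return unique


FENCE = "`" * 3  # built at runtime so this example contains no literal fence
response = (
    "Here is a faster version:\n"
    f"{FENCE}python:path/to/file.py\n"
    "def f(x):\n"
    "    return 1 + x\n"
    f"{FENCE}\n"
    "And a plain block:\n"
    f"{FENCE}\n"
    "x = 1\n"
    f"{FENCE}\n"
)
blocks = extract_code_blocks(response)
print([(b["language"], b["file_path"]) for b in blocks])
# [('python', 'path/to/file.py'), ('', None)]

original = "def f(x):\n    return x + 1\n"
candidates = [
    "def f(x):\n    return x+1\n",           # same AST as the original
    "def f(x):\n    return 1 + x\n",         # new candidate
    "def f(x):  # dup\n    return 1+x\n",    # same AST as the previous one
    "def f(:\n",                             # does not parse, kept
]
print(deduplicate_candidates(candidates, original))
# ['def f(x):\n    return 1 + x\n', 'def f(:\n']
```

The regular expression is deliberately narrow: nested fences, fences with more than three backticks, and info strings with extra words after the language tag are not matched, so a production parser would likely need more care there.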

## Expected Outputs

- A Python module with all three functions
- `extract_code_blocks` should correctly parse multi-block responses
- `deduplicate_candidates` should use AST normalization, not string comparison
- The module should import only `ast` and `re` from the standard library