# Task: Build an LLM Client Wrapper with Provider-Specific Handling

## Context

The codeflash-internal aiservice uses a unified LLM abstraction in `aiservice/llm.py`. All LLM calls go through a single `call_llm()` function that handles both OpenAI (via Azure) and Anthropic (via Foundry) providers. Each provider has different client setup, message handling, and response parsing. The test generation pipeline also uses LLM calls, adding framework detection for GPU sync in timing blocks and Jinja2-based prompt construction.

## Task

1. Write an `LLM` dataclass (using `pydantic_dataclass`) with fields:
   - `name: str` -- deployment name
   - `max_tokens: int` -- max context window
   - `model_type: Literal["openai", "anthropic", "google"]`
   - `input_cost: float` -- USD per 1M tokens
   - `cached_input_cost: float` -- USD per 1M cached tokens
   - `output_cost: float` -- USD per 1M tokens

2. Define concrete model instances:
   - `OpenAI_GPT_5_Mini` with pricing: $0.25 input, $0.03 cached, $2.00 output
   - `Anthropic_Claude_Sonnet_4_5` with pricing: $3.00 input, $0.00 cached, $15.00 output

3. Write an `async def call_llm()` function that:
   - Accepts `llm: LLM`, `messages: list[dict]`, `call_type: str`, `trace_id: str`, `max_tokens: int`, and optional `user_id: str`
   - For OpenAI: uses `client.chat.completions.create()`. If the model is GPT-5-mini, uses the `max_completion_tokens` parameter; otherwise uses `max_tokens`
   - For Anthropic: extracts the system prompt from the messages list and passes it separately via the `system=` kwarg. Concatenates text blocks from the response
   - Records every call to the database via `record_llm_call()` in a `finally` block (including trace_id, call_type, model, cost, latency)
   - Returns an `LLMResponse` with `content: str`, `usage: LLMUsage`, and `raw_response`

4.
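Items 1 and 2 above could be sketched as follows. Field names, pricing, and instance names come from the task description; the `max_tokens` context-window values are placeholders, not part of the spec.

```python
from typing import Literal

from pydantic.dataclasses import dataclass as pydantic_dataclass


@pydantic_dataclass
class LLM:
    """Static metadata and pricing for one model deployment."""

    name: str                 # deployment name
    max_tokens: int           # max context window
    model_type: Literal["openai", "anthropic", "google"]
    input_cost: float         # USD per 1M input tokens
    cached_input_cost: float  # USD per 1M cached input tokens
    output_cost: float        # USD per 1M output tokens


# Concrete instances with the pricing from the task description.
# The max_tokens values below are illustrative placeholders.
OpenAI_GPT_5_Mini = LLM(
    name="gpt-5-mini",
    max_tokens=128_000,
    model_type="openai",
    input_cost=0.25,
    cached_input_cost=0.03,
    output_cost=2.00,
)

Anthropic_Claude_Sonnet_4_5 = LLM(
    name="claude-sonnet-4-5",
    max_tokens=200_000,
    model_type="anthropic",
    input_cost=3.00,
    cached_input_cost=0.00,
    output_cost=15.00,
)
```

Using the pydantic dataclass decorator (rather than `dataclasses.dataclass`) means the `Literal` constraint on `model_type` is validated at construction time, so a typo like `model_type="opeani"` fails fast.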
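Two of the provider quirks in item 3 can be isolated into small pure helpers, which keeps the `call_llm()` body readable and makes the quirks unit-testable without network calls. Both helper names (`split_system_prompt`, `token_param`) are illustrative, not part of the spec.

```python
def split_system_prompt(messages: list[dict]) -> tuple[str, list[dict]]:
    """Separate the system prompt from an OpenAI-style message list.

    Anthropic's Messages API takes the system prompt as a separate
    `system=` kwarg rather than as a {"role": "system"} message, so we
    pull system messages out and join their contents.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat_messages = [m for m in messages if m["role"] != "system"]
    return "\n".join(system_parts), chat_messages


def token_param(llm_name: str, max_tokens: int) -> dict:
    """Pick the token-limit kwarg for an OpenAI chat completion.

    GPT-5-mini expects `max_completion_tokens`; other deployments take
    the older `max_tokens` parameter. The resulting dict is meant to be
    splatted into client.chat.completions.create(**token_param(...)).
    """
    key = "max_completion_tokens" if "gpt-5-mini" in llm_name else "max_tokens"
    return {key: max_tokens}
```

Keeping these as standalone functions also makes the `finally`-block requirement easier to satisfy: `call_llm()` itself stays a thin `try`/`finally` around the provider call, with `record_llm_call()` in the `finally` so every call is recorded even on failure.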
   Write a `detect_frameworks_from_code(code: str) -> set[str]` function that:
   - Parses import statements to identify ML frameworks: PyTorch, TensorFlow, and JAX
   - Detects both direct imports and aliases
   - Returns a set of the framework names found

## Expected Outputs

- A Python module with the `LLM` dataclass, model instances, `call_llm()` function, and framework detection
- The `call_llm()` function must handle both providers with their specific quirks
- `record_llm_call()` must be in a `finally` block for observability
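The framework-detection function in item 4 could be sketched with the standard-library `ast` module, which handles aliases for free because detection keys on the imported module name rather than the local binding. The exact label strings returned (`"pytorch"` etc.) are an assumption here, not fixed by the spec.

```python
import ast

# Top-level module name -> framework label. The label strings are an
# assumption; extend the map as needed.
_FRAMEWORK_MODULES = {
    "torch": "pytorch",
    "tensorflow": "tensorflow",
    "jax": "jax",
}


def detect_frameworks_from_code(code: str) -> set[str]:
    """Return the set of ML frameworks imported by `code`.

    Handles `import torch`, `import torch as t`, and
    `from torch import nn` alike: aliases don't matter because we
    inspect the real module name, not the local binding.
    """
    found: set[str] = set()
    try:
        tree = ast.parse(code)
    except SyntaxError:
        # Unparseable code: report no frameworks rather than raising.
        return found
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules = [node.module]
        else:
            continue
        for module in modules:
            root = module.split(".")[0]  # "torch.nn" -> "torch"
            if root in _FRAMEWORK_MODULES:
                found.add(_FRAMEWORK_MODULES[root])
    return found
```

Parsing with `ast` rather than regexes also avoids false positives from imports mentioned inside strings or comments.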