Mirror of https://github.com/codeflash-ai/codeflash-internal.git (synced 2026-05-04 18:25:18 +00:00)
# Task: Build an LLM Client Wrapper with Provider-Specific Handling
## Context
The codeflash-internal aiservice uses a unified LLM abstraction in aiservice/llm.py. All LLM calls go through a single call_llm() function that handles both OpenAI (via Azure) and Anthropic (via Foundry) providers. Each provider has different client setup, message handling, and response parsing.
The test generation pipeline also uses these LLM calls; in addition, it performs framework detection (so GPU synchronization can be inserted into timing blocks) and Jinja2-based prompt construction.
## Task
- Write an `LLM` dataclass (using `pydantic_dataclass`) with fields:
  - `name: str` -- deployment name
  - `max_tokens: int` -- max context window
  - `model_type: Literal["openai", "anthropic", "google"]`
  - `input_cost: float` -- USD per 1M tokens
  - `cached_input_cost: float` -- USD per 1M cached tokens
  - `output_cost: float` -- USD per 1M tokens
- Define concrete model instances:
  - `OpenAI_GPT_5_Mini` with pricing: $0.25 input, $0.03 cached, $2.00 output
  - `Anthropic_Claude_Sonnet_4_5` with pricing: $3.00 input, $0.00 cached, $15.00 output
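The two bullets above can be sketched together as follows. This is a minimal sketch, not the repository's actual code: it falls back to the stdlib `dataclass` when pydantic is unavailable, and the `name` strings and `max_tokens` values are illustrative placeholders (only the pricing figures come from the spec).

```python
from typing import Literal

try:
    # pydantic's dataclass validates field types at construction time
    from pydantic.dataclasses import dataclass as pydantic_dataclass
except ImportError:
    # fallback so the sketch runs without pydantic installed
    from dataclasses import dataclass as pydantic_dataclass


@pydantic_dataclass
class LLM:
    name: str                                             # deployment name
    max_tokens: int                                       # max context window
    model_type: Literal["openai", "anthropic", "google"]
    input_cost: float                                     # USD per 1M tokens
    cached_input_cost: float                              # USD per 1M cached tokens
    output_cost: float                                    # USD per 1M tokens


# Pricing from the spec; deployment names and context windows are placeholders.
OpenAI_GPT_5_Mini = LLM(
    name="gpt-5-mini", max_tokens=128_000, model_type="openai",
    input_cost=0.25, cached_input_cost=0.03, output_cost=2.00,
)
Anthropic_Claude_Sonnet_4_5 = LLM(
    name="claude-sonnet-4-5", max_tokens=200_000, model_type="anthropic",
    input_cost=3.00, cached_input_cost=0.00, output_cost=15.00,
)
```

Keeping pricing on the model instance means cost accounting needs no lookup table: any code holding an `LLM` can compute cost directly from token counts.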
- Write an `async def call_llm()` function that:
  - Accepts `llm: LLM`, `messages: list[dict]`, `call_type: str`, `trace_id: str`, `max_tokens: int`, and optional `user_id: str`
  - For OpenAI: uses `client.chat.completions.create()`. If the model is GPT-5-mini, uses the `max_completion_tokens` parameter; otherwise uses `max_tokens`
  - For Anthropic: extracts the system prompt from the messages list and passes it separately via the `system=` kwarg. Concatenates text blocks from the response
  - Records every call to the database via `record_llm_call()` in a `finally` block (including trace_id, call_type, model, cost, latency)
  - Returns an `LLMResponse` with `content: str`, `usage: LLMUsage`, and `raw_response`
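A sketch of the dispatch-and-record shape described above. This is not the repository's implementation: the provider client is injected as a parameter (the real code builds Azure/Foundry clients internally), `record_llm_call` appends to an in-memory list rather than the database, and the `LLM` stub carries only the fields this sketch needs.

```python
import time
from dataclasses import dataclass
from typing import Any


@dataclass
class LLM:  # minimal stand-in for the full model dataclass
    name: str
    model_type: str
    input_cost: float = 0.0    # USD per 1M tokens
    output_cost: float = 0.0   # USD per 1M tokens


@dataclass
class LLMUsage:
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class LLMResponse:
    content: str
    usage: LLMUsage
    raw_response: Any


CALL_LOG: list[dict] = []  # stands in for the database table


def record_llm_call(**row) -> None:
    CALL_LOG.append(row)


async def call_llm(llm, messages, call_type, trace_id, max_tokens,
                   client, user_id=None) -> LLMResponse:
    """Dispatch to the provider-specific API; always record the call in `finally`."""
    start = time.monotonic()
    usage = None
    try:
        if llm.model_type == "openai":
            # GPT-5-mini expects max_completion_tokens instead of max_tokens
            token_kwarg = ("max_completion_tokens"
                           if "gpt-5-mini" in llm.name.lower() else "max_tokens")
            raw = await client.chat.completions.create(
                model=llm.name, messages=messages, **{token_kwarg: max_tokens})
            content = raw.choices[0].message.content
            usage = LLMUsage(raw.usage.prompt_tokens, raw.usage.completion_tokens)
        else:
            # Anthropic: the system prompt goes in its own kwarg, not the message list
            system = "\n".join(m["content"] for m in messages if m["role"] == "system")
            chat = [m for m in messages if m["role"] != "system"]
            raw = await client.messages.create(
                model=llm.name, system=system, messages=chat, max_tokens=max_tokens)
            # responses carry a list of content blocks; keep only the text ones
            content = "".join(b.text for b in raw.content
                              if getattr(b, "type", "text") == "text")
            usage = LLMUsage(raw.usage.input_tokens, raw.usage.output_tokens)
        return LLMResponse(content=content, usage=usage, raw_response=raw)
    finally:
        # runs on success and on failure, so every attempt is observable
        cost = (0.0 if usage is None else
                (usage.input_tokens * llm.input_cost
                 + usage.output_tokens * llm.output_cost) / 1e6)
        record_llm_call(trace_id=trace_id, call_type=call_type, model=llm.name,
                        cost=cost, latency=time.monotonic() - start, user_id=user_id)
```

Putting `record_llm_call()` in `finally` is the load-bearing detail: failed calls still produce a row with trace_id and latency, which is what makes provider errors debuggable after the fact.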
- Write a `detect_frameworks_from_code(code: str) -> set[str]` function that:
  - Parses import statements to identify ML frameworks: PyTorch, TensorFlow, JAX
  - Detects both direct imports and aliases
  - Returns a set of framework names found
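One way to satisfy this bullet is to walk the AST rather than regex-match source text; aliases then come for free, because `import torch as t` still carries `torch` as the module name. A sketch under that assumption (the module-to-framework mapping and the returned labels are illustrative, not the repository's):

```python
import ast

# Hypothetical mapping from top-level module names to framework labels
FRAMEWORK_MODULES = {
    "torch": "pytorch",
    "tensorflow": "tensorflow",
    "jax": "jax",
}


def detect_frameworks_from_code(code: str) -> set[str]:
    """Return the set of ML frameworks imported anywhere in `code`."""
    found: set[str] = set()
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return found  # unparseable snippets contribute nothing
    for node in ast.walk(tree):
        # `import torch as t` -> Import; `from torch.nn import Module` -> ImportFrom
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # `torch.nn` -> `torch`
            if root in FRAMEWORK_MODULES:
                found.add(FRAMEWORK_MODULES[root])
    return found
```

For example, `detect_frameworks_from_code("import torch as t\nfrom jax import numpy")` yields `{"pytorch", "jax"}`, while code importing only `numpy` yields the empty set.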
## Expected Outputs
- A Python module with the LLM dataclass, model instances, call_llm function, and framework detection
- The call_llm function must handle both providers with their specific quirks
- record_llm_call must be in a finally block for observability