# Task: Implement a Model Distribution Calculator

## Context

You are working on the codeflash-internal aiservice backend. The optimization pipeline distributes LLM calls across OpenAI and Anthropic models in parallel. The distribution logic lives in `core/shared/optimizer_config.py`.

You need to write a Python function that replicates the model distribution logic, a second function that calculates the LLM cost for a single provider given usage data, and a third that estimates the total cost of a full optimization run.

## Task
1. Write a function `get_model_distribution(n_candidates: int, max_calls: int) -> list[tuple[str, int]]` that:
   - Takes the number of requested candidates and the maximum allowed parallel calls
   - Computes `total = min(n_candidates, max_calls)`
   - Splits between OpenAI and Anthropic using the formula: `claude_calls = (total - 1) // 2`, `gpt_calls = total - claude_calls`
   - Returns a list of `(model_name, call_count)` tuples, using `"openai"` and `"anthropic"` as model names
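A minimal sketch of this distribution rule (an illustrative implementation based only on the formula above, not the actual code from `optimizer_config.py`):

```python
def get_model_distribution(n_candidates: int, max_calls: int) -> list[tuple[str, int]]:
    # Cap the total number of parallel LLM calls.
    total = min(n_candidates, max_calls)
    # Anthropic gets just under half of the calls; OpenAI gets the rest.
    claude_calls = (total - 1) // 2
    gpt_calls = total - claude_calls
    return [("openai", gpt_calls), ("anthropic", claude_calls)]
```

Because of the `(total - 1) // 2` split, OpenAI always receives at least as many calls as Anthropic.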
2. Write a function `calculate_optimization_cost(input_tokens: int, output_tokens: int, cached_input_tokens: int, provider: str) -> float` that:
   - Computes the cost in USD given token counts
   - For the `"openai"` provider: `cached_input_tokens` is a subset of `input_tokens`, so non-cached input = `input_tokens - cached_input_tokens`. Use GPT-5-mini pricing: $0.25 input, $0.03 cached input, $2.00 output per 1M tokens.
   - For the `"anthropic"` provider: `cached_input_tokens` is additive to `input_tokens` (they are separate counts). Use Claude Sonnet 4.5 pricing: $3.00 input, $15.00 output per 1M tokens (no cached discount).
3. Write a function `estimate_full_run_cost(n_candidates: int, avg_input_tokens: int, avg_output_tokens: int, avg_cached_tokens: int) -> float` that:
   - Uses `get_model_distribution` with `MAX_OPTIMIZER_CALLS = 6`
   - For each provider's call count, calculates cost using `calculate_optimization_cost`
   - Returns the total estimated cost
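Putting the pieces together, a self-contained sketch (it repeats compact versions of the two helpers so it runs standalone; `MAX_OPTIMIZER_CALLS = 6` is taken from the task description):

```python
MAX_OPTIMIZER_CALLS = 6  # constant given in the task description

def get_model_distribution(n_candidates: int, max_calls: int) -> list[tuple[str, int]]:
    total = min(n_candidates, max_calls)
    claude_calls = (total - 1) // 2
    return [("openai", total - claude_calls), ("anthropic", claude_calls)]

def calculate_optimization_cost(
    input_tokens: int, output_tokens: int, cached_input_tokens: int, provider: str
) -> float:
    if provider == "openai":
        return ((input_tokens - cached_input_tokens) * 0.25
                + cached_input_tokens * 0.03 + output_tokens * 2.00) / 1_000_000
    return ((input_tokens + cached_input_tokens) * 3.00
            + output_tokens * 15.00) / 1_000_000

def estimate_full_run_cost(
    n_candidates: int, avg_input_tokens: int, avg_output_tokens: int, avg_cached_tokens: int
) -> float:
    # Sum (per-call cost) x (call count) across both providers.
    return sum(
        calls * calculate_optimization_cost(
            avg_input_tokens, avg_output_tokens, avg_cached_tokens, provider
        )
        for provider, calls in get_model_distribution(n_candidates, MAX_OPTIMIZER_CALLS)
    )
```

With `n_candidates=5` and averages of 1M input / 0 output / 0 cached tokens, the estimate is 3 × $0.25 + 2 × $3.00 = $6.75.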
## Expected Outputs

- A Python module with all three functions
- The distribution for `n_candidates=5, max_calls=6` should produce 3 OpenAI + 2 Anthropic calls
- The distribution for `n_candidates=6, max_calls=6` should produce 4 OpenAI + 2 Anthropic calls