codeflash-internal/tiles/codeflash-internal-docs/evals/summary_infeasible.json
2026-02-14 22:25:30 -05:00

{
  "infeasible_capabilities": [
    {
      "id": 1,
      "name": "domain-models-relationships",
      "reason": "The OptimizationFeatures, OptimizationEvents, and Repositories Django models are primarily data storage schemas. Testing whether an agent 'knows' their field names is a trivia quiz, not a realistic coding task. The models are partially covered by scenario-2 (schema conventions) but a dedicated scenario would devolve into rote recall rather than applied knowledge."
    },
    {
      "id": 2,
      "name": "optimization-pipeline-flow",
      "reason": "The full 6-step pipeline flow is an architectural overview that spans multiple modules and async patterns. A realistic eval would require the agent to orchestrate actual asyncio.TaskGroup calls with real LLM clients and context objects, which cannot be validated in a static code output scenario. Individual steps (distribution, context, postprocessing) are tested separately in scenarios 1, 3, and 5."
    }
  ]
}