Fix 10 failing tests: remove wrong assertions expecting import statements
inside extracted class code, use substring matching for UserDict class
signature, and rewrite click-dependent tests as project-local equivalents.
Add tests for resolve_instance_class_name, enhanced extract_init_stub_from_class,
and enrich_testgen_context instance resolution.
Add enrichment step that parses FTO parameter type annotations, resolves
types via jedi (following re-exports), and extracts full __init__ source
to give the LLM constructor context for typed parameters.
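The annotation-parsing half of this step can be sketched with the standard-library `ast` module. This is only an illustration, not the code in this change: the helper name `param_annotation_names` is hypothetical, and the jedi resolution and `__init__` extraction are left out.

```python
import ast


def param_annotation_names(source: str, func_name: str) -> dict:
    """Map each annotated parameter of `func_name` to the bare names in its annotation.

    Hypothetical helper for illustration; string annotations are not parsed.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == func_name:
            args = node.args
            params = [*args.posonlyargs, *args.args, *args.kwonlyargs]
            if args.vararg is not None:
                params.append(args.vararg)
            if args.kwarg is not None:
                params.append(args.kwarg)
            return {
                p.arg: {n.id for n in ast.walk(p.annotation) if isinstance(n, ast.Name)}
                for p in params
                if p.annotation is not None
            }
    return {}
```

Each collected name would then be handed to a resolver (jedi in this change) to locate the defining module.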
Move code_context_extractor.py and unused_definition_remover.py from
codeflash/context/ to codeflash/languages/python/context/ and update
all import sites.
Consolidate three enricher functions (get_imported_class_definitions,
get_external_base_class_inits, get_external_class_inits) into a single
enrich_testgen_context that parses code context once. Extract shared
helpers, unify prune_cst variants, deduplicate loop bodies, and remove
dead UsedNameCollector class.
Add BFS-based transitive resolution so that classes referenced in __init__
type annotations of imported external classes are also extracted. This gives
the LLM the constructor signatures it needs to instantiate parameter types.
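A rough sketch of the breadth-first traversal, assuming the references found in each class's `__init__` annotations are already available as a plain mapping. The function name, the mapping, and the optional depth limit are all hypothetical; the real resolver works on source code rather than a prebuilt dict.

```python
from collections import deque


def transitive_classes(roots, init_annotation_refs, max_depth=None):
    """Return class names reachable from `roots`, in breadth-first discovery order.

    `init_annotation_refs` maps a class name to the class names that appear in
    its __init__ type annotations (an assumed, precomputed input).
    """
    seen = set(roots)
    order = list(roots)
    queue = deque((name, 0) for name in roots)
    while queue:
        name, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand past the depth limit
        for ref in init_annotation_refs.get(name, ()):
            if ref not in seen:  # the visited set also guards against cycles
                seen.add(ref)
                order.append(ref)
                queue.append((ref, depth + 1))
    return order
```

The visited set keeps mutually referencing classes from looping forever.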
Assignments that don't reference module-level definitions are now placed
right after imports. Only assignments that reference classes/functions
are placed after those definitions to prevent NameError.
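The placement rule can be approximated as follows. This sketch uses the standard-library `ast` module as a stand-in for the libcst-based transformers, and the function name `split_assignments` is hypothetical.

```python
import ast


def split_assignments(source: str):
    """Split module-level assignments into (early, late) source strings.

    `early` assignments reference no module-level class or function and can sit
    right after imports; `late` ones must follow the definitions they use.
    """
    tree = ast.parse(source)
    defined = {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }
    early, late = [], []
    for node in tree.body:
        if isinstance(node, (ast.Assign, ast.AnnAssign)):
            # Only names that are read (Load context) can trigger NameError.
            used = {
                n.id
                for n in ast.walk(node)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
            }
            (late if used & defined else early).append(ast.unparse(node))
    return early, late
```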
When LLM-generated optimizations included module-level function calls such as
`_register(MessageKind.ASK, ...)`, the calls were inserted right after
imports, before the function definition they reference, causing NameError
at module load time.
Changes:
- Add GlobalStatementTransformer to append global statements at module end
- Reorder transformations: functions → assignments → statements
- Remove unused ImportInserter class
- Update test expectations to reflect new placement behavior
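The failure mode and the effect of appending statements at module end can be reproduced with a small stand-alone sketch. The two module sources and the `loads_cleanly` helper are made up for illustration and are not taken from the test suite.

```python
# A call placed before the definition it needs (the old placement).
BROKEN = """
REGISTRY = []
_register("ask")

def _register(kind):
    REGISTRY.append(kind)
"""

# The same call appended at module end (the new placement).
FIXED = """
REGISTRY = []

def _register(kind):
    REGISTRY.append(kind)

_register("ask")
"""


def loads_cleanly(module_source: str) -> bool:
    """Execute the source as a fresh module and report whether it avoids NameError."""
    try:
        exec(compile(module_source, "<sketch>", "exec"), {})
    except NameError:
        return False
    return True
```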
When LLM-generated optimizations included module-level code such as
`_REIFIERS = {MessageKind.XXX: ...}`, the global assignment was inserted
right after imports, before the class definition it referenced, causing
NameError at module load time.
Changes:
- GlobalAssignmentTransformer now inserts assignments after all
  class/function definitions instead of right after imports
- GlobalStatementCollector now skips AnnAssign (annotated assignments)
  so they are handled by GlobalAssignmentCollector instead
Classes used as dependencies (enums, dataclasses, types) were excluded
from the optimization context even when marked as used by the target
function. This caused NameError when the LLM used these types in
generated optimizations.
Add get_external_base_class_inits to extract __init__ methods from external
library base classes (e.g., collections.UserDict) when project classes inherit
from them. This helps the LLM understand constructor signatures for mocking.
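As a rough illustration of the idea, the first inherited `__init__` whose source is available can be located with `inspect`. This is a stand-in technique, not the code in this change: `external_base_init_source` is a hypothetical name, and the sketch makes no attempt to tell project classes from external ones.

```python
import collections
import inspect
import textwrap


def external_base_init_source(cls):
    """Return (base class name, dedented __init__ source) for the first base
    class that defines its own __init__ in Python source, or None."""
    for base in cls.__mro__[1:]:
        if base is object:
            continue
        init = base.__dict__.get("__init__")
        if init is None:
            continue
        try:
            return base.__qualname__, textwrap.dedent(inspect.getsource(init))
        except (OSError, TypeError):
            continue  # C-implemented or otherwise source-less __init__
    return None


class SettingsDict(collections.UserDict):
    """Example project class inheriting from an external base."""
```

Calling `external_base_init_source(SettingsDict)` yields the `UserDict` constructor source, which is the kind of text the LLM needs to see.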
Add GlobalFunctionCollector and GlobalFunctionTransformer to collect and
insert module-level function definitions introduced by LLM optimizations.
This fixes NameError when optimized code introduces new helper functions
like @lru_cache decorated functions that are used by the optimized method.
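The collection step can be sketched as a diff of top-level function names between the original and the optimized module. The sketch below uses `ast` in place of libcst, and `new_module_functions` is a hypothetical name.

```python
import ast

_FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def new_module_functions(original: str, optimized: str) -> dict:
    """Map the name of each top-level function that exists only in `optimized`
    to its source text, decorators included."""
    existing = {n.name for n in ast.parse(original).body if isinstance(n, _FUNC_TYPES)}
    return {
        n.name: ast.unparse(n)
        for n in ast.parse(optimized).body
        if isinstance(n, _FUNC_TYPES) and n.name not in existing
    }
```

The returned definitions are the ones that must be copied into the target file before any code that calls them.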
GlobalAssignmentCollector handled cst.Assign but not cst.AnnAssign
(annotated assignments like `X: int = 1`). When the LLM generated
optimizations with annotated module-level variables, they were not
copied to the target file, causing NameError at runtime.
- Add visit_AnnAssign to GlobalAssignmentCollector
- Add leave_AnnAssign to GlobalAssignmentTransformer
- Update type hints to include cst.AnnAssign
- Add test for annotated assignment handling
- Update test_get_imported_class_definitions_includes_dataclass_decorators
  to expect both base class and derived class to be extracted
- Add test_get_imported_class_definitions_extracts_multilevel_inheritance
  to verify multi-level inheritance chains are fully extracted
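For the annotated-assignment handling described above, a collector that treats both assignment forms alike might look like the sketch below. It uses `ast.NodeVisitor` as a stand-in for the libcst visitor, and the class name `ModuleAssignmentCollector` is hypothetical.

```python
import ast


class ModuleAssignmentCollector(ast.NodeVisitor):
    """Collect assignments outside function and class bodies, keyed by target name."""

    def __init__(self):
        self.assignments = {}

    def visit_FunctionDef(self, node):
        pass  # do not descend: locals are not global assignments

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_ClassDef(self, node):
        pass  # do not descend: class attributes are not global assignments

    def visit_Assign(self, node):
        for target in node.targets:
            if isinstance(target, ast.Name):
                self.assignments[target.id] = node

    def visit_AnnAssign(self, node):
        # A bare declaration such as `Z: str` binds nothing, so skip it.
        if isinstance(node.target, ast.Name) and node.value is not None:
            self.assignments[node.target.id] = node
```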
When generating tests, the LLM now receives class definitions for
types imported from project modules. This helps the LLM understand:
- Constructor signatures (avoiding incorrect argument guessing)
- Base classes (e.g., abstract classes that can't be instantiated)
- Class structure for creating proper test instances
Previously, the LLM only saw import statements like:
    from mypackage.elements import Element
Now it also sees the actual class definition with constructor details.
Changes:
- Add get_imported_class_definitions() to extract class definitions
  from project modules referenced in import statements
- Integrate into get_code_optimization_context() to include extracted
  classes in testgen context
- Gracefully handle token limits by dropping class definitions if needed
- Add 4 unit tests covering extraction, deduplication, and filtering
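Two of the steps above, reading `from ... import ...` statements and falling back when the token limit is exceeded, can be sketched as follows. Both function names are hypothetical, and the whitespace-based token count is an assumption standing in for the real tokenizer.

```python
import ast


def imported_names(code: str):
    """List (module, name) pairs from absolute `from module import name` statements."""
    pairs = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            pairs.extend((node.module, a.name) for a in node.names if a.name != "*")
    return pairs


def with_class_definitions(context, class_defs, max_tokens,
                           count_tokens=lambda text: len(text.split())):
    """Prepend class definitions to the context, or drop them all if the
    enriched text would exceed `max_tokens`."""
    enriched = "\n\n".join([*class_defs, context])
    return enriched if count_tokens(enriched) <= max_tokens else context
```

Dropping every definition at once is the simplest fallback; whether the real code drops them all or one at a time is not stated here.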