The pytest collection subprocess only added module_root's parent to
PYTHONPATH, which works when module_root is a package (e.g. src/aviary)
but fails when it is the source root itself (e.g. src). Now both
module_root and its parent are added so imports like
`from mypkg.core import func` resolve correctly in either layout.
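
A minimal sketch of the path handling described above. The helper name
build_collection_pythonpath is hypothetical and only illustrates putting
both module_root and its parent on PYTHONPATH; the real collection code
may assemble the environment differently.

```python
import os
from pathlib import Path


def build_collection_pythonpath(module_root: Path, existing: str = "") -> str:
    """Return a PYTHONPATH value containing module_root and its parent.

    Hypothetical helper: covers both the "module_root is a package" layout
    and the "module_root is the source root" layout.
    """
    entries = [str(module_root), str(module_root.parent)]
    if existing:
        entries.extend(existing.split(os.pathsep))

    # Drop empty entries and duplicates while preserving order.
    seen = set()
    unique = []
    for entry in entries:
        if entry and entry not in seen:
            seen.add(entry)
            unique.append(entry)
    return os.pathsep.join(unique)
```

The result would be passed as env["PYTHONPATH"] to the collection
subprocess.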

AIClient.post() now catches ValueError (parent of JSONDecodeError)
when the server returns 200 with an empty or malformed body, converting
it to AIServiceError. All callers already handle AIServiceError, so the
pipeline degrades cleanly instead of crashing with an unhandled exception.
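
The conversion can be sketched as follows. AIServiceError here is a local
stand-in for the project's exception, and parse_json_body is a
hypothetical helper rather than the actual body of AIClient.post().

```python
import json


class AIServiceError(Exception):
    """Local stand-in for the project's AIServiceError."""


def parse_json_body(body: str) -> object:
    """Decode a response body, mapping decode failures to AIServiceError."""
    try:
        return json.loads(body)
    except ValueError as exc:
        # json.JSONDecodeError is a subclass of ValueError, so this also
        # covers empty and malformed bodies returned with a 200 status.
        raise AIServiceError(f"invalid JSON in response body: {exc}") from exc
```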

TimeoutExpired in execute_test_subprocess was propagating unhandled,
crashing the pipeline instead of degrading to "no baseline." Now
caught and returned as CompletedProcess(returncode=-1). Subprocess
timeout scales with test file count (120s base + 60s/file, cap 600s)
instead of a fixed 10-minute wait for small suites.
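
A sketch of the two pieces, using the numbers from the message above.
The names scaled_timeout and run_with_timeout are hypothetical; only the
120s/60s/600s values and the returncode=-1 convention come from this
change.

```python
import subprocess

BASE_TIMEOUT_S = 120      # base allowance
PER_FILE_TIMEOUT_S = 60   # extra allowance per test file
MAX_TIMEOUT_S = 600       # cap (the old fixed 10-minute wait)


def scaled_timeout(test_file_count: int) -> int:
    """Scale the subprocess timeout with the number of test files."""
    return min(BASE_TIMEOUT_S + PER_FILE_TIMEOUT_S * test_file_count, MAX_TIMEOUT_S)


def run_with_timeout(cmd: list, timeout: float) -> subprocess.CompletedProcess:
    """Run cmd, turning a timeout into a failed CompletedProcess."""
    try:
        return subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=False
        )
    except subprocess.TimeoutExpired as exc:
        # Degrade instead of crashing: callers see a non-zero return code.
        return subprocess.CompletedProcess(
            args=cmd,
            returncode=-1,
            stdout="",
            stderr=f"timed out after {exc.timeout}s",
        )
```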

Add tests covering overlay isolation, concurrent dispatch, thread safety,
exception handling, and the full evaluate_candidate_isolated flow
with mocked subprocess execution.

Previously, a failed baseline would still block on
candidates_future.result(), hanging the process if the AI service
was slow. Now the baseline is checked first, and the future is
cancelled when the baseline fails.
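
The ordering can be sketched like this. collect_candidates is a
hypothetical name; note that Future.cancel() only prevents a call that
has not started yet and does not interrupt one that is already running.

```python
from concurrent.futures import Future


def collect_candidates(baseline_ok: bool, candidates_future: Future) -> list:
    """Check the baseline before waiting on the candidates future."""
    if not baseline_ok:
        # Do not block on result(); cancel() returns False if the call
        # is already running, in which case it is simply abandoned.
        candidates_future.cancel()
        return []
    return candidates_future.result()
```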

Evaluates candidates concurrently using ThreadPoolExecutor with project
overlays for isolation. Each candidate gets its own symlinked copy of
the project so test subprocesses don't interfere with each other or
the original source. Shared result lists are protected with a
threading.Lock.
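
A minimal sketch of concurrent dispatch with lock-protected result
lists. evaluate_all and its evaluate callback are hypothetical; in the
real flow each call would run tests inside that candidate's own overlay.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def evaluate_all(candidates, evaluate, max_workers=4):
    """Evaluate candidates in parallel, collecting results and errors."""
    results = []
    errors = []
    lock = threading.Lock()

    def worker(candidate):
        try:
            outcome = evaluate(candidate)
        except Exception as exc:  # one bad candidate must not stop the rest
            with lock:
                errors.append((candidate, exc))
        else:
            with lock:
                results.append((candidate, outcome))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consume the iterator so every worker has finished on return.
        list(pool.map(worker, candidates))
    return results, errors
```

Results arrive in completion order, so callers that need a stable order
should sort them.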

Introduces symlink-based temporary directories that mirror the project
root, replacing only the target module file with candidate code. This
allows test subprocesses to run against candidate code without mutating
the original source on disk, enabling safe parallel evaluation.
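
One way such an overlay could be built is sketched below. create_overlay
is a hypothetical helper, not the actual implementation: it creates real
directories only along the path to the target file, symlinks everything
else, and writes the candidate code in place of the target. It assumes
a platform where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def create_overlay(project_root: Path, target: Path, candidate_code: str) -> Path:
    """Mirror project_root via symlinks, replacing only the target file."""
    project_root = project_root.resolve()
    rel_parts = target.resolve().relative_to(project_root).parts
    overlay_root = Path(tempfile.mkdtemp(prefix="overlay-"))

    src_dir, dst_dir = project_root, overlay_root
    for depth, part in enumerate(rel_parts):
        # Symlink every sibling; the entry leading to the target is
        # recreated as a real directory (or, at the end, a real file).
        for entry in src_dir.iterdir():
            if entry.name != part:
                os.symlink(
                    entry, dst_dir / entry.name, target_is_directory=entry.is_dir()
                )
        src_dir, dst_dir = src_dir / part, dst_dir / part
        if depth < len(rel_parts) - 1:
            dst_dir.mkdir()

    # dst_dir now names the target file inside the overlay.
    dst_dir.write_text(candidate_code)
    return overlay_root
```

The caller is responsible for removing the overlay directory afterwards,
for example with shutil.rmtree, which unlinks symlinks without following
them.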

Start the AI service HTTP call for candidate generation concurrently
with baseline subprocess runs. The HTTP call doesn't depend on
baseline results, so it runs in a background thread while behavioral
tests, line profiling, and benchmarking execute. Saves 10-20s per
function (the AI round-trip time that was previously sequential).
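
The overlap can be sketched as below. run_overlapped, fetch_candidates,
and run_baseline are hypothetical names standing in for the HTTP call
and the baseline subprocess runs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_overlapped(fetch_candidates, run_baseline):
    """Run the candidate request in the background during the baseline."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Submit first so the request is in flight while the baseline runs.
        candidates_future = pool.submit(fetch_candidates)
        baseline = run_baseline()
        candidates = candidates_future.result()
    return baseline, candidates
```

This simplified version always waits for the request; code that needs to
bail out early on a failed baseline would avoid the blocking result()
call and the blocking exit of the with block.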

Run line profiling and benchmarking subprocesses concurrently via
ThreadPoolExecutor instead of sequentially. Each writes to a unique
JUnit XML file to avoid collisions. Saves 20-60s per function
(the duration of the shorter subprocess).
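
A sketch of giving each concurrent run its own result file. The helper
names and the uuid-based naming scheme are assumptions for illustration;
the message above only states that each subprocess writes a unique JUnit
XML file.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def unique_xml_path(directory: Path, label: str) -> Path:
    """Build a collision-free JUnit XML path for one subprocess run."""
    return directory / f"{label}-{uuid.uuid4().hex}.xml"


def run_both(run_profile, run_benchmark, out_dir: Path):
    """Run both steps in parallel, each with its own result file."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        profile_future = pool.submit(
            run_profile, unique_xml_path(out_dir, "line_profile")
        )
        benchmark_future = pool.submit(
            run_benchmark, unique_xml_path(out_dir, "benchmark")
        )
        return profile_future.result(), benchmark_future.result()
```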

Add missing # noqa: PLC0415 comments to deferred imports that lost
them during ruff --fix reformatting. Add unittest to TYPE_CHECKING
block in discovery.py so annotations resolve. Fix import sorting.

Move jedi, libcst, dill, and unittest imports from module level to
first use in _function_optimizer, _optimizer, _orchestrator, and
test_discovery. Pipeline __init__.py is now lazy via __getattr__.
Optimizer module load: 3.7s → 51ms. All 2591 tests pass.
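
The deferred-import pattern looks roughly like this. The stdlib decimal
module stands in for a heavy dependency such as libcst or jedi, and
to_decimal is a hypothetical function.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers, so annotations still resolve
    # without paying the import cost at module load.
    import decimal


def to_decimal(text: str) -> decimal.Decimal:
    """Parse text into a Decimal, importing the module on first use."""
    import decimal  # noqa: PLC0415  (deliberately deferred import)

    return decimal.Decimal(text)
```

Because of the __future__ import, the return annotation is never
evaluated at runtime, so the module-level name is not needed.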

Replace eager imports of all submodules with a __getattr__-based
lookup table. Submodules now load only when a name is accessed,
dropping `import codeflash_core` from ~74ms to ~4ms (98% reduction
from the original ~230ms). TYPE_CHECKING imports preserved for
static analysis. All 65 core + 2526 downstream tests pass.

Move requests, gitpython, sentry-sdk, and posthog imports from
module level into the functions that use them. This drops
`import codeflash_core` from ~230ms to ~74ms, making it viable
for lightweight consumers (e.g. the project detector) to depend
on core submodules without blowing startup budgets.
- _telemetry: defer sentry_sdk + posthog into init functions (20ms → 0.2ms)
- _git: use PEP 562 __getattr__ for lazy git import (59ms → 4ms)
- _platform: defer requests + sentry_sdk into methods (55ms → 1ms)
- _client: defer requests into post() method (72ms → 34ms)
- _http: add shared _make_session() factory for deferred Session creation

* Fix mypy errors and apply ruff formatting across packages
Fix ast.FunctionDef calls missing type_params for Python 3.12+,
correct type: ignore error codes in _comparator and _plugin, and
run ruff format on all package source and test files.

* Switch CI to prek for lint/typecheck checks
Use j178/prek-action for consistent lint+typecheck (ruff check,
ruff format, interrogate, mypy) matching local pre-commit config.
Keep test as a separate parallel job for test-env support.