The prek mypy hook runs on changed files and bypasses the pyproject.toml
tests/ exclude, surfacing pre-existing errors in both context.py and
test_context.py that block CI for this PR. Fixes applied:
- Import Language from language_enum instead of base (base re-exports are
not explicit; strict mypy flags attr-defined)
- Annotate _extract_class_declaration, _import_to_statement,
get_java_imported_type_skeletons, and resolved_imports
- Guard None start/end_line in _extract_function_source_by_lines and
find_helper_functions; guard None file_path in the import skeleton loop
- Drop unreachable `if not node: continue` in _extract_public_method_signatures
(JavaMethodNode.node is non-nullable)
- Add -> None to every test method and fix an `int | None` comparison in
test_context.py
All 880 Java tests pass after the change.
Add -> None return annotations and Path / JavaSupport parameter annotations
to every test method + fixture so the prek mypy hook passes when the file
is in the CI diff.
Replaces source-level JavaScript function tracing with Babel AST
transformation via babel-tracer-plugin.js and trace-runner.js. Adds
replay test generation, Python-side tracer runner, and --language
flag to the tracer CLI for explicit JS/TS routing.
Keep the single-JVM invocation from main that combines JFR with the
tracing agent, while preserving the fix's intent: raise when trace-db was
not created, and warn when the exit code is non-zero but trace-db exists.
Integration tests are rewritten to match the combined-invocation semantics.
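
The raise/warn rule above could be sketched roughly as follows. The function
name `check_trace_result` and its parameters are hypothetical, not the names
used in the codebase.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def check_trace_result(exit_code: int, trace_db: Path) -> None:
    """Hypothetical helper: decide how to react after the combined JVM run."""
    if not trace_db.exists():
        # Without the trace DB nothing downstream can work, so fail loudly.
        raise RuntimeError(
            f"Tracing failed (exit code {exit_code}): {trace_db} was not created"
        )
    if exit_code != 0:
        # The DB exists, so the data may still be usable; only warn.
        logger.warning(
            "JVM exited with code %d, but %s exists; continuing", exit_code, trace_db
        )
```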
Gradle evaluates all project configurations during the configuration
phase, even when only one module is targeted. Multi-module projects with
diverse toolchain requirements (e.g., OpenRewrite's rewrite-gradle needs
JDK 8) fail when an unrelated module's toolchain isn't available.
Adds --configure-on-demand to all 8 Gradle command construction sites
so Gradle only configures projects needed for the requested task.
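
A minimal sketch of what one command construction site might look like with
the flag added. `build_gradle_command` and its parameters are illustrative
assumptions; only `--configure-on-demand` comes from this change.

```python
from collections.abc import Sequence


def build_gradle_command(
    gradle_executable: str, task: str, extra_args: Sequence[str] = ()
) -> list[str]:
    """Hypothetical builder: always pass --configure-on-demand."""
    # With this flag Gradle configures only the projects the requested task needs.
    return [gradle_executable, task, "--configure-on-demand", *extra_args]
```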
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_run_java_with_graceful_timeout() discarded the subprocess exit code in
both the no-timeout and timeout paths. If Maven/Gradle failed (compilation
error, OOM, etc.), the tracer silently continued with missing/stale data.
Now returns the exit code. Stage 1 (JFR profiling) warns on failure but
continues. Stage 2 (argument capture) raises RuntimeError since trace
data is essential for replay test generation.
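
Roughly, the new behaviour could look like the sketch below. It is not the
actual implementation: `run_with_graceful_timeout`, `run_tracer_stages`, and
the 5-second grace period are assumptions made for illustration.

```python
import logging
import subprocess
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def run_with_graceful_timeout(cmd: Sequence[str], timeout: Optional[float] = None) -> int:
    """Run cmd and return its exit code on both the normal and the timeout path."""
    proc = subprocess.Popen(list(cmd))
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()  # ask the process to stop first
        try:
            return proc.wait(timeout=5)  # assumed grace period
        except subprocess.TimeoutExpired:
            proc.kill()  # force-stop if it ignored the request
            return proc.wait()


def run_tracer_stages(profiling_cmd: Sequence[str], capture_cmd: Sequence[str]) -> None:
    """Stage 1 failure is tolerated with a warning; Stage 2 failure is fatal."""
    code = run_with_graceful_timeout(profiling_cmd)
    if code != 0:
        logger.warning("Stage 1 (JFR profiling) exited with code %d; continuing", code)
    code = run_with_graceful_timeout(capture_cmd)
    if code != 0:
        raise RuntimeError(f"Stage 2 (argument capture) failed with exit code {code}")
```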
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep only repeatString, which reliably produces a 284% improvement.
Drop computeSum (marginal 16%), and filterEvens and instanceMethod (no
optimization found). Reduces the tracer E2E run from ~1h27m to ~21m.
Drop repeatString from the Workload fixture (2→1 function).
computeSum alone exercises the full tracer→optimizer pipeline
(trace → replay tests → optimize → evaluate → rank → explain → review).
The second function added no additional pipeline coverage.
Remove filterEvens and instanceMethod from the Workload fixture (4→2
functions) and reduce main() loop from 1000→100 rounds. The E2E test
only needs to verify the tracer→optimizer pipeline works end-to-end;
it doesn't need 4 functions or 1604 replay tests to prove that.
Expected impact: ~2 functions × ~8 candidates × fewer replay tests
should bring the job from ~75 min down to ~10-15 min.
- Remove dead `import shutil` from test_comparator.py
- Rename `requires_java` → `requires_java_runtime` for consistency with test_run_and_parse.py
- Remove redundant `@requires_java_runtime` on test_behavior_return_value_correctness (class already has it)
Apply @requires_java_runtime to TestJavaRunAndParseBehavior and
TestJavaRunAndParsePerformance at the class level. The performance
test was failing on Windows with a flaky 10ms timing assertion
(10.515ms actual, 5% tolerance) — pre-existing issue masked by
continue-on-error.
Same fix as test_comparator.py — uses _find_comparator_jar() to skip
when the codeflash-runtime JAR isn't built. Fixes Windows unit-tests
which don't have Java pre-installed (unlike Linux runners).
Ubuntu runners have Java/Maven pre-installed, so checking for java/mvn
binaries doesn't skip. The actual dependency is the codeflash-runtime
JAR which must be built from codeflash-java-runtime/ via Maven.
The requires_java marker only checked for java binary but the tests
also need mvn to build the codeflash-runtime JAR. These 13 tests
were silently failing in unit-tests (masked by continue-on-error).
Normalize paths to forward slashes in JS/TS code generation and coverage
parsing — backslashes are escape chars in JavaScript strings and cause
silent corruption on Windows. Also relax timing test thresholds for CI.
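
A minimal sketch of the normalization, assuming a helper named `to_js_path`
(hypothetical). Note that a plain replace would also rewrite a literal
backslash in a POSIX file name, which is rare but possible.

```python
import json


def to_js_path(path: str) -> str:
    """Convert backslashes to forward slashes so the path is safe in JS source."""
    return path.replace("\\", "/")


def js_require_line(path: str) -> str:
    # json.dumps yields a quoted string literal that is also valid JavaScript.
    return f"require({json.dumps(to_js_path(path))});"
```

For example, `to_js_path(r"C:\proj\src\utils.ts")` returns
`C:/proj/src/utils.ts`. Embedded as-is, the `\u` in that path would be read by
JavaScript as the start of a Unicode escape.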
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The regex for extracting modules from settings.gradle only matched
single-line include statements. Multi-line includes like eureka's
(include 'a',\n 'b',\n 'c') only captured the first module, causing
test_module to be None and breaking multi-module path resolution
(e.g., classfiles lookup for JaCoCo coverage conversion).
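
One way to match multi-line includes is sketched below. This is an
illustration under assumptions, not the regex used in the fix; the names
`parse_included_modules`, `_INCLUDE_RE`, and `_QUOTED_RE` are hypothetical.

```python
import re

# An include keyword followed by one or more quoted names separated by commas.
# \s also matches newlines, so the list may continue across lines.
_INCLUDE_RE = re.compile(
    r"""\binclude\b\s*\(?\s*
        ((?:'[^']+'|"[^"]+")
         (?:\s*,\s*(?:'[^']+'|"[^"]+"))*)
    """,
    re.VERBOSE,
)
_QUOTED_RE = re.compile(r"""['"]([^'"]+)['"]""")


def parse_included_modules(settings_text: str) -> list[str]:
    """Return every module named in include statements, in file order."""
    modules: list[str] = []
    for match in _INCLUDE_RE.finditer(settings_text):
        for name in _QUOTED_RE.findall(match.group(1)):
            modules.append(name.lstrip(":"))  # drop Gradle's leading ':' if present
    return modules
```

With the input `include 'a',\n 'b',\n 'c'` this returns all three modules
instead of only the first.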
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Generalize _find_top_level_dependencies_block() into _find_top_level_block(name)
so it can find any top-level block (dependencies, repositories, etc.)
- Rewrite _ensure_maven_central_repo() to use tree-sitter instead of regex,
preventing false matches inside buildscript/subprojects/allprojects blocks
- Add _update_existing_codeflash_dependency() to replace stale versions or
old files() format with the current Maven Central coordinate
- Wire version update into add_codeflash_dependency() and
add_codeflash_dependency_multimodule() so old entries get updated instead
of silently skipped
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Parse listOf(...) patterns in settings.gradle.kts for projects that
build include lists dynamically (e.g. OpenRewrite)
- Use word boundary in include regex to avoid matching variable names
like 'includedProjects'
- Break module voting ties using codeflash.toml module-root config,
so the function's own module is preferred over cross-module tests
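
The tie-break in the last item could be sketched as follows. How votes are
counted and how the module-root setting maps to a module name are not shown
here, and `pick_test_module` is a hypothetical name.

```python
from collections import Counter
from typing import Optional


def pick_test_module(votes: Counter, preferred: Optional[str] = None) -> Optional[str]:
    """Pick the module with the most votes; on a tie, prefer the configured module."""
    if not votes:
        return preferred
    top = max(votes.values())
    tied = sorted(module for module, count in votes.items() if count == top)
    if preferred in tied:
        return preferred  # the function's own module wins the tie
    return tied[0]  # assumed fallback: a deterministic alphabetical pick
```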
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
**Problem**: When a project uses `jest.config.ts` (TypeScript config), the
generated runtime config tries to `require('./jest.config.ts')`, which fails
because Node.js CommonJS cannot parse TypeScript syntax without compilation.
**Error**: `SyntaxError: Missing initializer in const declaration` at the
TypeScript type annotation (e.g., `const config: Config = ...`).
**Impact**: Affected 18 out of 38 optimization runs (~47%) in initial testing.
All TypeScript projects using `jest.config.ts` were unable to run tests.
**Root Cause**: Line 386 in test_runner.py used `base_config_path.name`
directly without checking the extension. The generated runtime config is
always a `.js` file, so it cannot use `require()` on `.ts` files.
**Solution**: Check if `base_config_path` is a TypeScript file (.ts). If so,
create a standalone runtime config without trying to extend it via require().
Jest will still discover and use the original TypeScript config naturally.
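
A rough sketch of the extension check, assuming a hypothetical helper
`render_runtime_config` that returns the text of the generated `.js` config.
The real function in test_runner.py is not reproduced here.

```python
from pathlib import Path


def render_runtime_config(base_config_path: Path, overrides_js: str) -> str:
    """Return the JS source for the runtime config (illustrative only)."""
    if base_config_path.suffix == ".ts":
        # A .js file cannot require() TypeScript, so emit a standalone config.
        return f"module.exports = {overrides_js};\n"
    # A JavaScript base config can be extended through require() as before.
    return (
        f"const base = require('./{base_config_path.name}');\n"
        f"module.exports = {{ ...base, ...{overrides_js} }};\n"
    )
```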
**Testing**:
- Added comprehensive test in test_jest_typescript_config_bug.py
- Test creates a realistic TypeScript Jest config and verifies the generated
runtime config loads without syntax errors
- Existing 34 JavaScript test runner tests still pass
- No linting/type errors from `uv run prek`
**Trace IDs affected**: 0fd176bf-5c7f-4f41-8396-77c46be86412 and 17 others
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
PR #1968 changed JS to skip nested functions like Python, but the parity
test still expected JS to discover both outer and inner.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Discover void methods, instrument them by serializing the receiver instead
of a return value, and treat all-null comparisons as equivalent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
The inject_test_globals() function was adding duplicate framework imports,
causing parse errors like "Identifier 'describe' has already been declared".
This occurred when:
1. AI service generated tests WITH vitest imports
2. CLI called inject_test_globals() which added its own import
3. String-based duplicate check failed because identifiers had different order
Result: TWO import statements declaring the same identifiers → parse error.
## Solution
Replace string-based duplicate detection with regex-based detection that
catches ANY import from the framework, regardless of identifier order.
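
A minimal sketch of order-independent detection. The patterns and the names
`has_framework_import` and `inject_globals_sketch` are assumptions for
illustration and may differ from what the fix uses.

```python
import re


def _import_pattern(package: str) -> re.Pattern:
    # Matches "import ... from '<package>'" regardless of which identifiers
    # are imported or in what order. [^;] also spans newlines.
    return re.compile(
        rf"^\s*import\s+[^;]*?\bfrom\s+['\"]{re.escape(package)}['\"]",
        re.MULTILINE,
    )


FRAMEWORK_IMPORT_PATTERNS = {
    "vitest": _import_pattern("vitest"),
    "jest": _import_pattern("@jest/globals"),
    "mocha": _import_pattern("mocha"),
}


def has_framework_import(source: str, framework: str) -> bool:
    return bool(FRAMEWORK_IMPORT_PATTERNS[framework].search(source))


def inject_globals_sketch(source: str, framework: str, import_line: str) -> str:
    """Prepend import_line only when no import from the framework exists yet."""
    if has_framework_import(source, framework):
        return source
    return f"{import_line}\n{source}"
```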
## Changes
- Added regex patterns for Vitest, Jest, and Mocha imports
- Modified inject_test_globals() to use regex search
- Added comprehensive tests in test_inject_test_globals_duplicate.py
## Impact
Fixes HIGH severity bug blocking test generation for all Vitest projects
## Trace IDs
- 03a5a9d9-8e56-47e8-9c5e-0160fb9a529a
- 0be70f8d-884e-45e4-8fa2-28ed40cdf068
- 29c6d314-8561-4bb4-9b77-00b3b83943f0
The tracer e2e fixture and code_to_optimize/java pom.xml files had
hardcoded 1.0.0 dependency versions, causing compilation failures
in CI when only 1.0.1 is installed to ~/.m2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
Generated tests for ESM TypeScript projects were importing from relative
paths without .js extensions (e.g., `import X from './module'`), causing
ERR_MODULE_NOT_FOUND errors when tests run.
Node.js's ESM resolver requires explicit file extensions on relative
imports, and TypeScript does not rewrite import specifiers, so the import
must name the emitted .js file even when the source file is .ts.
## Solution
Added `add_js_extensions_to_relative_imports()` function that:
- Adds .js extensions to relative imports (./x or ../x) without extensions
- Preserves imports that already have extensions (.js, .ts, etc.)
- Leaves non-relative imports (node modules) unchanged
- Only runs for ESM projects (CommonJS doesn't need extensions)
Integrated into test processing pipeline after module system conversion.
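
The rewriting rule can be illustrated with the sketch below. It is a
simplified stand-in, not the code of `add_js_extensions_to_relative_imports()`.
The regex and the list of known extensions are assumptions.

```python
import re

# from '<spec>', import '<spec>' and import('<spec>') where spec starts with ./ or ../
_RELATIVE_IMPORT_RE = re.compile(
    r"""(\bfrom\s+|\bimport\s*\(\s*|\bimport\s+)(['"])(\.{1,2}/[^'"]*)\2"""
)
_KNOWN_EXTENSIONS = (".js", ".mjs", ".cjs", ".ts", ".tsx", ".jsx", ".json")


def add_js_extensions_sketch(source: str) -> str:
    """Append .js to relative import specifiers that have no known extension."""

    def fix(match: re.Match) -> str:
        prefix, quote, spec = match.groups()
        if spec.endswith(_KNOWN_EXTENSIONS) or spec.endswith("/"):
            return match.group(0)  # already has an extension, or is a directory
        return f"{prefix}{quote}{spec}.js{quote}"

    return _RELATIVE_IMPORT_RE.sub(fix, source)
```

Bare specifiers such as `'fs'` or `'lodash'` never match the pattern, so
package imports are left untouched.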
## Testing
- Added 7 unit tests covering various import patterns
- All 35 module_system tests pass
- All 315 JavaScript language tests pass
- Verified fix resolves ERR_MODULE_NOT_FOUND for trace 17751b8f-fa61-48bc-bdee-b924f0c7afc4
## References
Trace IDs with this issue: 17751b8f-fa61-48bc-bdee-b924f0c7afc4,
3b985200-a906-4c54-a685-df40361d6b2c, 91795877-3ccf-482c-86bd-748834b76f6e,
0298c59c-8980-4aed-b05d-b94940a6544f, ec2864a4-0de0-4ce9-9ec8-b545c82a4f53
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
find_fields() was called without a class_name filter, causing fields from
inner/anonymous classes to be injected into the outer target class. Now
scoped to target_method.class_name using the existing filter parameter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wildcard imports like `import org.jooq.*` expand to 870+ types, causing
5 minutes of disk I/O per function before the token budget check kicks in.
89% of jOOQ functions were skipped due to this.
When a wildcard expands to >50 types, filter to only types referenced in
the target method's code. This turns a 5-minute failure into a <1 second
resolution with only the relevant types included.
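
A sketch of the filtering idea, assuming the wildcard has already been
expanded to fully qualified names. `filter_wildcard_types` is hypothetical,
and identifier matching by regex is a simplification of whatever the real
code does.

```python
import re

WILDCARD_FILTER_THRESHOLD = 50  # threshold stated in this change


def filter_wildcard_types(
    expanded_types: list[str],
    method_source: str,
    threshold: int = WILDCARD_FILTER_THRESHOLD,
) -> list[str]:
    """Keep only types whose simple name appears in the target method's code."""
    if len(expanded_types) <= threshold:
        return expanded_types  # small expansions are resolved in full
    # Capitalized identifiers are treated as candidate type names (an assumption).
    referenced = set(re.findall(r"\b[A-Z]\w*", method_source))
    return [t for t in expanded_types if t.rsplit(".", 1)[-1] in referenced]
```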
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instrumented test files fail Spotless format checks on projects like
Apache Flink, Kafka, and Beam. Adds -Dspotless.check.skip=true and
-Dspotless.apply.skip=true to Maven commands, and disables the
spotlessCheck, spotlessApply, spotlessJava, spotlessKotlin, and
spotlessScala tasks in the Gradle init script.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>