Replaces source-level JavaScript function tracing with Babel AST
transformation via babel-tracer-plugin.js and trace-runner.js. Adds
replay test generation, Python-side tracer runner, and --language
flag to the tracer CLI for explicit JS/TS routing.
Keep the single JVM invocation from main that combines JFR and the
tracing agent, while preserving the fix's intent: raise when trace-db was
not created, and warn when the exit code is non-zero but trace-db exists.
Integration tests are rewritten to match the combined-invocation semantics.
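
The raise/warn rule above can be sketched roughly as follows. This is a
minimal illustration, not the tracer's actual code: the function name
check_trace_result and its parameters are hypothetical.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def check_trace_result(exit_code: int, trace_db: Path) -> None:
    """Raise if the trace DB is missing; only warn if the JVM exited non-zero.

    Hypothetical helper: name and signature are assumptions for illustration.
    """
    if not trace_db.exists():
        # Without a trace DB there is nothing to build replay tests from.
        raise RuntimeError(
            f"trace DB was not created at {trace_db} (exit code {exit_code})"
        )
    if exit_code != 0:
        # The DB exists, so the captured data may still be usable.
        logger.warning(
            "JVM exited with code %d but %s exists; continuing", exit_code, trace_db
        )
```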
Gradle evaluates all project configurations during the configuration
phase, even when only one module is targeted. Multi-module projects with
diverse toolchain requirements (e.g., OpenRewrite's rewrite-gradle needs
JDK 8) fail when an unrelated module's toolchain isn't available.
Adds --configure-on-demand to all 8 Gradle command construction sites
so Gradle only configures projects needed for the requested task.
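
A minimal sketch of one such command construction site, assuming a
hypothetical helper named build_gradle_command (the real sites may look
different). Only the --configure-on-demand flag is taken from this change.

```python
from typing import List, Optional


def build_gradle_command(
    gradle_executable: str, task: str, module: Optional[str] = None
) -> List[str]:
    """Build a Gradle invocation that configures only the projects it needs.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    # Qualify the task with the module path when a single module is targeted.
    qualified_task = f":{module}:{task}" if module else task
    return [gradle_executable, "--configure-on-demand", qualified_task]
```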
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_run_java_with_graceful_timeout() discarded the subprocess exit code in
both the no-timeout and timeout paths. If Maven/Gradle failed (compilation
error, OOM, etc.), the tracer silently continued with missing/stale data.
Now returns the exit code. Stage 1 (JFR profiling) warns on failure but
continues. Stage 2 (argument capture) raises RuntimeError since trace
data is essential for replay test generation.
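
The per-stage policy could look roughly like this. It is a sketch under
assumptions: run_stage and its essential parameter are hypothetical, and
the graceful-timeout handling of the real function is left out.

```python
import logging
import subprocess
from typing import List

logger = logging.getLogger(__name__)


def run_stage(command: List[str], stage_name: str, essential: bool) -> int:
    """Run a command and surface its exit code instead of discarding it.

    Hypothetical helper; timeout handling is intentionally omitted.
    """
    exit_code = subprocess.run(command, check=False).returncode
    if exit_code != 0:
        if essential:
            # e.g. argument capture: replay tests cannot be generated without it.
            raise RuntimeError(f"{stage_name} failed with exit code {exit_code}")
        # e.g. JFR profiling: useful but not required, so keep going.
        logger.warning("%s failed with exit code %d; continuing", stage_name, exit_code)
    return exit_code
```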
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep only repeatString, which reliably produces a 284% improvement.
Drop computeSum (marginal 16%), filterEvens and instanceMethod (no
optimization found). Reduces tracer E2E from ~1h27m to ~21m.
Move codeflash's own benchmarks to .codeflash/benchmarks/. Add
auto-discovery of .codeflash/benchmarks/ in codeflash compare and
benchmark mode -- when benchmarks-root is not explicitly configured,
the CLI checks for .codeflash/benchmarks/ before erroring.
Backwards compatible: users with existing benchmarks-root config
are unaffected. Docs continue to show tests/benchmarks as the
example path.
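
The lookup order described above might be sketched like this. The
function name resolve_benchmarks_root and the error type are assumptions
for illustration, not the CLI's actual code.

```python
from pathlib import Path
from typing import Optional

DEFAULT_BENCHMARKS_DIR = Path(".codeflash") / "benchmarks"


def resolve_benchmarks_root(
    project_root: Path, configured: Optional[Path] = None
) -> Path:
    """Prefer an explicit benchmarks-root, then fall back to auto-discovery.

    Hypothetical helper: name, signature, and error type are assumptions.
    """
    if configured is not None:
        # Existing configs keep working unchanged.
        return configured
    candidate = project_root / DEFAULT_BENCHMARKS_DIR
    if candidate.is_dir():
        return candidate
    raise ValueError(
        "benchmarks-root is not configured and .codeflash/benchmarks/ was not found"
    )
```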
Drop repeatString from the Workload fixture (2→1 function).
computeSum alone exercises the full tracer→optimizer pipeline
(trace → replay tests → optimize → evaluate → rank → explain → review).
The second function added no additional pipeline coverage.
Remove filterEvens and instanceMethod from the Workload fixture (4→2
functions) and reduce main() loop from 1000→100 rounds. The E2E test
only needs to verify the tracer→optimizer pipeline works end-to-end;
it doesn't need 4 functions or 1604 replay tests to prove that.
Expected impact: ~2 functions × ~8 candidates × fewer replay tests
should bring the job from ~75 min down to ~10-15 min.
- test_benchmark_libcst_multi_file: discover_functions + get_code_optimization_context across 10 real source files
- test_benchmark_libcst_pipeline: full discover → extract → replace → merge pipeline on one file
- Remove dead `import shutil` from test_comparator.py
- Rename `requires_java` → `requires_java_runtime` for consistency with test_run_and_parse.py
- Remove redundant `@requires_java_runtime` on test_behavior_return_value_correctness (class already has it)
Apply @requires_java_runtime to TestJavaRunAndParseBehavior and
TestJavaRunAndParsePerformance at the class level. The performance
test was failing on Windows with a flaky 10ms timing assertion
(10.515ms actual, 5% tolerance) — pre-existing issue masked by
continue-on-error.
Same fix as test_comparator.py: use _find_comparator_jar() to skip
when the codeflash-runtime JAR isn't built. Fixes unit-tests on Windows
runners, which don't have Java pre-installed (unlike Linux runners).
Ubuntu runners have Java/Maven pre-installed, so checking for java/mvn
binaries doesn't skip. The actual dependency is the codeflash-runtime
JAR which must be built from codeflash-java-runtime/ via Maven.
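
A rough sketch of a JAR-based check, for illustration only. The helper
find_runtime_jar, the target/ location, and the JAR file name pattern are
assumptions; the real _find_comparator_jar() may search differently.

```python
from pathlib import Path
from typing import Optional


def find_runtime_jar(runtime_dir: Path) -> Optional[Path]:
    """Return a built codeflash-runtime JAR under runtime_dir/target, if any.

    Hypothetical helper: the directory layout and name pattern are assumed.
    """
    target = runtime_dir / "target"
    if not target.is_dir():
        return None
    jars = sorted(target.glob("codeflash-runtime*.jar"))
    return jars[0] if jars else None


# A skip marker could then depend on the JAR rather than on java/mvn binaries:
#   requires_java_runtime = pytest.mark.skipif(
#       find_runtime_jar(Path("codeflash-java-runtime")) is None,
#       reason="codeflash-runtime JAR not built",
#   )
```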
The requires_java marker only checked for the java binary, but the tests
also need mvn to build the codeflash-runtime JAR. These 13 tests
were silently failing in unit-tests (masked by continue-on-error).
Normalize paths to forward slashes in JS/TS code generation and coverage
parsing — backslashes are escape chars in JavaScript strings and cause
silent corruption on Windows. Also relax timing test thresholds for CI.
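
A minimal sketch of the normalization, assuming hypothetical helper names
(to_js_path, js_require). It uses a plain string replace so that a
leading "./" in a relative specifier is preserved.

```python
import json


def to_js_path(path: str) -> str:
    """Return the path with forward slashes so it is safe inside JS strings."""
    return path.replace("\\", "/")


def js_require(path: str) -> str:
    """Emit a require() call with the path as a quoted JS string literal.

    Hypothetical helper for illustration; json.dumps handles the quoting.
    """
    return f"require({json.dumps(to_js_path(path))})"
```

Without the normalization, a Windows path such as C:\new\test.js embedded
in generated JavaScript would have \n and \t read as newline and tab.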
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The regex for extracting modules from settings.gradle only matched
single-line include statements. Multi-line includes like eureka's
(include 'a',\n 'b',\n 'c') only captured the first module, causing
test_module to be None and breaking multi-module path resolution
(e.g., classfiles lookup for JaCoCo coverage conversion).
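
One way to match multi-line includes is sketched below. The patterns and
the function name parse_included_modules are illustrative assumptions,
not the regex used in the fix.

```python
import re
from typing import List

# "include" as a whole word, then a comma-separated run of quoted names.
# \s* also matches newlines, so the run may span several lines.
_INCLUDE_RE = re.compile(r"""\binclude\b\s*\(?\s*((?:['"][^'"]+['"]\s*,?\s*)+)""")
_QUOTED_RE = re.compile(r"""['"]([^'"]+)['"]""")


def parse_included_modules(settings_text: str) -> List[str]:
    """Collect every module name from include statements (hypothetical helper)."""
    modules: List[str] = []
    for match in _INCLUDE_RE.finditer(settings_text):
        modules.extend(_QUOTED_RE.findall(match.group(1)))
    return modules
```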
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Generalize _find_top_level_dependencies_block() into _find_top_level_block(name)
so it can find any top-level block (dependencies, repositories, etc.)
- Rewrite _ensure_maven_central_repo() to use tree-sitter instead of regex,
preventing false matches inside buildscript/subprojects/allprojects blocks
- Add _update_existing_codeflash_dependency() to replace stale versions or
old files() format with the current Maven Central coordinate
- Wire version update into add_codeflash_dependency() and
add_codeflash_dependency_multimodule() so old entries get updated instead
of silently skipped
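
To illustrate what "top-level" means here, the sketch below uses a simple
brace-depth scan. This is a deliberately simplified stand-in, not the
tree-sitter implementation described above: it ignores braces inside
strings and comments, and the function name is hypothetical.

```python
import re
from typing import Optional, Tuple


def find_top_level_block(build_text: str, name: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the first `name { ... }` at brace depth 0.

    Simplified brace counting; does not understand strings or comments.
    """
    opener = re.compile(re.escape(name) + r"\s*\{")
    depth = 0
    i = 0
    while i < len(build_text):
        ch = build_text[i]
        prev = build_text[i - 1] if i > 0 else ""
        prev_is_word = prev.isalnum() or prev == "_"
        match = opener.match(build_text, i) if depth == 0 and not prev_is_word else None
        if match:
            # Walk forward to the brace that closes this block.
            inner_depth = 1
            j = match.end()
            while j < len(build_text) and inner_depth > 0:
                if build_text[j] == "{":
                    inner_depth += 1
                elif build_text[j] == "}":
                    inner_depth -= 1
                j += 1
            return (i, j) if inner_depth == 0 else None
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        i += 1
    return None
```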
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Parse listOf(...) patterns in settings.gradle.kts for projects that
build include lists dynamically (e.g. OpenRewrite)
- Use word boundary in include regex to avoid matching variable names
like 'includedProjects'
- Break module voting ties using codeflash.toml module-root config,
so the function's own module is preferred over cross-module tests
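
The tie-breaking rule in the last bullet can be sketched as follows. The
function pick_module and its inputs are hypothetical; how votes are
actually collected is not shown here.

```python
from collections import Counter
from typing import Iterable, Optional


def pick_module(
    votes: Iterable[str], configured_module: Optional[str] = None
) -> Optional[str]:
    """Pick the most-voted module, preferring the configured one on a tie.

    Hypothetical helper: configured_module stands in for the module-root
    value from codeflash.toml.
    """
    counts = Counter(votes)
    if not counts:
        return None
    top = max(counts.values())
    tied = [module for module, count in counts.items() if count == top]
    if len(tied) > 1 and configured_module in tied:
        return configured_module
    # No tie, or the configured module is not among the leaders.
    return tied[0]
```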
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove redundant `import re` inside _is_vitest_workspace() since re is already imported at module level
- Convert tests to use pytest tmp_path fixture instead of tempfile.TemporaryDirectory()
- Add missing return type annotations and encoding= parameters
- Remove unused pytest import and docstrings
Co-authored-by: mohammed ahmed <undefined@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
**Problem**: When a project uses `jest.config.ts` (TypeScript config), the
generated runtime config tries to `require('./jest.config.ts')`, which fails
because Node.js CommonJS cannot parse TypeScript syntax without compilation.
**Error**: `SyntaxError: Missing initializer in const declaration` at the
TypeScript type annotation (e.g., `const config: Config = ...`).
**Impact**: Affected 18 out of 38 optimization runs (~47%) in initial testing.
All TypeScript projects using `jest.config.ts` were unable to run tests.
**Root Cause**: Line 386 in test_runner.py used `base_config_path.name`
directly without checking the extension. The generated runtime config is
always a `.js` file, so it cannot use `require()` on `.ts` files.
**Solution**: Check if `base_config_path` is a TypeScript file (.ts). If so,
create a standalone runtime config without trying to extend it via require().
Jest will still discover and use the original TypeScript config naturally.
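
A minimal sketch of the extension check, assuming a hypothetical
render_runtime_config helper. The real generator in test_runner.py emits
more settings; only the branch on the `.ts` suffix reflects this fix.

```python
import json
from pathlib import Path
from typing import Any, Dict


def render_runtime_config(base_config_path: Path, overrides: Dict[str, Any]) -> str:
    """Return the JS source for a runtime Jest config (hypothetical helper)."""
    overrides_js = json.dumps(overrides)
    if base_config_path.suffix == ".ts":
        # A .js runtime config cannot require() TypeScript, so stand alone.
        return f"module.exports = {overrides_js};\n"
    # JavaScript base configs can still be extended via require().
    return (
        f"const base = require('./{base_config_path.name}');\n"
        f"module.exports = {{ ...base, ...{overrides_js} }};\n"
    )
```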
**Testing**:
- Added comprehensive test in test_jest_typescript_config_bug.py
- Test creates a realistic TypeScript Jest config and verifies the generated
runtime config loads without syntax errors
- Existing 34 JavaScript test runner tests still pass
- No linting/type errors from `uv run prek`
**Trace IDs affected**: 0fd176bf-5c7f-4f41-8396-77c46be86412 and 17 others
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>