Allow running an arbitrary benchmark script on both git refs and
render the results as a styled comparison table. The optional --memory
flag wraps the script in memray. Script mode requires no codeflash config.
When --memory is used and no changed top-level functions are detected,
skip trace benchmarking but still run memray profiling. Previously,
codeflash compare could not profile memory for changes in class methods,
which are excluded from @codeflash_trace instrumentation due to pickle
overhead.
**Problem:**
When running Codeflash-generated tests with coverage enabled, Vitest would
fail with returncode=1 due to project-level coverage thresholds not being met.
Generated tests typically cover only a single function (~1-2% of codebase),
which fails projects with thresholds like 70% lines/functions configured in
their vitest.config.ts.
**Root Cause:**
In vitest_runner.py line 450, Codeflash added the --coverage flag without
disabling the project's global coverage thresholds. Vitest then exited
non-zero even when every test passed.
**Solution:**
Added coverage threshold override flags when coverage is enabled:
- --coverage.thresholds.lines=0
- --coverage.thresholds.functions=0
- --coverage.thresholds.statements=0
- --coverage.thresholds.branches=0
These flags disable project-level thresholds, allowing coverage collection
without failing the test run. Coverage data is still collected for analysis,
but thresholds no longer cause false failures.
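A minimal sketch of how the override flags might be appended when building the Vitest command. `build_vitest_args` and its parameters are hypothetical names for illustration, not the code in vitest_runner.py:

```python
from __future__ import annotations

# Threshold keys taken from the flags listed above.
THRESHOLD_KEYS = ("lines", "functions", "statements", "branches")


def build_vitest_args(test_files: list[str], coverage: bool) -> list[str]:
    """Build a Vitest argument list (hypothetical helper)."""
    args = ["vitest", "run", *test_files]
    if coverage:
        args.append("--coverage")
        # Zero out every project-level threshold so a passing run is not
        # reported as failed because overall coverage is low.
        args.extend(f"--coverage.thresholds.{key}=0" for key in THRESHOLD_KEYS)
    return args
```

Without coverage the argument list is unchanged, so the overrides only apply to runs that collect coverage.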
**Testing:**
- Added comprehensive unit tests in test_vitest_coverage_thresholds.py
- All 40 existing vitest-related tests pass
- Verified with uv run prek (linter + type checker)
**Related Issues:**
Trace IDs affected: 05a626f3, 932e7799, a145328d, aa9bb63f, d669202e, e6de097a
Fixes 6 out of 10 optimization failures in openclaw project.
The benchmark plugin now runs multiple rounds with calibrated
iterations. Tests need SELECT DISTINCT for row counts and must
extract median_ns from BenchmarkStats before validation.
Bug: Nested functions were discovered and selected for optimization, but
the extraction logic captured only the nested function body, without the
parent scope variables it depends on. Validation then failed with errors like:
'Undefined variable(s): base, streamFn, record, writer'
Root cause: The discover_functions method was allowing nested functions
(functions defined inside other functions) to be marked for optimization.
These nested functions depend on closure variables from their parent scope
and cannot be optimized in isolation.
Fix: Added explicit check to skip functions with parent_function set.
Nested functions are now filtered out during discovery phase.
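The check can be pictured with the sketch below. `FunctionInfo` and `filter_optimizable` are hypothetical stand-ins; only the `parent_function` attribute is taken from the description above:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FunctionInfo:
    """Hypothetical record for a discovered function."""

    name: str
    parent_function: str | None = None  # set when defined inside another function
    parent_class: str | None = None  # set for class methods


def filter_optimizable(functions: list[FunctionInfo]) -> list[FunctionInfo]:
    """Drop nested functions: they rely on closure variables from the parent scope."""
    return [fn for fn in functions if fn.parent_function is None]
```

Class methods keep `parent_function` unset in this sketch, so they still pass the filter.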
Impact: Resolves 140+ trace failures with undefined variable errors.
Functions like 'wrapStreamFn.wrapped' are no longer selected for optimization.
Test: Added test_discover_functions.py with 4 test cases:
- test_discovers_top_level_function
- test_skips_nested_functions_in_closures (main bug fix test)
- test_discovers_class_methods (ensure methods still work)
- test_skips_nested_functions_with_multiple_levels
Affects trace IDs including: 02a59310-bb18-47e4-87cb-1e5144ce2d8c
and 140+ others with nested function extraction issues.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Java docs incorrectly referenced codeflash.toml (which doesn't exist), and
several pages omitted Java even though Java support is fully implemented.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Discover void methods, instrument them by serializing the receiver instead
of a return value, and treat all-null comparisons as equivalent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add -w flag for pnpm workspace roots to avoid ERR_PNPM_ADDING_TO_ROOT
- Use local package path (/opt/codeflash/packages/codeflash) in dev mode
- Improve error logging to show actual stderr at ERROR level instead of WARNING
- Add unit tests for workspace detection and local package usage
Fixes 9/13 optimization failures caused by 'Cannot find package codeflash'
Trace IDs affected: 08d594a2, 1722cff7, 23480bf7, 3074f19b, 6043236e,
b883f1bd, d01b03ce, e56507a4, f8f54e06
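A rough sketch of the workspace-root handling, assuming a workspace root is recognised by a pnpm-workspace.yaml file. `build_pnpm_add_command` is a hypothetical helper, and installing as a dev dependency is an assumption:

```python
from __future__ import annotations

from pathlib import Path


def build_pnpm_add_command(project_root: Path, package: str = "codeflash") -> list[str]:
    """Build a `pnpm add` command (hypothetical helper)."""
    cmd = ["pnpm", "add", "--save-dev"]  # dev-dependency install is an assumption
    # pnpm refuses to add a dependency at a workspace root
    # (ERR_PNPM_ADDING_TO_ROOT) unless -w / --workspace-root is passed.
    if (project_root / "pnpm-workspace.yaml").exists():
        cmd.append("-w")
    cmd.append(package)
    return cmd
```

In dev mode the `package` argument would be the local package path instead of the registry name.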
## Problem
The inject_test_globals() function was adding duplicate framework imports,
causing parse errors like "Identifier 'describe' has already been declared".
This occurred when:
1. AI service generated tests WITH vitest imports
2. CLI called inject_test_globals() which added its own import
3. String-based duplicate check failed because the identifiers were in a different order
Result: TWO import statements declaring the same identifiers → parse error.
## Solution
Replace string-based duplicate detection with regex-based detection that
catches ANY import from the framework, regardless of identifier order.
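A minimal sketch of order-independent detection. The pattern and the `inject_globals` helper are illustrative assumptions, not the patterns added in this change:

```python
import re


def framework_import_pattern(module: str) -> re.Pattern:
    """Match any ES import from `module`, whatever the identifiers or their order."""
    return re.compile(
        r"^\s*import\s+[^;]*?\bfrom\s+['\"]" + re.escape(module) + r"['\"]",
        re.MULTILINE,
    )


def inject_globals(source: str, module: str, import_line: str) -> str:
    """Prepend import_line unless the source already imports from module."""
    if framework_import_pattern(module).search(source):
        return source
    return import_line + "\n" + source
```

Because the check looks only at the module name, `import { it, describe } from 'vitest'` and `import { describe, it } from 'vitest'` are treated the same.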
## Changes
- Added regex patterns for Vitest, Jest, and Mocha imports
- Modified inject_test_globals() to use regex search
- Added comprehensive tests in test_inject_test_globals_duplicate.py
## Impact
Fixes a HIGH severity bug that blocked test generation for all Vitest projects.
## Trace IDs
- 03a5a9d9-8e56-47e8-9c5e-0160fb9a529a
- 0be70f8d-884e-45e4-8fa2-28ed40cdf068
- 29c6d314-8561-4bb4-9b77-00b3b83943f0
Previously, if codeflash-runtime was already in a user's pom.xml
(e.g. from a prior run with 1.0.0), the dependency was left as-is.
After a CLI upgrade expecting 1.0.1, Maven would fail to resolve
the old version. Now the dependency is always updated to match
CODEFLASH_RUNTIME_VERSION, handling both version bumps and the
legacy system-scope to test-scope migration in one pass.
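A simplified sketch of the one-pass update, using regular expressions on the pom.xml text. The function name, the `<systemPath>` handling, and the constant's value here are assumptions for illustration; the real logic may work on a parsed document instead:

```python
import re

CODEFLASH_RUNTIME_VERSION = "1.0.1"  # assumed value, taken from the description above

# Matches one whole <dependency> block that declares codeflash-runtime.
# The (?!</dependency>) guard stops the match from spanning two blocks.
_DEP_RE = re.compile(
    r"<dependency>(?:(?!</dependency>).)*?"
    r"<artifactId>codeflash-runtime</artifactId>"
    r"(?:(?!</dependency>).)*?</dependency>",
    re.DOTALL,
)


def update_runtime_dependency(pom_text: str, version: str = CODEFLASH_RUNTIME_VERSION) -> str:
    """Pin codeflash-runtime to `version` and migrate system scope to test scope."""

    def fix(match: re.Match) -> str:
        block = match.group(0)
        block = re.sub(r"<version>[^<]*</version>", f"<version>{version}</version>", block)
        # Legacy system-scope entries carry a <systemPath>; drop it on migration.
        block = re.sub(r"\s*<systemPath>[^<]*</systemPath>", "", block)
        return block.replace("<scope>system</scope>", "<scope>test</scope>")

    return _DEP_RE.sub(fix, pom_text)
```

Other dependencies are left untouched because the substitution only runs inside the matched block.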
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The tracer e2e fixture and code_to_optimize/java pom.xml files had
hardcoded 1.0.0 dependency versions, causing compilation failures
in CI when only 1.0.1 is installed to ~/.m2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three places in maven_strategy.py had the version hardcoded instead of
using the constant: the dependency snippet, the install-file command,
and the system-scope replacement. This caused CI failures because the
pom.xml dependency pointed to 1.0.0 while ~/.m2 had 1.0.1.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The reviewer asked to remove the bundled JAR file, not the code
that looks for it. The fallback paths are still valid if a JAR
is placed there manually or by future tooling.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
JAR is resolved from Maven Central at runtime. The bundled copy
added 16MB of bloat and got stale between releases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update CODEFLASH_RUNTIME_VERSION, hardcoded JAR names, m2 cache path,
and bundled JAR so users resolve 1.0.1 from Maven Central.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
Generated tests for ESM TypeScript projects were importing from relative
paths without .js extensions (e.g., `import X from './module'`), causing
ERR_MODULE_NOT_FOUND errors when tests run.
Node.js ESM resolves relative imports literally and requires an explicit
file extension. TypeScript does not rewrite import specifiers, so the import
must name the emitted .js file even when the source file is .ts.
## Solution
Added `add_js_extensions_to_relative_imports()` function that:
- Adds .js extensions to relative imports (./x or ../x) without extensions
- Preserves imports that already have extensions (.js, .ts, etc.)
- Leaves non-relative imports (node modules) unchanged
- Only runs for ESM projects (CommonJS doesn't need extensions)
Integrated into test processing pipeline after module system conversion.
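The rewrite can be sketched as follows. This is an illustrative approximation (named `add_js_extensions` here to keep it distinct from the real function); it handles static `import`/`from` specifiers only and treats any trailing `.ext` as an existing extension:

```python
import re

# Relative specifier (./x or ../x) after `from` or a bare `import`.
_RELATIVE_IMPORT_RE = re.compile(
    r"""(\bfrom\s+|\bimport\s+)(['"])(\.{1,2}/[^'"]+)\2"""
)
# A trailing extension such as .js, .ts or .json counts as "already has one".
_HAS_EXTENSION_RE = re.compile(r"\.[A-Za-z0-9]+$")


def add_js_extensions(source: str) -> str:
    """Append .js to extension-less relative import specifiers."""

    def fix(match: re.Match) -> str:
        prefix, quote, spec = match.groups()
        if not _HAS_EXTENSION_RE.search(spec):
            spec += ".js"
        return f"{prefix}{quote}{spec}{quote}"

    return _RELATIVE_IMPORT_RE.sub(fix, source)
```

Package imports such as `'node:fs'` never match the relative-path pattern, so they pass through unchanged.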
## Testing
- Added 7 unit tests covering various import patterns
- All 35 module_system tests pass
- All 315 JavaScript language tests pass
- Verified fix resolves ERR_MODULE_NOT_FOUND for trace 17751b8f-fa61-48bc-bdee-b924f0c7afc4
## References
Trace IDs with this issue: 17751b8f-fa61-48bc-bdee-b924f0c7afc4,
3b985200-a906-4c54-a685-df40361d6b2c, 91795877-3ccf-482c-86bd-748834b76f6e,
0298c59c-8980-4aed-b05d-b94940a6544f, ec2864a4-0de0-4ce9-9ec8-b545c82a4f53
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Instrumented test files fail Spotless format checks on projects like
Apache Flink, Kafka, and Beam. Add -Dspotless.check.skip=true and
-Dspotless.apply.skip=true to the Maven command line, and disable the
spotlessCheck, spotlessApply, spotlessJava, spotlessKotlin, and
spotlessScala tasks in the Gradle init script.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The optimized code replaces f-string formatting (`f"[green]{pct:+.0f}%[/green]"`) with module-level `%`-format templates (`_GREEN_TPL % pct`) on the two return paths, cutting per-call time from ~746 ns to ~669 ns (green case) and from ~634 ns to ~503 ns (red case). F-strings are compiled once rather than parsed on each call, so the saving most likely comes from the cheaper formatting path: the f-string formats the value and then joins it with its literal pieces, while the `%` operator formats against a single prebuilt template. The roughly 10% overall speedup comes purely from this string-formatting change; all arithmetic and control flow remain identical.
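A small sketch for reproducing the comparison locally. The template string is inferred from the f-string shown above, the function names are hypothetical, and timings will vary by interpreter version and machine:

```python
import timeit

_GREEN_TPL = "[green]%+.0f%%[/green]"  # module-level template, built once


def fmt_fstring(pct: float) -> str:
    return f"[green]{pct:+.0f}%[/green]"


def fmt_template(pct: float) -> str:
    return _GREEN_TPL % pct


if __name__ == "__main__":
    # Both variants must produce identical output before timing them.
    assert fmt_fstring(12.4) == fmt_template(12.4)
    for fn in (fmt_fstring, fmt_template):
        seconds = timeit.timeit(lambda: fn(12.4), number=1_000_000)
        print(f"{fn.__name__}: {seconds:.3f} s per 1M calls")
```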
## Problem
Codeflash was generating import statements without file extensions for
TypeScript and ESM projects, causing ERR_MODULE_NOT_FOUND errors when
Node.js tried to resolve the modules.
Example error from trace 08d0e99e-10e6-4ad2-981d-b907e3c068ea:
```
Error [ERR_MODULE_NOT_FOUND]: Cannot find module
'/workspace/target/packages/microservices/server/server-factory'
imported from .../test_create__unit_test_0.test.ts
```
The generated test had:
```typescript
import ServerFactory from '../../server/server-factory'
```
But Node.js ESM requires explicit file extensions.
## Root Cause
The get_module_path method in JavaScriptSupport unconditionally removed
file extensions with .with_suffix(""), regardless of whether the project
used the ESM or CommonJS module system.
## Solution
Modified get_module_path to:
1. Detect the module system using detect_module_system()
2. For ESM or TypeScript files: add .js extension (TypeScript convention)
3. For CommonJS: keep no extension (backward compatible)
TypeScript convention is to use .js extension in imports even when the
source file is .ts, as imports reference the compiled output.
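A minimal sketch of the branching described above. `module_import_path`, its parameters, and the set of TypeScript suffixes are hypothetical; the real method detects the module system itself via detect_module_system():

```python
import os
from pathlib import PurePosixPath

TS_SUFFIXES = {".ts", ".tsx"}  # assumed set of TypeScript source suffixes


def module_import_path(source_file: str, test_dir: str, is_esm: bool) -> str:
    """Return the specifier a test in test_dir would use to import source_file."""
    rel = PurePosixPath(os.path.relpath(source_file, test_dir).replace(os.sep, "/"))
    # ESM and TypeScript imports name the compiled .js output;
    # CommonJS keeps the extension-less form for backward compatibility.
    needs_extension = is_esm or rel.suffix in TS_SUFFIXES
    text = str(rel.with_suffix(".js" if needs_extension else ""))
    return text if text.startswith(".") else "./" + text
```

For the failing trace above, this would produce `'../../server/server-factory.js'` rather than the extension-less path.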
## Testing
- Added two new test cases in TestGetModulePath class
- All 73 existing JavaScript support tests pass
- All 28 module system tests pass
- Lint and type checks pass
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
get_benchmark_timings now returns BenchmarkStats instead of int.
The optimizer pipeline expects float (nanoseconds), so extract
median_ns at the boundary.
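The boundary conversion might look like the sketch below. Only the `BenchmarkStats` name and its `median_ns` field come from the description above; the other fields and helper names are assumptions:

```python
from __future__ import annotations

import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkStats:
    """Hypothetical shape; only median_ns is named in the description above."""

    median_ns: float
    rounds: int

    @classmethod
    def from_samples(cls, samples_ns: list[int]) -> BenchmarkStats:
        return cls(median_ns=float(statistics.median(samples_ns)), rounds=len(samples_ns))


def timings_for_optimizer(stats_by_key: dict[str, BenchmarkStats]) -> dict[str, float]:
    """Convert at the boundary so the downstream pipeline only sees floats."""
    return {key: stats.median_ns for key, stats in stats_by_key.items()}
```

Keeping the conversion in one place means the rest of the pipeline does not need to know about the richer stats object.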