Commit graph

1595 commits

Author SHA1 Message Date
mashraf-222
67cf123929
Merge pull request #2064 from codeflash-ai/fix/tracer-subprocess-exit-codes
fix: check subprocess exit codes in Java tracer
2026-04-21 15:35:46 +02:00
mashraf-222
ef535b8834
Merge pull request #2065 from codeflash-ai/fix/gradle-configure-on-demand
fix: add --configure-on-demand to all Gradle commands
2026-04-21 03:44:10 +02:00
Mohamed Ashraf
a4473c3684 merge: resolve conflict with main — adapt exit-code handling to combined invocation
Keep the combined JFR + tracing agent single JVM invocation from main while
preserving the fix's intent: raise when trace-db was not created, warn when
exit code is non-zero but trace-db exists. Integration tests rewritten to
match the combined-invocation semantics.
2026-04-21 01:40:26 +00:00
Kevin Turcios
4d4cb5f517
Merge pull request #2059 from codeflash-ai/refactor/benchmarks-to-dotcodeflash
Move benchmarks to .codeflash/benchmarks/
2026-04-13 05:06:00 -05:00
Mohamed Ashraf
a7371b55ca fix: add --configure-on-demand to all Gradle commands
Gradle evaluates all project configurations during the configuration
phase, even when only one module is targeted. Multi-module projects with
diverse toolchain requirements (e.g., OpenRewrite's rewrite-gradle needs
JDK 8) fail when an unrelated module's toolchain isn't available.

Adds --configure-on-demand to all 8 Gradle command construction sites
so Gradle only configures projects needed for the requested task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 21:46:42 +00:00
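The change described in a7371b55ca amounts to adding one flag wherever a Gradle command line is assembled. A minimal sketch, assuming a single helper that builds the argument list; `build_gradle_command` and the task name are hypothetical and not taken from the codeflash source, while `--configure-on-demand` is the Gradle flag named in the commit:

```python
from __future__ import annotations


def build_gradle_command(
    gradle_executable: str,
    tasks: list[str],
    extra_args: list[str] | None = None,
) -> list[str]:
    """Build a Gradle argv list that always carries --configure-on-demand.

    Hypothetical helper: the real code has 8 separate construction sites.
    """
    command = [gradle_executable, "--configure-on-demand"]
    command.extend(tasks)
    if extra_args:
        command.extend(extra_args)
    return command


# ":some-module:test" is an illustrative task path, not a real project task.
command = build_gradle_command("./gradlew", [":some-module:test"], ["--quiet"])
print(command)
```

Routing every call through one helper like this is one way to keep the flag from being forgotten at a new call site; the commit itself patches each site directly.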
Mohamed Ashraf
470482e824 fix: check subprocess exit codes in Java tracer
_run_java_with_graceful_timeout() discarded the subprocess exit code in
both the no-timeout and timeout paths. If Maven/Gradle failed (compilation
error, OOM, etc.), the tracer silently continued with missing/stale data.

Now returns the exit code. Stage 1 (JFR profiling) warns on failure but
continues. Stage 2 (argument capture) raises RuntimeError since trace
data is essential for replay test generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 21:46:11 +00:00
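The policy described in 470482e824 (return the exit code, warn in stage 1, raise in stage 2) could look roughly like the sketch below. All function names are hypothetical, the timeout handling of the real `_run_java_with_graceful_timeout()` is omitted, and a Python child process stands in for the Maven/Gradle invocation:

```python
from __future__ import annotations

import logging
import subprocess
import sys

logger = logging.getLogger("tracer_sketch")


def run_and_return_exit_code(command: list[str]) -> int:
    """Run a command and hand its exit code back to the caller."""
    completed = subprocess.run(command, check=False)
    return completed.returncode


def run_profiling_stage(command: list[str]) -> int:
    """Stage 1 (profiling): a failure is logged, but the pipeline continues."""
    exit_code = run_and_return_exit_code(command)
    if exit_code != 0:
        logger.warning("profiling stage exited with code %d; continuing", exit_code)
    return exit_code


def run_capture_stage(command: list[str]) -> int:
    """Stage 2 (argument capture): a failure is fatal, trace data is required."""
    exit_code = run_and_return_exit_code(command)
    if exit_code != 0:
        raise RuntimeError(f"argument capture failed with exit code {exit_code}")
    return exit_code


# Stand-in for a failing build-tool invocation.
failing_command = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(run_profiling_stage(failing_command))
```

The later merge commit a4473c3684 adapts this policy to a single combined invocation, keyed on whether the trace database was created, so this sketch reflects only the original two-stage version.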
Kevin Turcios
b737f71e46 fix: update test assertions to match simplified Workload fixture
The Workload.java fixture was trimmed to only repeatString but test
files still asserted computeSum, filterEvens, and instanceMethod.
2026-04-10 16:05:27 -05:00
Kevin Turcios
5c778dfad4 perf: trim tracer E2E workload to single function (repeatString)
Keep only repeatString which reliably produces 284% improvement.
Drop computeSum (marginal 16%), filterEvens and instanceMethod (no
optimization found). Reduces tracer E2E from ~1h27m to ~21m.
2026-04-10 15:08:03 -05:00
Kevin Turcios
ec14860d29 Move benchmarks to .codeflash/benchmarks/ and auto-discover
Move codeflash's own benchmarks to .codeflash/benchmarks/. Add
auto-discovery of .codeflash/benchmarks/ in codeflash compare and
benchmark mode -- when benchmarks-root is not explicitly configured,
the CLI checks for .codeflash/benchmarks/ before erroring.

Backwards compatible: users with existing benchmarks-root config
are unaffected. Docs continue to show tests/benchmarks as the
example path.
2026-04-10 08:39:15 -05:00
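The lookup order described in ec14860d29 (explicit benchmarks-root first, then `.codeflash/benchmarks/`, then an error) can be sketched as follows. `resolve_benchmarks_root` and its error message are hypothetical; only the directory name and the precedence come from the commit message:

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_benchmarks_root(project_root: Path, configured_root: Path | None = None) -> Path:
    """Pick the benchmarks directory (hypothetical helper).

    An explicitly configured root always wins, so existing configs are unaffected.
    """
    if configured_root is not None:
        return configured_root
    default_root = project_root / ".codeflash" / "benchmarks"
    if default_root.is_dir():
        return default_root
    raise ValueError("benchmarks-root is not configured and .codeflash/benchmarks/ does not exist")


with tempfile.TemporaryDirectory() as tmp_dir:
    project_root = Path(tmp_dir)
    (project_root / ".codeflash" / "benchmarks").mkdir(parents=True)
    discovered = resolve_benchmarks_root(project_root)
    explicit = resolve_benchmarks_root(project_root, project_root / "tests" / "benchmarks")
    discovered_relative = discovered.relative_to(project_root).as_posix()
    explicit_relative = explicit.relative_to(project_root).as_posix()

print(discovered_relative, explicit_relative)
```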
Kevin Turcios
151df774a4 perf: use --effort low for java-tracer E2E to reduce CI time
2026-04-10 08:29:46 -05:00
Kevin Turcios
01e22152c7 flexing
2026-04-10 05:07:53 -05:00
Kevin Turcios
e81f25f825 fix: remove stale repeatString assertions from integration tests
repeatString was removed from Workload.java in the E2E reduction.
2026-04-10 05:05:17 -05:00
Kevin Turcios
0772398c59 perf: optimize Java tracing agent serialization and writes
- Reuse ThreadLocal Kryo Output buffers (eliminates #1 allocation hotspot)
- Fast-path inline serialization for safe arg types (bypasses executor)
- Skip verification roundtrip for known-safe containers (ArrayList, HashMap, etc.)
- Batch SQLite inserts (256/txn) with permanent autocommit-off
- Switch to ArrayBlockingQueue (no per-element Node allocation)
- Add opt-in in-memory SQLite mode (VACUUM INTO at shutdown), enabled in CI
- Add timing instrumentation (onEntry, serialization, writes, dump)
- Add ProfilingWorkload fixture for benchmarking

Benchmark (50k captures): onEntry 5200ms→1200ms (4.3x), avg/capture
0.43ms→0.02ms (21x), writes 3200ms→900ms (3.5x) with in-memory mode.
2026-04-10 04:55:36 -05:00
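Of the optimizations listed in 0772398c59, the batched inserts are the easiest to illustrate in isolation. The agent itself is Java; the sketch below shows the same idea (one commit per 256 rows instead of one per row) with Python's `sqlite3` module. The table layout and the function name are assumptions made for illustration:

```python
import sqlite3

BATCH_SIZE = 256  # rows per transaction, as stated in the commit message


def write_captures(connection: sqlite3.Connection, captures: list) -> int:
    """Insert rows, committing once per BATCH_SIZE rows; returns the commit count."""
    commits = 0
    pending = 0
    for row in captures:
        # sqlite3 opens a transaction implicitly before the first INSERT.
        connection.execute("INSERT INTO captures (function_name, args) VALUES (?, ?)", row)
        pending += 1
        if pending == BATCH_SIZE:
            connection.commit()
            commits += 1
            pending = 0
    if pending:
        connection.commit()  # flush the final partial batch
        commits += 1
    return commits


connection = sqlite3.connect(":memory:")
# Hypothetical schema; the real trace database layout is not shown in the log.
connection.execute("CREATE TABLE captures (function_name TEXT, args BLOB)")
rows = [("repeatString", b"\x00") for _ in range(600)]
commit_count = write_captures(connection, rows)
row_count = connection.execute("SELECT COUNT(*) FROM captures").fetchone()[0]
print(commit_count, row_count)
```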
Kevin Turcios
08aa94c54a perf: reduce java-tracer E2E to single function for ~11 min target
Drop repeatString from the Workload fixture (2→1 function).
computeSum alone exercises the full tracer→optimizer pipeline
(trace → replay tests → optimize → evaluate → rank → explain → review).
The second function added no additional pipeline coverage.
2026-04-10 03:44:54 -05:00
Kevin Turcios
46957e190f fix: update java tracer unit tests for reduced Workload fixture
Remove assertions for filterEvens and instanceMethod which were removed
from the Workload fixture. Adjust expected invocation counts accordingly.
2026-04-10 03:17:46 -05:00
Kevin Turcios
2b0f633c0f perf: reduce java-tracer E2E from ~75 min to ~15 min
Remove filterEvens and instanceMethod from the Workload fixture (4→2
functions) and reduce main() loop from 1000→100 rounds. The E2E test
only needs to verify the tracer→optimizer pipeline works end-to-end;
it doesn't need 4 functions or 1604 replay tests to prove that.

Expected impact: ~2 functions × ~8 candidates × fewer replay tests
should bring the job from ~75 min down to ~10-15 min.
2026-04-10 03:04:29 -05:00
Kevin Turcios
381d1319ea fix: specify utf-8 encoding in benchmark read_text for Windows CI
Windows defaults to cp1252 which can't decode some source file bytes.
2026-04-10 01:48:31 -05:00
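The failure mode behind 381d1319ea is reproducible on any platform by decoding UTF-8 bytes as cp1252 explicitly. A minimal sketch; the file name and contents are illustrative:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    source_file = Path(tmp_dir) / "sample.py"
    # "Á" encodes to the bytes C3 81 in UTF-8; byte 0x81 is undefined in cp1252.
    source_file.write_bytes("Á".encode("utf-8"))

    try:
        source_file.read_text(encoding="cp1252")  # what a Windows default would do
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

    # Passing the encoding explicitly makes the read platform-independent.
    text = source_file.read_text(encoding="utf-8")

print(cp1252_failed, text)
```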
Kevin Turcios
5a5b6e46ac bench: add dedicated comparator microbenchmark for frozenset fast-path
5 scenarios: primitives, nested dicts, DB rows, deep nesting,
and identity types (frozenset/range/complex/Decimal/OrderedDict).
2026-04-10 01:05:02 -05:00
Kevin Turcios
accbab4a16 fix: update test_cmd_auth patches for deferred imports
Imports in cmd_auth.py were moved into function bodies, so mock
patches must target the source modules instead of cmd_auth's namespace.
2026-04-10 00:36:02 -05:00
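The reason accbab4a16 has to retarget its patches is that a deferred import looks the name up in the source module on every call, so the importing module never holds an attribute of its own to patch. A self-contained sketch using an in-memory stand-in module; `auth_source_sketch`, `fetch_token`, and `login` are hypothetical names, not those used in cmd_auth.py:

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for the module that owns the real function.
source_module = types.ModuleType("auth_source_sketch")
source_module.fetch_token = lambda: "real-token"
sys.modules["auth_source_sketch"] = source_module


def login() -> str:
    # Deferred import: resolved against the source module at call time.
    from auth_source_sketch import fetch_token

    return fetch_token()


# Patching the source module works because login() reads the attribute from it.
with patch("auth_source_sketch.fetch_token", return_value="fake-token"):
    patched_result = login()

unpatched_result = login()
print(patched_result, unpatched_result)
```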
Kevin Turcios
2e2e19f7ae bench: add libcst visitor benchmarks for multi-file and full pipeline
- test_benchmark_libcst_multi_file: discover_functions + get_code_optimization_context across 10 real source files
- test_benchmark_libcst_pipeline: full discover → extract → replace → merge pipeline on one file
2026-04-10 00:21:45 -05:00
Kevin Turcios
1a25f05e14 fix: remove unnecessary Optimizer from benchmark test
The test only needs project_root, not a full Optimizer (which requires
an API key). Also adds missing __init__.py to tests/benchmarks/.
2026-04-10 00:10:36 -05:00
Kevin Turcios
da536db8a2 Clean up Java test skip markers
- Remove dead `import shutil` from test_comparator.py
- Rename `requires_java` → `requires_java_runtime` for consistency with test_run_and_parse.py
- Remove redundant `@requires_java_runtime` on test_behavior_return_value_correctness (class already has it)
2026-04-09 22:22:39 -05:00
Kevin Turcios
3f53309847
Merge branch 'main' into fix/gradle-maven-central-dependency
2026-04-09 18:13:18 -05:00
Kevin Turcios
5ff38597ef test: skip all Java integration test classes when JAR missing
Apply @requires_java_runtime to TestJavaRunAndParseBehavior and
TestJavaRunAndParsePerformance at the class level. The performance
test was failing on Windows with a flaky 10ms timing assertion
(10.515ms actual, 5% tolerance) — pre-existing issue masked by
continue-on-error.
2026-04-09 16:01:53 -05:00
Kevin Turcios
78372bfbfb test: skip test_behavior_return_value_correctness when JAR missing
Same fix as test_comparator.py — uses _find_comparator_jar() to skip
when the codeflash-runtime JAR isn't built. Fixes Windows unit-tests
which don't have Java pre-installed (unlike Linux runners).
2026-04-09 15:47:10 -05:00
Kevin Turcios
e5a18feb61 test: fix requires_java to check for runtime JAR, not just binaries
Ubuntu runners have Java/Maven pre-installed, so checking for java/mvn
binaries doesn't skip. The actual dependency is the codeflash-runtime
JAR which must be built from codeflash-java-runtime/ via Maven.
2026-04-09 12:19:16 -05:00
Kevin Turcios
be446cd8de test: skip Java comparator tests when Maven is unavailable
The requires_java marker only checked for java binary but the tests
also need mvn to build the codeflash-runtime JAR. These 13 tests
were silently failing in unit-tests (masked by continue-on-error).
2026-04-09 12:06:26 -05:00
Mohamed Ashraf
ebd72acb18 merge: resolve conflict with main in test_build_tools.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 15:07:17 +00:00
HeshamHM28
f5777947c6 Merge remote-tracking branch 'origin/main' into cf-java-void-optimization
2026-04-09 08:15:53 +00:00
Aseem Saxena
a958f3182b
Merge pull request #1856 from codeflash-ai/fix/structured-error-output-subagent-mode
fix: output structured XML errors in subagent mode
2026-04-08 12:48:18 -07:00
Mohamed Ashraf
8961b14d6f fix: update test assertion to match POSIX-normalized paths in Jest config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 12:12:26 +00:00
Mohamed Ashraf
4c70a21294 fix: resolve Windows CI failures from path separator mismatches
Normalize paths to forward slashes in JS/TS code generation and coverage
parsing — backslashes are escape chars in JavaScript strings and cause
silent corruption on Windows. Also relax timing test thresholds for CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 00:15:40 +00:00
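The normalization described in 4c70a21294 can be sketched with `pathlib`, which converts backslashes to forward slashes without touching the rest of the path. The helper name and the generated `require` line are assumptions for illustration, not the actual code-generation output:

```python
from pathlib import PureWindowsPath


def to_js_string_path(path: str) -> str:
    """Return the path with forward slashes so it is safe inside a JS string literal."""
    return PureWindowsPath(path).as_posix()


windows_path = r"C:\repo\src\new\utils.ts"
# Left as-is, "\n" and "\u" in this path would be read as JS escape sequences.
require_line = f"const mod = require('{to_js_string_path(windows_path)}');"
print(require_line)
```

`PureWindowsPath` is used rather than `Path` so the conversion behaves the same on every host OS; paths that already use forward slashes pass through unchanged.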
Mohamed Ashraf
217544f99e fix: handle multi-line include directives in settings.gradle
The regex for extracting modules from settings.gradle only matched
single-line include statements. Multi-line includes like eureka's
(include 'a',\n 'b',\n 'c') only captured the first module, causing
test_module to be None and breaking multi-module path resolution
(e.g., classfiles lookup for JaCoCo coverage conversion).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 15:03:32 +00:00
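A regex that follows an `include` statement across line breaks, as 217544f99e describes, could look like the sketch below. The pattern is an assumption written for this example, not the one used in the fix, and it handles only quoted module names separated by commas:

```python
import re

# "include", then one or more quoted names; \s* lets a name start on a new line.
INCLUDE_STATEMENT = re.compile(
    r"""^[ \t]*include\b[ \t]*\(?((?:\s*['"][^'"]+['"][ \t]*,?)+)""",
    re.MULTILINE,
)
QUOTED_NAME = re.compile(r"""['"]([^'"]+)['"]""")


def extract_modules(settings_text: str) -> list:
    """Collect every module name from single-line and multi-line includes."""
    modules = []
    for statement in INCLUDE_STATEMENT.finditer(settings_text):
        modules.extend(QUOTED_NAME.findall(statement.group(1)))
    return modules


settings_text = "rootProject.name = 'demo'\ninclude 'a',\n        'b',\n        'c'\ninclude 'd'\n"
modules = extract_modules(settings_text)
print(modules)
```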
Mohamed Ashraf
0ab4800f74 fix: use tree-sitter for Gradle repositories block and add version update logic
- Generalize _find_top_level_dependencies_block() into _find_top_level_block(name)
  so it can find any top-level block (dependencies, repositories, etc.)
- Rewrite _ensure_maven_central_repo() to use tree-sitter instead of regex,
  preventing false matches inside buildscript/subprojects/allprojects blocks
- Add _update_existing_codeflash_dependency() to replace stale versions or
  old files() format with the current Maven Central coordinate
- Wire version update into add_codeflash_dependency() and
  add_codeflash_dependency_multimodule() so old entries get updated instead
  of silently skipped

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 14:46:37 +00:00
HeshamHM28
1fde200bc4 fix: improve multi-module Gradle detection for dynamic settings.gradle.kts
- Parse listOf(...) patterns in settings.gradle.kts for projects that
  build include lists dynamically (e.g. OpenRewrite)
- Use word boundary in include regex to avoid matching variable names
  like 'includedProjects'
- Break module voting ties using codeflash.toml module-root config,
  so the function's own module is preferred over cross-module tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:08:16 +00:00
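Two of the points in 1fde200bc4, the word boundary and the `listOf(...)` parsing, can be shown on a small settings.gradle.kts fragment. The patterns and module names below are illustrative assumptions, not the ones in the codeflash source:

```python
import re

LOOSE_INCLUDE = re.compile(r"include")
STRICT_INCLUDE = re.compile(r"\binclude\b")  # does not match inside "includedProjects"
LIST_OF = re.compile(r"listOf\s*\(([^)]*)\)")
QUOTED_NAME = re.compile(r'"([^"]+)"')

# Hypothetical settings.gradle.kts that builds its include list dynamically.
settings_text = (
    "val includedProjects = listOf(\n"
    '    "module-a",\n'
    '    "module-b"\n'
    ")\n"
    "include(*includedProjects.toTypedArray())\n"
)

loose_hits = len(LOOSE_INCLUDE.findall(settings_text))
strict_hits = len(STRICT_INCLUDE.findall(settings_text))

list_match = LIST_OF.search(settings_text)
modules = QUOTED_NAME.findall(list_match.group(1)) if list_match else []
print(loose_hits, strict_hits, modules)
```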
Mohamed Ashraf
e30bdd6748 Merge remote-tracking branch 'origin/main' into cf-1080-spotless-skip
2026-04-06 16:18:05 +00:00
Sarthak Agarwal
21249265cf
Merge pull request #1988 from codeflash-ai/fix/vitest-coverage-override
Fix Vitest coverage collection by overriding coverage.reporter
2026-04-04 16:32:04 +05:30
Sarthak Agarwal
81be416043
Merge pull request #1991 from codeflash-ai/fix/verifier-path-validation
Fix: Handle test paths outside tests_root in verifier.py
2026-04-04 16:31:52 +05:30
Sarthak Agarwal
c0942b162b
Merge pull request #1992 from codeflash-ai/fix/typescript-jest-config-require
Fix Jest runtime config failing to load TypeScript base configs
2026-04-04 16:31:30 +05:30
Sarthak Agarwal
755d0f24fd
Merge pull request #1990 from codeflash-ai/fix/coverage-utils-framework-agnostic-messages
Fix: Make coverage error messages framework-agnostic
2026-04-04 16:31:16 +05:30
claude[bot]
d8c2b94359 style: remove redundant local import re and fix test conventions
- Remove redundant `import re` inside _is_vitest_workspace() since re is already imported at module level
- Convert tests to use pytest tmp_path fixture instead of tempfile.TemporaryDirectory()
- Add missing return type annotations and encoding= parameters
- Remove unused pytest import and docstrings

Co-authored-by: mohammed ahmed <undefined@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 07:56:07 +00:00
claude[bot]
ba0d2bc9a3 style: add missing -> None return type annotations to test methods
2026-04-04 07:52:21 +00:00
mohammed ahmed
08b9fe8d7f
Merge branch 'main' into fix/vitest-coverage-override
2026-04-04 09:51:41 +02:00
mohammed ahmed
cd1387ff7a
Merge branch 'main' into fix/verifier-path-validation
2026-04-04 09:49:18 +02:00
Sarthak Agarwal
973ebc2cf1
Merge pull request #1979 from codeflash-ai/fix/colocated-test-path-resolution
fix: handle co-located test directories with traverse_up
2026-04-04 12:04:00 +05:30
Sarthak Agarwal
0f2c50c239
Merge pull request #1982 from codeflash-ai/fix/vitest-mock-path-resolution
Fix vi.mock() path resolution in generated vitest tests
2026-04-04 12:03:45 +05:30
Sarthak Agarwal
c63defa2b2
Merge pull request #1984 from codeflash-ai/fix/js-project-root-per-function
Fix: Recalculate js_project_root per function in monorepos
2026-04-04 12:03:29 +05:30
mohammed ahmed
8d1c5e8108 Fix Jest runtime config failing to load TypeScript base configs
**Problem**: When a project uses `jest.config.ts` (TypeScript config), the
generated runtime config tries to `require('./jest.config.ts')`, which fails
because Node.js CommonJS cannot parse TypeScript syntax without compilation.

**Error**: `SyntaxError: Missing initializer in const declaration` at the
TypeScript type annotation (e.g., `const config: Config = ...`).

**Impact**: Affected 18 out of 38 optimization runs (~47%) in initial testing.
All TypeScript projects using `jest.config.ts` were unable to run tests.

**Root Cause**: Line 386 in test_runner.py used `base_config_path.name`
directly without checking the extension. The generated runtime config is
always a `.js` file, so it cannot use `require()` on `.ts` files.

**Solution**: Check if `base_config_path` is a TypeScript file (.ts). If so,
create a standalone runtime config without trying to extend it via require().
Jest will still discover and use the original TypeScript config naturally.

**Testing**:
- Added comprehensive test in test_jest_typescript_config_bug.py
- Test creates a realistic TypeScript Jest config and verifies the generated
  runtime config loads without syntax errors
- Existing 34 JavaScript test runner tests still pass
- No linting/type errors from `uv run prek`

**Trace IDs affected**: 0fd176bf-5c7f-4f41-8396-77c46be86412 and 17 others

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-04 05:12:00 +00:00
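The extension check described in 8d1c5e8108 can be sketched as a small generator function. `build_runtime_config` and the emitted JavaScript are assumptions for illustration; the actual output of test_runner.py is not shown in the log:

```python
import json
from pathlib import PurePosixPath


def build_runtime_config(base_config_path: PurePosixPath, test_match: list) -> str:
    """Return the text of a generated .js runtime config (hypothetical shape)."""
    overrides = f"testMatch: {json.dumps(test_match)}"
    if base_config_path.suffix == ".ts":
        # A .js file cannot require() TypeScript, so emit a standalone config.
        return f"module.exports = {{ {overrides} }};\n"
    return (
        f"const baseConfig = require('./{base_config_path.name}');\n"
        f"module.exports = {{ ...baseConfig, {overrides} }};\n"
    )


ts_config = build_runtime_config(PurePosixPath("/project/jest.config.ts"), ["**/*.test.ts"])
js_config = build_runtime_config(PurePosixPath("/project/jest.config.js"), ["**/*.test.js"])
print(ts_config)
print(js_config)
```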
mohammed ahmed
218f3b5014 Fix test path validation error for JS/TS tests outside tests_root
When JavaScript/TypeScript support generates test files in __tests__
subdirectories adjacent to source files (e.g., src/foo/__tests__/codeflash-generated/),
these test files are not within the configured tests_project_rootdir.

Previously, verifier.py:37 called module_name_from_file_path() without
handling the ValueError that occurs when the test path is outside tests_root,
causing optimization runs to crash.

This fix adds try-except handling with a fallback to using just the filename,
matching the pattern already used in javascript/parse.py:330-333.

Fixes trace ID: 84f5467f-8acf-427f-b468-02cb3342097e

Changes:
- codeflash/verification/verifier.py:37-48: Added try-except for path computation
- tests/verification/test_verifier_path_handling.py: Added unit tests

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-04 02:39:52 +00:00
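The try-except fallback described in 218f3b5014 could be sketched as below. The local `module_name_from_file_path` is a simplified stand-in for the real helper, `resolve_test_module_name` is a hypothetical wrapper, and dropping the extension in the fallback is an assumption (the commit says only "just the filename"):

```python
from pathlib import PurePosixPath


def module_name_from_file_path(file_path: PurePosixPath, root: PurePosixPath) -> str:
    """Simplified stand-in: dotted module name of file_path relative to root."""
    relative = file_path.relative_to(root)  # raises ValueError when outside root
    return ".".join(relative.with_suffix("").parts)


def resolve_test_module_name(test_path: PurePosixPath, tests_root: PurePosixPath) -> str:
    """Fall back to the bare file name when the test lives outside tests_root."""
    try:
        return module_name_from_file_path(test_path, tests_root)
    except ValueError:
        # Assumption: the extension is dropped in the fallback.
        return test_path.stem


tests_root = PurePosixPath("/repo/tests")
inside = resolve_test_module_name(PurePosixPath("/repo/tests/unit/test_a.py"), tests_root)
outside = resolve_test_module_name(
    PurePosixPath("/repo/src/foo/__tests__/codeflash-generated/foo.test.ts"), tests_root
)
print(inside, outside)
```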
mohammed ahmed
933a2602c3 Fix: Make coverage error messages framework-agnostic
Error messages in coverage_utils.py hardcoded "Jest" even when the
test framework was Vitest. This caused confusion in logs when Vitest
tests failed (e.g., "Jest coverage file not found" when using Vitest).

The JestCoverageUtils class is used for both Jest and Vitest since
they share the same Istanbul/v8 coverage format. Error messages
should be framework-agnostic.

Changes:
- "Jest coverage file not found" → "JavaScript coverage file not found"
- "Failed to parse Jest coverage file" → "Failed to parse JavaScript coverage file"
- "No coverage data found for X in Jest coverage" → "No coverage data found for X in JavaScript coverage"
- "Function X not found in Jest fnMap" → "Function X not found in JavaScript fnMap"

Affected trace IDs: 37e5a406, 735555fa, 940dfe80, c1e1de0e, dbec6c33, de96b1ab, fcf08c6b (7 logs from Apr 4 00:50 batch)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-04 00:55:03 +00:00