Commit graph

1611 commits

Author SHA1 Message Date
mashraf-222
33b4eb8e7a
Merge branch 'main' into cf-1085-cap-wildcard-import-expansion 2026-05-04 20:12:01 +03:00
Mohamed Ashraf
2908f76e64 fix: decode help-banner test subprocess output as UTF-8
Rich renders the banner panel with box-drawing characters (╭, ╮, │, etc.)
that cp1252 cannot decode. On Windows, subprocess.run(..., text=True) uses
cp1252 by default, so decoding the child stdout raises UnicodeDecodeError
and subprocess sets result.stdout to None — breaking the assertion with a
misleading "argument of type 'NoneType' is not iterable".

Pass encoding="utf-8" explicitly so the test passes on every platform.
2026-04-28 16:42:10 +00:00
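The decoding fix described in this commit can be sketched as follows. This is a minimal illustration, not the repository's test: `CHILD_CODE` and `run_child` are hypothetical names, and the child process just writes a couple of box-drawing characters as UTF-8 bytes.

```python
import subprocess
import sys

# Hypothetical child program: writes box-drawing characters as raw UTF-8 bytes.
# The escapes are kept as ASCII here and interpreted by the child interpreter.
CHILD_CODE = r"import sys; sys.stdout.buffer.write('\u256d help \u256e'.encode('utf-8'))"


def run_child() -> str:
    """Run the child and decode its stdout as UTF-8 regardless of the platform default."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        encoding="utf-8",  # explicit, so the locale encoding (e.g. cp1252) is never used
        check=True,
    )
    return result.stdout
```

Without the explicit `encoding` argument, the parent decodes with the locale's preferred encoding, which is where the platform-dependent failure comes from.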
Mohamed Ashraf
f02b99f8fb fix: decode help-banner test subprocess output as UTF-8
Rich renders the banner panel with box-drawing characters (╭, ╮, │, etc.)
that cp1252 cannot decode. On Windows, subprocess.run(..., text=True) uses
cp1252 by default, so decoding the child stdout raises UnicodeDecodeError
and subprocess sets result.stdout to None — breaking the assertion with a
misleading "argument of type 'NoneType' is not iterable".

Pass encoding="utf-8" explicitly so the test passes on every platform.
2026-04-28 16:42:10 +00:00
Mohamed Ashraf
f1521d7a2d fix: resolve pre-existing mypy errors on files touched by this PR
The prek mypy hook runs on changed files and bypasses the pyproject.toml
tests/ exclude, surfacing pre-existing errors in both context.py and
test_context.py that block CI for this PR. Fixes applied:

- Import Language from language_enum instead of base (base re-exports are
  not explicit; strict mypy flags attr-defined)
- Annotate _extract_class_declaration, _import_to_statement,
  get_java_imported_type_skeletons, and resolved_imports
- Guard None start/end_line in _extract_function_source_by_lines and
  find_helper_functions; guard None file_path in the import skeleton loop
- Drop unreachable `if not node: continue` in _extract_public_method_signatures
  (JavaMethodNode.node is non-nullable)
- Add -> None to every test method and fix an `int | None` comparison in
  test_context.py

All 880 Java tests pass after the change.
2026-04-28 15:40:02 +00:00
Mohamed Ashraf
efbd34159c test: annotate test_replacement.py for mypy prek hook
Add -> None return annotations and Path / JavaSupport parameter annotations
to every test method + fixture so the prek mypy hook passes when the file
is in the CI diff.
2026-04-28 15:22:42 +00:00
mashraf-222
0bf92290d2
Merge branch 'main' into cf-1087-field-injection-class-filter 2026-04-28 16:35:24 +03:00
mashraf-222
e95f701ce3
Merge branch 'main' into cf-1085-cap-wildcard-import-expansion 2026-04-28 16:35:09 +03:00
Aseem Saxena
db5b96a80c
Merge branch 'main' into feat/show-logo-on-help 2026-04-23 04:10:15 -07:00
Kevin Turcios
892bff485d feat(js): add JavaScript function tracer with Babel instrumentation
Replaces source-level JavaScript function tracing with Babel AST
transformation via babel-tracer-plugin.js and trace-runner.js. Adds
replay test generation, Python-side tracer runner, and --language
flag to the tracer CLI for explicit JS/TS routing.
2026-04-23 04:33:58 -05:00
mashraf-222
67cf123929
Merge pull request #2064 from codeflash-ai/fix/tracer-subprocess-exit-codes
fix: check subprocess exit codes in Java tracer
2026-04-21 15:35:46 +02:00
mashraf-222
ef535b8834
Merge pull request #2065 from codeflash-ai/fix/gradle-configure-on-demand
fix: add --configure-on-demand to all Gradle commands
2026-04-21 03:44:10 +02:00
Mohamed Ashraf
a4473c3684 merge: resolve conflict with main — adapt exit-code handling to combined invocation
Keep the combined JFR + tracing agent single JVM invocation from main while
preserving the fix's intent: raise when trace-db was not created, warn when
exit code is non-zero but trace-db exists. Integration tests rewritten to
match the combined-invocation semantics.
2026-04-21 01:40:26 +00:00
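The rule stated in this merge commit (raise when the trace database is missing, warn when the exit code is non-zero but the database exists) might look roughly like the sketch below. The function name `check_trace_result` and its parameters are assumptions made for illustration; the real tracer code is not shown in this log.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def check_trace_result(exit_code: int, trace_db: Path) -> None:
    """Hypothetical post-run check for a combined JFR + tracing-agent invocation."""
    if not trace_db.exists():
        # No trace data at all: replay tests cannot be generated, so fail loudly.
        raise RuntimeError(
            f"trace database was not created at {trace_db} (exit code {exit_code})"
        )
    if exit_code != 0:
        # Data exists, so continue, but surface the failure to the user.
        logger.warning(
            "JVM exited with code %d; continuing with trace data in %s",
            exit_code,
            trace_db,
        )
```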
Kevin Turcios
4d4cb5f517
Merge pull request #2059 from codeflash-ai/refactor/benchmarks-to-dotcodeflash
Move benchmarks to .codeflash/benchmarks/
2026-04-13 05:06:00 -05:00
Mohamed Ashraf
a7371b55ca fix: add --configure-on-demand to all Gradle commands
Gradle evaluates all project configurations during the configuration
phase, even when only one module is targeted. Multi-module projects with
diverse toolchain requirements (e.g., OpenRewrite's rewrite-gradle needs
JDK 8) fail when an unrelated module's toolchain isn't available.

Adds --configure-on-demand to all 8 Gradle command construction sites
so Gradle only configures projects needed for the requested task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 21:46:42 +00:00
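One way to keep the flag consistent across several command construction sites is to route them through a single helper. The sketch below assumes a hypothetical `build_gradle_command` function; only the `--configure-on-demand` flag itself comes from the commit message.

```python
from __future__ import annotations


def build_gradle_command(
    gradle: str, tasks: list[str], extra_args: list[str] | None = None
) -> list[str]:
    """Hypothetical helper: every Gradle invocation gets --configure-on-demand."""
    cmd = [gradle, "--configure-on-demand", *tasks]
    if extra_args:
        cmd.extend(extra_args)
    return cmd


# Example: target a single module's test task.
command = build_gradle_command("./gradlew", [":core:test"], ["--quiet"])
```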
Mohamed Ashraf
470482e824 fix: check subprocess exit codes in Java tracer
_run_java_with_graceful_timeout() discarded the subprocess exit code in
both the no-timeout and timeout paths. If Maven/Gradle failed (compilation
error, OOM, etc.), the tracer silently continued with missing/stale data.

Now returns the exit code. Stage 1 (JFR profiling) warns on failure but
continues. Stage 2 (argument capture) raises RuntimeError since trace
data is essential for replay test generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 21:46:11 +00:00
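The core of this fix, returning the exit code from both the normal and the timeout path, can be illustrated with a small stand-in. `run_with_graceful_timeout` below is a hypothetical simplification, not the repository's `_run_java_with_graceful_timeout()`; the 5-second grace period is an assumption.

```python
from __future__ import annotations

import subprocess
import sys


def run_with_graceful_timeout(cmd: list[str], timeout: float | None = None) -> int:
    """Run cmd and always return its exit code, even if it had to be stopped."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()  # ask the process to stop first
        try:
            return proc.wait(timeout=5)  # assumed grace period
        except subprocess.TimeoutExpired:
            proc.kill()  # force-stop if it ignored the request
            return proc.wait()


# The caller decides how severe a non-zero code is for its stage.
exit_code = run_with_graceful_timeout([sys.executable, "-c", "import sys; sys.exit(3)"])
```

A profiling stage could log a warning on a non-zero `exit_code` and continue, while an argument-capture stage could raise `RuntimeError`, matching the split described in the commit message.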
Kevin Turcios
b737f71e46 fix: update test assertions to match simplified Workload fixture
The Workload.java fixture was trimmed to only repeatString but test
files still asserted computeSum, filterEvens, and instanceMethod.
2026-04-10 16:05:27 -05:00
Kevin Turcios
5c778dfad4 perf: trim tracer E2E workload to single function (repeatString)
Keep only repeatString which reliably produces 284% improvement.
Drop computeSum (marginal 16%), filterEvens and instanceMethod (no
optimization found). Reduces tracer E2E from ~1h27m to ~21m.
2026-04-10 15:08:03 -05:00
Kevin Turcios
ec14860d29 Move benchmarks to .codeflash/benchmarks/ and auto-discover
Move codeflash's own benchmarks to .codeflash/benchmarks/. Add
auto-discovery of .codeflash/benchmarks/ in codeflash compare and
benchmark mode -- when benchmarks-root is not explicitly configured,
the CLI checks for .codeflash/benchmarks/ before erroring.

Backwards compatible: users with existing benchmarks-root config
are unaffected. Docs continue to show tests/benchmarks as the
example path.
2026-04-10 08:39:15 -05:00
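The fallback order described here (explicit configuration first, then `.codeflash/benchmarks/`, then an error) could be sketched like this. `resolve_benchmarks_root` and its parameters are hypothetical names, not the CLI's actual functions.

```python
from __future__ import annotations

from pathlib import Path

DEFAULT_BENCHMARKS_DIR = Path(".codeflash") / "benchmarks"


def resolve_benchmarks_root(project_root: Path, configured: Path | None) -> Path:
    """Hypothetical resolution: explicit config wins, then the auto-discovered dir."""
    if configured is not None:
        # Existing benchmarks-root settings are left untouched.
        return configured
    candidate = project_root / DEFAULT_BENCHMARKS_DIR
    if candidate.is_dir():
        return candidate
    raise FileNotFoundError(
        f"benchmarks-root is not configured and {candidate} does not exist"
    )
```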
Kevin Turcios
151df774a4 perf: use --effort low for java-tracer E2E to reduce CI time 2026-04-10 08:29:46 -05:00
Kevin Turcios
01e22152c7 flexing 2026-04-10 05:07:53 -05:00
Kevin Turcios
e81f25f825 fix: remove stale repeatString assertions from integration tests
repeatString was removed from Workload.java in the E2E reduction.
2026-04-10 05:05:17 -05:00
Kevin Turcios
0772398c59 perf: optimize Java tracing agent serialization and writes
- Reuse ThreadLocal Kryo Output buffers (eliminates #1 allocation hotspot)
- Fast-path inline serialization for safe arg types (bypasses executor)
- Skip verification roundtrip for known-safe containers (ArrayList, HashMap, etc.)
- Batch SQLite inserts (256/txn) with permanent autocommit-off
- Switch to ArrayBlockingQueue (no per-element Node allocation)
- Add opt-in in-memory SQLite mode (VACUUM INTO at shutdown), enabled in CI
- Add timing instrumentation (onEntry, serialization, writes, dump)
- Add ProfilingWorkload fixture for benchmarking

Benchmark (50k captures): onEntry 5200ms→1200ms (4.3x), avg/capture
0.43ms→0.02ms (21x), writes 3200ms→900ms (3.5x) with in-memory mode.
2026-04-10 04:55:36 -05:00
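The agent itself is Java, but the batching idea (one transaction per 256 inserts instead of one per row) is easy to show with Python's `sqlite3` module. Everything in this sketch is illustrative: the `captures` table, its columns, and the function names are assumptions, not the agent's schema.

```python
import sqlite3
from typing import Iterable, List, Tuple

BATCH_SIZE = 256  # rows per transaction, as in the commit message

Row = Tuple[str, bytes]


def open_capture_db() -> sqlite3.Connection:
    # isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE captures (method TEXT, args BLOB)")
    return conn


def _flush(conn: sqlite3.Connection, batch: List[Row]) -> None:
    conn.execute("BEGIN")
    conn.executemany("INSERT INTO captures (method, args) VALUES (?, ?)", batch)
    conn.execute("COMMIT")


def insert_batched(conn: sqlite3.Connection, rows: Iterable[Row]) -> int:
    """Insert rows in batches and return the number of transactions committed."""
    transactions = 0
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == BATCH_SIZE:
            _flush(conn, batch)
            transactions += 1
            batch = []
    if batch:  # final partial batch
        _flush(conn, batch)
        transactions += 1
    return transactions
```

With 600 rows this commits three transactions (256, 256, and 88 rows) rather than 600.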
Kevin Turcios
08aa94c54a perf: reduce java-tracer E2E to single function for ~11 min target
Drop repeatString from the Workload fixture (2→1 function).
computeSum alone exercises the full tracer→optimizer pipeline
(trace → replay tests → optimize → evaluate → rank → explain → review).
The second function added no additional pipeline coverage.
2026-04-10 03:44:54 -05:00
Kevin Turcios
46957e190f fix: update java tracer unit tests for reduced Workload fixture
Remove assertions for filterEvens and instanceMethod which were removed
from the Workload fixture. Adjust expected invocation counts accordingly.
2026-04-10 03:17:46 -05:00
Kevin Turcios
2b0f633c0f perf: reduce java-tracer E2E from ~75 min to ~15 min
Remove filterEvens and instanceMethod from the Workload fixture (4→2
functions) and reduce main() loop from 1000→100 rounds. The E2E test
only needs to verify the tracer→optimizer pipeline works end-to-end;
it doesn't need 4 functions or 1604 replay tests to prove that.

Expected impact: ~2 functions × ~8 candidates × fewer replay tests
should bring the job from ~75 min down to ~10-15 min.
2026-04-10 03:04:29 -05:00
Kevin Turcios
381d1319ea fix: specify utf-8 encoding in benchmark read_text for Windows CI
Windows defaults to cp1252 which can't decode some source file bytes.
2026-04-10 01:48:31 -05:00
Kevin Turcios
5a5b6e46ac bench: add dedicated comparator microbenchmark for frozenset fast-path
5 scenarios: primitives, nested dicts, DB rows, deep nesting,
and identity types (frozenset/range/complex/Decimal/OrderedDict).
2026-04-10 01:05:02 -05:00
Kevin Turcios
accbab4a16 fix: update test_cmd_auth patches for deferred imports
Imports in cmd_auth.py were moved into function bodies, so mock
patches must target the source modules instead of cmd_auth's namespace.
2026-04-10 00:36:02 -05:00
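The patching rule behind this fix can be shown with a standard-library stand-in. When an import lives inside a function body, the name is looked up in the source module at call time, so the patch has to target that module. The function `current_dir_name` is a made-up example, not code from `cmd_auth.py`.

```python
from unittest import mock


def current_dir_name() -> str:
    import os  # deferred import: "os" never becomes an attribute of this module

    return os.path.basename(os.getcwd())


def check_patch_reaches_deferred_import() -> str:
    # Patch the source module ("os.getcwd"), not "<this module>.os", which does not exist.
    with mock.patch("os.getcwd", return_value="/tmp/project"):
        return current_dir_name()
```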
Kevin Turcios
2e2e19f7ae bench: add libcst visitor benchmarks for multi-file and full pipeline
- test_benchmark_libcst_multi_file: discover_functions + get_code_optimization_context across 10 real source files
- test_benchmark_libcst_pipeline: full discover → extract → replace → merge pipeline on one file
2026-04-10 00:21:45 -05:00
Kevin Turcios
1a25f05e14 fix: remove unnecessary Optimizer from benchmark test
The test only needs project_root, not a full Optimizer (which requires
an API key). Also adds missing __init__.py to tests/benchmarks/.
2026-04-10 00:10:36 -05:00
Kevin Turcios
da536db8a2 Clean up Java test skip markers
- Remove dead `import shutil` from test_comparator.py
- Rename `requires_java` → `requires_java_runtime` for consistency with test_run_and_parse.py
- Remove redundant `@requires_java_runtime` on test_behavior_return_value_correctness (class already has it)
2026-04-09 22:22:39 -05:00
Kevin Turcios
3f53309847
Merge branch 'main' into fix/gradle-maven-central-dependency 2026-04-09 18:13:18 -05:00
Kevin Turcios
5ff38597ef test: skip all Java integration test classes when JAR missing
Apply @requires_java_runtime to TestJavaRunAndParseBehavior and
TestJavaRunAndParsePerformance at the class level. The performance
test was failing on Windows with a flaky 10ms timing assertion
(10.515ms actual, 5% tolerance) — pre-existing issue masked by
continue-on-error.
2026-04-09 16:01:53 -05:00
Kevin Turcios
78372bfbfb test: skip test_behavior_return_value_correctness when JAR missing
Same fix as test_comparator.py — uses _find_comparator_jar() to skip
when the codeflash-runtime JAR isn't built. Fixes Windows unit-tests
which don't have Java pre-installed (unlike Linux runners).
2026-04-09 15:47:10 -05:00
Kevin Turcios
e5a18feb61 test: fix requires_java to check for runtime JAR, not just binaries
Ubuntu runners have Java/Maven pre-installed, so checking for java/mvn
binaries doesn't skip. The actual dependency is the codeflash-runtime
JAR which must be built from codeflash-java-runtime/ via Maven.
2026-04-09 12:19:16 -05:00
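A skip condition keyed on the built JAR rather than on the `java` or `mvn` binaries might look like the sketch below. It uses the standard library's `unittest.skipUnless` to stay self-contained; the repository's own marker may be built differently. The search directory `codeflash-java-runtime/target`, the JAR name pattern, and `find_runtime_jar` are all assumptions.

```python
from __future__ import annotations

import unittest
from pathlib import Path


def find_runtime_jar(search_dir: Path) -> Path | None:
    """Hypothetical lookup: return the first codeflash-runtime JAR found, if any."""
    if not search_dir.is_dir():
        return None
    jars = sorted(search_dir.glob("codeflash-runtime*.jar"))
    return jars[0] if jars else None


# Assumed Maven output location for the runtime module.
RUNTIME_JAR = find_runtime_jar(Path("codeflash-java-runtime") / "target")

requires_java_runtime = unittest.skipUnless(
    RUNTIME_JAR is not None, "codeflash-runtime JAR has not been built"
)


@requires_java_runtime
class ComparatorTests(unittest.TestCase):
    def test_jar_is_a_file(self) -> None:
        self.assertTrue(RUNTIME_JAR.is_file())
```

Applying the decorator at class level skips every test in the class when the JAR is absent, even on runners where Java and Maven happen to be installed.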
Kevin Turcios
be446cd8de test: skip Java comparator tests when Maven is unavailable
The requires_java marker only checked for java binary but the tests
also need mvn to build the codeflash-runtime JAR. These 13 tests
were silently failing in unit-tests (masked by continue-on-error).
2026-04-09 12:06:26 -05:00
Mohamed Ashraf
ebd72acb18 merge: resolve conflict with main in test_build_tools.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 15:07:17 +00:00
HeshamHM28
f5777947c6 Merge remote-tracking branch 'origin/main' into cf-java-void-optimization 2026-04-09 08:15:53 +00:00
Aseem Saxena
a958f3182b
Merge pull request #1856 from codeflash-ai/fix/structured-error-output-subagent-mode
fix: output structured XML errors in subagent mode
2026-04-08 12:48:18 -07:00
Mohamed Ashraf
29879f19bc Merge branch 'main' into cf-1085-cap-wildcard-import-expansion 2026-04-08 17:11:37 +00:00
Mohamed Ashraf
3b03249950 Merge remote-tracking branch 'origin/main' into cf-1087-field-injection-class-filter 2026-04-08 17:07:40 +00:00
Mohamed Ashraf
fdc7a52f33 Merge remote-tracking branch 'origin/main' into cf-1087-field-injection-class-filter 2026-04-08 16:39:23 +00:00
Mohamed Ashraf
8961b14d6f fix: update test assertion to match POSIX-normalized paths in Jest config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 12:12:26 +00:00
Mohamed Ashraf
4c70a21294 fix: resolve Windows CI failures from path separator mismatches
Normalize paths to forward slashes in JS/TS code generation and coverage
parsing — backslashes are escape chars in JavaScript strings and cause
silent corruption on Windows. Also relax timing test thresholds for CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 00:15:40 +00:00
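The corruption mentioned here is easy to reproduce: a Windows path such as `C:\new\test.js` contains `\n` and `\t`, which JavaScript reads as newline and tab inside a string literal. A minimal sketch of the normalization, with `js_require_line` as a hypothetical generator function:

```python
from pathlib import PurePath, PureWindowsPath


def js_require_line(module_path: PurePath) -> str:
    """Emit a JS require() line using forward slashes only."""
    posix = module_path.as_posix()  # backslashes become forward slashes
    return f"const mod = require('{posix}');"


# PureWindowsPath makes the example behave the same on every platform.
line = js_require_line(PureWindowsPath(r"C:\new\test.js"))
```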
Mohamed Ashraf
217544f99e fix: handle multi-line include directives in settings.gradle
The regex for extracting modules from settings.gradle only matched
single-line include statements. Multi-line includes like eureka's
(include 'a',\n 'b',\n 'c') only captured the first module, causing
test_module to be None and breaking multi-module path resolution
(e.g., classfiles lookup for JaCoCo coverage conversion).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 15:03:32 +00:00
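A regex that tolerates include lists spanning several lines could be sketched as below. This is an illustrative pattern, not the one in the repository: `parse_included_modules` is a hypothetical name, and the sketch only handles quoted string literals separated by commas.

```python
from __future__ import annotations

import re

# "include" followed by one or more quoted names; \s* lets the list continue
# across newlines after each comma.
INCLUDE_RE = re.compile(
    r"""^[ \t]*include\s*\(?\s*((?:['"][^'"]+['"]\s*,?\s*)+)""",
    re.MULTILINE,
)
QUOTED_RE = re.compile(r"""['"]([^'"]+)['"]""")


def parse_included_modules(settings_text: str) -> list[str]:
    """Collect module names from every include statement in settings.gradle text."""
    modules: list[str] = []
    for match in INCLUDE_RE.finditer(settings_text):
        for name in QUOTED_RE.findall(match.group(1)):
            modules.append(name.lstrip(":"))  # drop Gradle's leading project colon
    return modules


SETTINGS = """rootProject.name = 'eureka'
include 'eureka-client',
        'eureka-core',
        'eureka-server'
"""
modules = parse_included_modules(SETTINGS)
```

A single-line pattern would stop after `'eureka-client'`; capturing the whole comma-separated run first and extracting names from it afterwards keeps all three modules.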
Mohamed Ashraf
0ab4800f74 fix: use tree-sitter for Gradle repositories block and add version update logic
- Generalize _find_top_level_dependencies_block() into _find_top_level_block(name)
  so it can find any top-level block (dependencies, repositories, etc.)
- Rewrite _ensure_maven_central_repo() to use tree-sitter instead of regex,
  preventing false matches inside buildscript/subprojects/allprojects blocks
- Add _update_existing_codeflash_dependency() to replace stale versions or
  old files() format with the current Maven Central coordinate
- Wire version update into add_codeflash_dependency() and
  add_codeflash_dependency_multimodule() so old entries get updated instead
  of silently skipped

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 14:46:37 +00:00
HeshamHM28
1fde200bc4 fix: improve multi-module Gradle detection for dynamic settings.gradle.kts
- Parse listOf(...) patterns in settings.gradle.kts for projects that
  build include lists dynamically (e.g. OpenRewrite)
- Use word boundary in include regex to avoid matching variable names
  like 'includedProjects'
- Break module voting ties using codeflash.toml module-root config,
  so the function's own module is preferred over cross-module tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:08:16 +00:00
claude[bot]
8e2bab2e42 fix: add missing return type annotations to test functions
Co-authored-by: Aseem Saxena <aseembits93@users.noreply.github.com>
2026-04-06 23:12:44 +00:00
aseembits93
48e6835990 feat: use sys.argv[1:] for help check and add tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 16:09:47 -07:00
Mohamed Ashraf
e30bdd6748 Merge remote-tracking branch 'origin/main' into cf-1080-spotless-skip 2026-04-06 16:18:05 +00:00