Commit graph

1611 commits

Author SHA1 Message Date
claude[bot]
4c9abfb2aa refactor: remove redundant try/finally; rely on conftest autouse fixture for language cleanup
The conftest.py autouse fixture already resets _current_language before/after
each test, making per-test try/finally cleanup unnecessary.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
2026-03-18 18:23:49 +00:00
claude[bot]
0672e11342 fix: reset language singleton in test to prevent cross-test pollution
test_parse_line_profile_results_non_python_java_json set Language.JAVA
but never reset it, causing test_java_diff_ignored_when_language_is_python
to fail when tests ran in this order.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
2026-03-18 17:34:03 +00:00
Kevin Turcios
115cdba481 fix: address review feedback for attrs init instrumentation
- Fix bug: skip attrs classes with init=False (no __init__ to patch)
- Deduplicate attrs namespace/name sets into shared constants
- Fix _get_attrs_config to resolve import aliases properly
- Add test for init=False case with exact expected output
2026-03-18 03:34:44 -06:00
Kevin Turcios
0cb8a08c7e feat: instrument attrs __init__ via module-level monkey-patch wrapper
Instead of skipping attrs classes entirely (previous approach), emit a
module-level patch block immediately after the class definition:

  _codeflash_orig_ClassName_init = ClassName.__init__
  def _codeflash_patched_ClassName_init(self, *args, **kwargs):
      return _codeflash_orig_ClassName_init(self, *args, **kwargs)
  ClassName.__init__ = codeflash_capture(...)(_codeflash_patched_ClassName_init)

This sidesteps the __class__ cell TypeError that attrs(slots=True) triggers
when a synthetic super().__init__() body is injected into the original class,
because the patched wrapper is a plain module-level function with no __class__
cell.

Changes:
- InitDecorator.__init__: add _attrs_classes_to_patch dict
- visit_ClassDef: for attrs classes, record (name -> decorator) instead of
  returning immediately; set inserted_decorator=True
- visit_Module: splice patch block statements after each attrs ClassDef
- _build_attrs_patch_block: new helper that builds the 3-statement AST block
- Tests: rename *_no_init_skipped -> *_patched_via_module_wrapper and update
  expected strings to assert the exact generated patch block

Co-Authored-By: Oz <oz-agent@warp.dev>
2026-03-18 01:42:29 -06:00
Kevin Turcios
dd5e347bbb fix: skip attrs classes in __init__ instrumentation; add attrs support to code_context_extractor
- instrument_codeflash_capture: detect @attrs.define / @attr.s / etc. in the
  'no explicit __init__' branch and return early, same as dataclass/NamedTuple.
  Prevents a TypeError caused by attrs(slots=True) creating a new class whose
  __class__ cell no longer matches the injected super().__init__ wrapper.

- code_context_extractor: add _get_attrs_config() helper; update
  _collect_synthetic_constructor_type_names, _build_synthetic_init_stub, and
  _extract_synthetic_init_parameters to handle attrs field conventions
  (factory= keyword, init=False, kw_only).

- tests: add 3 exact-output tests for instrumentation skip behaviour and
  3 exact-output tests for attrs stub generation.

Co-Authored-By: Oz <oz-agent@warp.dev>
2026-03-18 01:33:40 -06:00
HeshamHM28
2a35716864 Merge branch 'main' into feat/gradle-executor-from-java 2026-03-18 06:14:47 +02:00
HeshamHM28
ba36a1f7b2 fix jar file problems 2026-03-18 05:47:56 +02:00
HeshamHM28
6b2a852d79 fix running tests 2026-03-18 05:29:00 +02:00
HeshamHM28
cd0e19396c fix multi_root 2026-03-18 04:24:16 +02:00
Kevin Turcios
948bfedfa0 Merge pull request #1852 from codeflash-ai/cf-1846-port-perf-improvements
perf: cache jedi project, batch test cache writes, fix Windows relative_to bug
2026-03-17 18:54:31 -06:00
Kevin Turcios
bf9adf2673 fix: normalize module fallback formatting for import merge 2026-03-17 18:16:54 -06:00
Aseem Saxena
506eb44648 Merge pull request #1855 from codeflash-ai/fix/git-diff-multi-language-extension-filter
fix: git diff auto-detection filters by current language instead of hardcoding .py
2026-03-17 17:02:22 -07:00
Mohamed Ashraf
bc7a5bf4bb fix: output structured XML errors in subagent mode
When codeflash runs with --subagent (e.g., via the Claude Code plugin),
exit_with_message() now outputs <codeflash-error> XML to stdout instead
of Rich panel text. This lets the calling agent parse errors
programmatically rather than receiving unstructured text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:48:17 +00:00
Mohamed Ashraf
87eedac002 fix: git diff auto-detection filters by current language instead of hardcoding .py
get_git_diff() hardcoded `.py` as the only valid file extension, causing
the auto-detect flow (no --file flag) to return 0 functions for Java,
JavaScript, and TypeScript projects. This broke the Claude Code plugin
integration where the hook runs `codeflash --subagent` without --file.

Now uses current_language_support().file_extensions to filter by the
active language's extensions dynamically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:40:41 +00:00
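Note: a minimal sketch of the extension filter this commit describes. The function name and the `extensions` parameter are assumptions for illustration. In the repository the extensions come from `current_language_support().file_extensions`, whose exact type is not shown here.

```python
from pathlib import PurePosixPath


def filter_changed_files(paths, extensions):
    """Keep only changed files whose suffix belongs to the active language."""
    allowed = tuple(extensions)
    return [p for p in paths if PurePosixPath(p).suffix in allowed]


changed = ["codeflash/main.py", "src/App.java", "web/index.ts", "README.md"]
java_only = filter_changed_files(changed, (".java",))
ts_and_js = filter_changed_files(changed, (".ts", ".tsx", ".js"))
```

Passing the extensions in, rather than hardcoding `.py`, is what lets the same diff scan serve Java, JavaScript, and TypeScript projects.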
mohammed ahmed
1e92f3d2ed Merge pull request #1814 from codeflash-ai/fix/js-replacement-stale-line-numbers-after-global-declarations
fix: re-discover function position after add_global_declarations shifts line numbers
2026-03-17 17:17:51 +02:00
ali
3bbaf26008 test: add unit tests for node_modules symlink in JS worktree setup
Covers setup_test_config symlinking node_modules from original repo
to worktree, including edge cases (no worktree, missing node_modules,
already existing node_modules).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 16:54:34 +02:00
ali
1282a103cd Merge branch 'main' of github.com:codeflash-ai/codeflash into fix/js-replacement-stale-line-numbers-after-global-declarations 2026-03-17 15:11:40 +02:00
claude[bot]
8dc6d9eeda fix: remove test for deleted create_pyproject_toml function
The function was removed in the dead code cleanup but the test file still
imported it and had a TestCreatePyprojectToml class, causing ImportError.

Co-authored-by: Kevin Turcios <undefined@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 07:48:19 +00:00
Kevin Turcios
a0a2a85020 Merge pull request #1660 from codeflash-ai/unstructured-inference
feat: improve function ranking with reference graph and test-based boosting
2026-03-16 23:05:28 -06:00
misrasaurabh1
6c3f626c9e fix: filter Java runtime annotations by class name and fix ordering
Runtime annotations in PR descriptions were broken in two ways:
1. add_runtime_comments() ignored class/method prefixes in keys, causing
   annotations from unrelated test classes to leak across files and sum
   incorrectly at the same line number. Now filters by class names found
   in each test source file.
2. Test functions were removed before annotations were added, shifting
   line numbers so annotations landed on wrong lines. Swapped ordering
   so annotations are applied first, then function removal carries them
   along correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 17:51:57 -07:00
HeshamHM28
16b3c2f5e5 Merge branch 'main' into feat/gradle-executor-from-java 2026-03-16 23:45:14 +02:00
Kevin Turcios
2cafadb980 fix: deduplicate test count calls, guard None, and log effort escalation
Build test_count_cache once before ranking instead of calling
existing_unit_test_count roughly 2N times. Guard against a None
function_to_tests and add debug logging when effort is escalated from
medium to high.
2026-03-16 14:41:55 -06:00
Kevin Turcios
5671562da2 perf: eliminate redundant CST parsing in get_code_optimization_context
Parse each file once instead of up to 16 times by:
- Making remove_unused_definitions_by_function_names accept/return cst.Module
- Making parse_code_and_prune_cst and add_needed_imports_from_module accept cst.Module
- Threading the parsed Module through process_file_context
- Adding extract_all_contexts_from_files that processes all 4 context types
  (READ_WRITABLE, READ_ONLY, HASHING, TESTGEN) in a single per-file pass
2026-03-16 10:11:58 -06:00
Kevin Turcios
282f2ba713 Improve testgen constructor context extraction 2026-03-16 00:47:17 -06:00
Kevin Turcios
cee12fe430 fix ranking boost ordering and statement helper extraction 2026-03-15 23:29:35 -06:00
Kevin Turcios
01847f9acc fix: auto-detect language_version in add_language_metadata when not provided
review_generated_tests and repair_generated_tests called add_language_metadata
without language_version, sending python_version=None to the API which rejects
with "Python version is required". Now falls back to current_language_support().
2026-03-14 19:16:01 -06:00
Kevin Turcios
a90cda2578 feat: boost ranking for tested functions and enable reference graph
- Add existing_unit_test_count() with parametrized test deduplication
- Stable-sort ranked functions so tested ones come first
- Enable reference graph resolver (was disabled) for non-CI runs
- Add per-function logging with ref count and test count
- Auto-upgrade top N functions to high effort when user hasn't set --effort
- Add CallGraph model with traversal (BFS, topological sort, subgraph)
- Add get_call_graph() to DependencyResolver protocol and ReferenceGraph
- Refactor get_callees() to delegate through get_call_graph()

CF-1660
2026-03-14 18:40:08 -06:00
Mohamed Ashraf
fa9d32f1c4 Merge branch 'main' into omni-java
Resolve 7 merge conflicts from main's modular refactoring + JS improvements:

- aiservice.py: combine multi-language metadata (omni-java) with main's structure
- cmd_init.py: adopt main's modular split (init_config, init_auth, github_workflow) + add Java import
- code_replacer.py: main's clean early-return style + omni-java's non-Python single-block fallback
- version.py, test_support_dispatch.py, test_javascript_test_runner.py: take main's versions
- uv.lock: regenerated

Port Java into main's modular structure:
- Fix init_java.py lazy imports to point to new modules (init_config, init_auth, github_workflow)
- Add Java workflow support to github_workflow.py (detection, template, customization)
- Fix broken Java imports (function_optimizer, line_profiler) after main's module moves

Add safety tests for merge-critical functions:
- test_add_language_metadata.py: 10 tests covering per-language payload correctness
- test_code_replacer_matching.py: 8 tests covering fallback chain

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 00:15:19 +00:00
Mohamed Ashraf
068c1a73d0 test: add unit tests for detect_project_language
Cover all detection paths: Java (pom.xml, build.gradle, build.gradle.kts),
TypeScript, JavaScript, Python, empty directory fallback, and priority
resolution when multiple build system markers coexist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 04:01:49 +00:00
mohammed ahmed
d616501871 Merge pull request #1808 from codeflash-ai/fix/jest-runtime-config-for-external-tests
fix: use runtime Jest config for test files outside project root
2026-03-11 22:34:10 +02:00
ali
6a451ffe03 fix: re-discover function position after add_global_declarations shifts line numbers
When optimized JS/TS code introduces new global declarations (const, Set, etc.),
add_global_declarations inserts them into the original source, shifting the target
function's line numbers. The stale starting_line from function_to_optimize then
causes _replace_function_body to fail with "Could not find function X at line Y".

This affected ~666 function replacements in the Strapi optimization run, including
bytesToHumanReadable, isSelectable, getFileIconComponent, and many others.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 21:30:10 +02:00
Saurabh Misra
7f7591cf29 Merge pull request #1813 from codeflash-ai/fix/detect-deletion-only-diffs
fix: detect functions in deletion-only git diffs
2026-03-11 01:19:21 -04:00
aseembits93
b5a01457d2 test: add unit tests for deletion-only git diff detection
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 17:33:40 -07:00
Kevin Turcios
748094c7e0 Merge remote-tracking branch 'origin/main' into fix-dependabot-vulns 2026-03-10 16:54:29 -06:00
ali
eeeb6eebf8 test: update Jest roots tests to verify runtime config behavior
Update TestJestRootsConfiguration to match the new runtime config
approach: verify no --roots/config when tests are inside the project
root, and verify runtime config creation when tests are outside it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 17:18:16 +02:00
HeshamHM28
1ec829daff refactor: replace find_gradle_executable and find_maven_executable with find_executable method in Maven and Gradle strategies 2026-03-10 08:27:10 +02:00
HeshamHM28
3aea9397fe Refactor Maven strategy for Codeflash integration
- Added functions to install the Codeflash runtime and manage its dependency in the Maven POM file.
- Implemented JaCoCo plugin addition and configuration for test coverage.
- Enhanced XML parsing for POM files to support both namespaced and non-namespaced formats.
- Updated test cases to reflect changes in dependency management and plugin configuration.
- Removed obsolete runtime jar finding logic and consolidated module detection for both Maven and Gradle projects.
- Improved input validation for test run commands.
2026-03-10 04:14:55 +02:00
HeshamHM28
f4a4ac633f Refactor Java test runner tests to use MavenStrategy
- Updated tests in `test_java_multimodule_deps_install.py` to utilize `MavenStrategy` for installing multi-module dependencies.
- Changed function calls from `ensure_multi_module_deps_installed` to `MavenStrategy.install_multi_module_deps`.
- Added a fixture for `MavenStrategy` to streamline test setup.
- Modified assertions and mock setups to align with the new strategy implementation.

- Refactored tests in `test_java_test_filter_validation.py` to replace `_run_maven_tests` with `MavenStrategy.run_tests_via_build_tool`.
- Adjusted test cases to ensure proper handling of empty and valid test filters.
- Updated mock setups for Maven executable and command execution to reflect changes in the strategy.
2026-03-09 23:30:18 +02:00
claude[bot]
dea671073e test: skip reporter junit xml test on Windows CI
Node.js subprocess pipe behavior causes the test to hang on Windows
(returncode=1 but stdout reader thread blocks beyond 10s timeout).

Co-authored-by: Aseem Saxena <aseembits93@users.noreply.github.com>
2026-03-09 19:10:57 +00:00
Sarthak Agarwal
222805730b Merge pull request #1778 from codeflash-ai/fix/github_actions_init
[Fix] Github Actions init
2026-03-07 23:58:13 +05:30
Kevin Turcios
f43ee06859 refactor: restructure codebase for locality and faster CLI startup
Move files closer to their consumers:
- function_context.py merged into code_context_extractor.py
- FunctionOptimizer base class to languages/function_optimizer.py
- test_runner, instrument_codeflash_capture, parse_line_profile to languages/python/
- oauth_handler.py to cli_cmds/

Split cmd_init.py (1993 lines) into focused modules:
- init_config.py: config types, validation, writing, shared UI
- init_auth.py: API key management + GitHub app installation
- github_workflow.py: GitHub Actions workflow generation
- cmd_init.py: init orchestrator + Python setup (639 lines)

Defer heavy imports (cmd_init, posthog, sentry) from module-level to
usage sites, reducing CLI startup from ~600ms to ~250ms. Replace
set_defaults(func=) with direct args.command dispatch in main().
2026-03-07 08:21:27 -05:00
Kevin Turcios
2fec18c65b fix: build full dotted name in _expr_name for module-qualified decorators/bases
_expr_name now recurses into ast.Attribute to produce the full dotted
path (e.g. "dataclasses.dataclass", "typing.NamedTuple"). Callers use
.endswith() so both bare and module-qualified forms are matched. Adds
test for typing.NamedTuple base class.
2026-03-07 03:32:25 -05:00
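Note: a sketch of the dotted-name resolution described in this commit and the one below it. `expr_name` here is an illustrative reimplementation, not the repository's `_expr_name`. It assumes the helper unwraps call-style decorators and recurses through `ast.Attribute`.

```python
import ast


def expr_name(node):
    """Return the dotted name of a decorator or base-class expression, or None."""
    if isinstance(node, ast.Call):
        # @dataclasses.dataclass(frozen=True) -> look at the callee
        return expr_name(node.func)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = expr_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else node.attr
    return None


source = (
    "import dataclasses\n"
    "\n"
    "@dataclasses.dataclass(frozen=True)\n"
    "class A:\n"
    "    x: int\n"
)
class_def = ast.parse(source).body[1]
name = expr_name(class_def.decorator_list[0])
# Callers match with endswith so bare and module-qualified forms both hit.
is_dataclass = name is not None and name.endswith("dataclass")
```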
Kevin Turcios
9fd5a3d93f fix: address PR review — recurse _expr_name for call-style decorators, guard empty-set superset
Add tests for remove_test_functions qualified name support and
module-qualified dataclass decorator handling.
2026-03-07 03:03:23 -05:00
Kevin Turcios
e33a7da615 Merge branch 'main' into testgen-review 2026-03-06 17:20:42 -05:00
Sarthak Agarwal
417623f930 Merge pull request #1780 from codeflash-ai/fix/normalizer
[Fix] Normalizer and expand its scope
2026-03-07 01:55:44 +05:30
Sarthak Agarwal
3367383824 fix failing unit tests with recent refactoring 2026-03-07 01:48:41 +05:30
Sarthak Agarwal
353feab063 [Fix] Normalizer and expand its scope 2026-03-06 21:31:24 +05:30
Sarthak Agarwal
123067571a add tests 2026-03-06 16:27:46 +05:30
Kevin Turcios
6473145616 fix: skip NamedTuple classes in __init__ instrumentation
NamedTuples have a synthesized __init__ that cannot be overwritten.
The instrumentation was missing a skip check (like the existing one
for dataclasses), causing "Cannot overwrite NamedTuple attribute
__init__" errors that crashed the test subprocess and produced 0%
coverage.

Also removes duplicate docstring in make_ai_service_request.
2026-03-06 02:05:09 -05:00
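Note: the error quoted in this commit can be reproduced with the standard library alone. The snippet below is an illustration of why the skip is needed, not the repository's instrumentation code. The injected `__init__` stands in for whatever the instrumentor would add.

```python
from typing import NamedTuple

message = None
try:
    class Point(NamedTuple):
        x: int
        y: int

        # A naive instrumentor would add an __init__ like this one.
        def __init__(self, *args, **kwargs):
            pass
except AttributeError as exc:
    # typing rejects the class body at class-creation time.
    message = str(exc)
```

The failure happens while the class is being created, so it cannot be caught per test; that matches the commit's description of the whole test subprocess crashing.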
Kevin Turcios
9c1fb4a397 fix: use PythonFunctionOptimizer in test_mock_candidate_replacement
Base FunctionOptimizer.get_code_optimization_context() raises
NotImplementedError — tests need the Python-specific subclass.
2026-03-06 01:03:01 -05:00
Kevin Turcios
e01236eb29 Merge branch 'main' into testgen-review 2026-03-06 01:02:06 -05:00
Kevin Turcios
5d872e845d Merge pull request #1650 from codeflash-ai/fix/unused-helper-attribute-refs
fix: detect attribute-referenced methods as used in unused helper detection
2026-03-05 23:55:33 +00:00
Mohamed Ashraf
50957395a9 feat: centralize JAR version, cache runtime setup, add pom.xml backup/restore
- Extract CODEFLASH_RUNTIME_VERSION and CODEFLASH_RUNTIME_JAR_NAME constants
  in build_tools.py, replacing 15+ hardcoded "1.0.0" references across
  test_runner.py, comparator.py, and line_profiler.py
- Cache _ensure_codeflash_runtime() results so it runs once per optimization
  instead of 3 times (behavioral, benchmarking, line profiling phases)
- Add backup_pom/restore_pom/restore_all_pom_backups to build_tools.py so
  pom.xml modifications (codeflash-runtime dependency, JaCoCo plugin) are
  always reverted after optimization completes, even on crashes
- Call restore_all_pom_backups() in function_optimizer.py's finally block

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 23:17:09 +00:00
Kevin Turcios
6330005999 test: update expected markdown ordering to match target-file-first change 2026-03-05 07:04:03 -05:00
claude[bot]
6ad7ea49f6 fix: add missing -> None return type annotations to test functions
Co-authored-by: Kevin Turcios <undefined@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 11:18:54 +00:00
Kevin Turcios
1bcea313a0 test: add unit tests for get_optimized_code_for_module fallback logic 2026-03-05 06:06:21 -05:00
Sarthak Agarwal
ae6b1a3b5f Merge branch 'main' into fix/js-vitest-benchmarking-and-mocha-cjs 2026-03-04 16:49:43 +05:30
Sarthak Agarwal
f0e4e5326d max loop count in test support 2026-03-04 16:48:38 +05:30
Kevin Turcios
f9e7f2a82a fix: skip codeflash_capture instrumentation for dataclasses without explicit __init__
Dataclass __init__ is auto-generated at class creation time and not
present in the AST. The instrumentor was injecting a synthetic __init__
with super().__init__(*args, **kwargs) which calls object.__init__()
and fails because dataclass fields are passed as kwargs.

Now only skips when the class is a @dataclass AND has no explicit
__init__. Dataclasses with custom __init__ are still instrumented.
2026-03-04 04:47:30 -05:00
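Note: a small reproduction of the failure mode this commit describes, using only the standard library. The class name is made up. The hand-written `__init__` imitates the synthetic one the instrumentor used to inject.

```python
from dataclasses import dataclass

error = None


@dataclass
class Instrumented:
    x: int

    # Imitates the injected synthetic __init__. Because the class body now
    # defines __init__, @dataclass does not generate its own.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


try:
    Instrumented(x=1)
except TypeError as exc:
    # object.__init__() receives x=1 and rejects it.
    error = exc
```

A dataclass that already defines its own `__init__` does not hit this problem, which is consistent with the commit only skipping the case where no explicit `__init__` exists.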
misrasaurabh1
df57235d25 test: use full string equality in Java runtime comments tests
Replace substring assertions (e.g. `"// 2.89ms ->" in lines[7]`) with
exact full-output comparisons for better regression detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:02:24 -08:00
misrasaurabh1
9367a86771 fix: remove dead Java path-fixing code from base FunctionOptimizer
The base class had duplicate _get_java_sources_root and _fix_java_test_paths
methods that were overridden by JavaFunctionOptimizer. The base class also
had an is_java() block in generate_and_instrument_tests that used undefined
variables (used_behavior_paths, is_java). Removed all dead code since
JavaFunctionOptimizer.fixup_generated_tests handles this properly.

Also updated JavaFunctionOptimizer._fix_java_test_paths to accept
display_source parameter and use whole-word rename for collision handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:02:24 -08:00
misrasaurabh1
6ec4db446c Merge remote-tracking branch 'origin/omni-java' into generated-tests-in-pr
# Conflicts:
#	codeflash/languages/java/instrumentation.py
#	codeflash/optimization/function_optimizer.py
#	codeflash/verification/verifier.py
#	codeflash/version.py
#	tests/test_languages/test_java/test_instrumentation.py
#	tests/test_languages/test_java/test_java_test_paths.py
2026-03-04 00:26:46 -08:00
claude[bot]
77b705d2c7 fix: use forward slashes in jest-reporter test paths for Windows compatibility
Windows backslashes in paths embedded into JavaScript strings are
interpreted as escape sequences by Node.js, corrupting the module path.
Use .as_posix() to emit forward slashes which Node accepts on all platforms.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
2026-03-04 08:06:15 +00:00
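Note: the escape-sequence hazard and the `.as_posix()` fix can be illustrated with `PureWindowsPath`, which behaves the same on every platform. The path itself is made up for the example.

```python
from pathlib import PureWindowsPath

reporter = PureWindowsPath(r"C:\repo\node_modules\codeflash\jest-reporter.js")

# Embedding str(path) puts raw backslashes inside a JavaScript string literal;
# sequences such as \r and \n would then be read as escapes by Node.js.
unsafe_js = f"require('{reporter}')"

# as_posix() emits forward slashes, which Node accepts on all platforms.
safe_js = f"require('{reporter.as_posix()}')"
```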
claude[bot]
887e6384b6 fix: apply consistent conn safety pattern in trace benchmark tests
Initialize conn to None before try blocks and guard finally with
if conn is not None to prevent NameError if sqlite3.connect() raises.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
2026-03-04 07:37:03 +00:00
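Note: the connection-safety pattern named in this commit, as a minimal sketch. The function name and query are illustrative. The `Optional` annotation reflects the mypy fix in the entry that follows.

```python
import sqlite3
from typing import Optional


def read_one(db_path: str) -> int:
    # Bind conn before the try block so the finally clause can always test it,
    # even when sqlite3.connect() itself raises.
    conn: Optional[sqlite3.Connection] = None
    try:
        conn = sqlite3.connect(db_path)
        return conn.execute("SELECT 1").fetchone()[0]
    finally:
        if conn is not None:
            conn.close()


value = read_one(":memory:")
```

Closing in `finally` also releases the file handle before any cleanup step deletes the database file, which matters on Windows.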
Kevin Turcios
2faef8ade6 fix: update JS max_loops test assertion to match constant (1_000)
PR #1764 changed JS_BENCHMARKING_MAX_LOOPS from 100_000 to 1_000 but
the test was updated to assert 5_000 instead of 1_000.
2026-03-04 02:26:53 -05:00
claude[bot]
59ec38eec4 fix: resolve mypy type error for conn variable in test_trace_benchmarks
Initialize conn as Optional before try block to allow None assignment.

Co-authored-by: Kevin Turcios <undefined@users.noreply.github.com>
2026-03-04 07:15:22 +00:00
Kevin Turcios
17730663ec fix: close SQLite connections in finally blocks for Windows compatibility
Ensures SQLite connections are always closed before file cleanup to
prevent PermissionError on Windows where open handles block file deletion.
2026-03-04 02:12:18 -05:00
Kevin Turcios
eceac13fc3 Merge remote-tracking branch 'origin/main' into omni-java
# Conflicts:
#	.claude/rules/architecture.md
#	.claude/rules/code-style.md
#	.github/workflows/claude.yml
#	.github/workflows/duplicate-code-detector.yml
#	codeflash/api/aiservice.py
#	codeflash/cli_cmds/console.py
#	codeflash/cli_cmds/logging_config.py
#	codeflash/code_utils/deduplicate_code.py
#	codeflash/discovery/discover_unit_tests.py
#	codeflash/languages/base.py
#	codeflash/languages/code_replacer.py
#	codeflash/languages/javascript/mocha_runner.py
#	codeflash/languages/javascript/support.py
#	codeflash/languages/python/support.py
#	codeflash/optimization/function_optimizer.py
#	codeflash/verification/parse_test_output.py
#	codeflash/verification/verification_utils.py
#	codeflash/verification/verifier.py
#	packages/codeflash/package-lock.json
#	packages/codeflash/package.json
#	tests/languages/javascript/test_support_dispatch.py
#	tests/test_codeflash_capture.py
#	tests/test_languages/test_javascript_test_runner.py
#	tests/test_multi_file_code_replacement.py
2026-03-04 01:52:32 -05:00
Kevin Turcios
dbc04df9df Update test_support_dispatch.py 2026-03-04 01:32:35 -05:00
Kevin Turcios
2fb0145895 Merge remote-tracking branch 'origin/omni-java' into merge/misc-fixes-into-omni-java
# Conflicts:
#	codeflash/api/aiservice.py
#	codeflash/languages/base.py
#	codeflash/languages/java/support.py
#	codeflash/languages/javascript/support.py
#	codeflash/languages/python/support.py
#	codeflash/verification/verifier.py
2026-03-04 01:23:39 -05:00
HeshamHM28
8287f96f05 Merge pull request #1680 from codeflash-ai/feat/java/wire-language-version
feat: add language version support across multiple language implement…
2026-03-03 22:12:54 -08:00
Kevin Turcios
b43b37b6dd fix: update nested class replacement test to match PR #1726 design
Inner-class methods are intentionally skipped by Java discovery
(PR #1726) since instrumentation is name-only and not class-aware.
Update test to expect False from replacement.
2026-03-04 00:29:46 -05:00
Kevin Turcios
ca149fa2d0 fix: use relpath for main.py in E2E test utilities
Take omni-main-java's fix for E2E test runner path resolution —
uses os.path.relpath from __file__ instead of hardcoded relative path.
Also adds codeflash.toml detection for Java projects.
2026-03-04 00:19:02 -05:00
Sarthak Agarwal
12294cafb6 fix looping with JS/TS 2026-03-04 10:46:44 +05:30
Kevin Turcios
92ab600edc fix: resolve remaining test failures from main sync
- Update test_inject_profiling_used_frameworks, test_async_run_and_parse,
  test_pickle_patcher to use new inject_profiling_into_existing_test API
  (test_string param removed)
- Add parse_line_profile_results function to parse_line_profile_test_output
  module (imported by main's PythonFunctionOptimizer and test_instrument_tests)
2026-03-04 00:13:27 -05:00
Kevin Turcios
af7ce7fce2 fix: cherry-pick main improvements into omni-java branch
- Take main's JS improvements: Mocha CJS support, ESM/CJS handling,
  sanitize_mocha_imports, vitest benchmarking fixes
- Update instrument_existing_test API: remove test_string param, read from
  file internally (aligned across Python, JS, Java support classes)
- Take main's equivalence.py with pass_fail_only parameter
- Take main's models.py, critic.py, env_utils.py, replay_test.py fixes
- Take main's PythonFunctionOptimizer parse_line_profile fix
- Skip files where our branch has Java-specific additions main doesn't
  have (create_pr, discover_unit_tests, parse_test_output, optimizer,
  verification_utils, config_parser, cmd_init, detector, support.py
  protocol methods)
2026-03-03 23:59:26 -05:00
Kevin Turcios
bccc02aade merge: incorporate omni-main-java sync work
Merges the omni-main-java branch which synced main into omni-java,
including JavaFunctionOptimizer, removal of is_java()/is_python() guards,
protocol dispatch for parse_test_xml, and deletion of concolic_testing.py.
2026-03-03 23:42:39 -05:00
Kevin Turcios
c550bf1561 fix: restore QName test to match omni-java's enrich_testgen_context behavior
The cherry-picked test fix from main assumed stdlib classes are extracted,
but omni-java's implementation still skips them.
2026-03-03 22:27:44 -05:00
Kevin Turcios
a0249afc7e fix: use PythonFunctionOptimizer in tests that depend on Python-specific hooks 2026-03-03 22:22:14 -05:00
Kevin Turcios
c11f52321e fix: correct pre-existing test failures in test_code_context_extractor
Fix 10 failing tests: remove wrong assertions expecting import statements
inside extracted class code, use substring matching for UserDict class
signature, and rewrite click-dependent tests as project-local equivalents.
Add tests for resolve_instance_class_name, enhanced extract_init_stub_from_class,
and enrich_testgen_context instance resolution.
2026-03-03 22:22:00 -05:00
mashraf-222
82e4627b03 Merge pull request #1740 from codeflash-ai/fix/java-comparator-and-errorprone
fix: suppress Error Prone CheckReturnValue in instrumented tests and fix stale pom dependency
2026-03-04 05:21:24 +02:00
misrasaurabh1
26f2c502fb feat: add inline runtime comments and unique _cf_mod/_cf_cls markers for Java tests
- Use instrumented class name in _cf_mod/_cf_cls markers to disambiguate
  existing vs generated tests sharing the same original class name
- Encode line number in invocation IDs (L{line}_{counter}) for deterministic
  call-site identification in inline runtime comments
- Rewrite add_runtime_comments() to annotate each call line with inline
  performance data instead of a summary block at top
- Strip assertions before instrumenting so both modes share the same base source
- Update test expected strings for new marker format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 18:58:20 -08:00
Kevin Turcios
ec73c2c692 revert: undo Vitest benchmarking and capture.js changes
Reverts ed2594bf which added --pool=forks to Vitest commands and
changed capture.js to use process.stdout.write and Vitest worker API.
These changes broke JS E2E tests (CJS, ESM, TS class) by altering
how all JS tests run, not just Vitest benchmarking.
2026-03-03 21:34:59 -05:00
Kevin Turcios
b2b8201541 fix: update test_get_helper_code expectation for blank line after imports 2026-03-03 21:10:35 -05:00
Kevin Turcios
fb6a4ab587 fix: update test expectations for extra blank line after imports
The CST comment position fix changed how blank lines are preserved
after import blocks, adding a PEP 8-compliant double blank line.
2026-03-03 20:58:32 -05:00
Sarthak Agarwal
f5d48841f0 fix mocha test runner 2026-03-03 20:50:50 -05:00
Sarthak Agarwal
4f2c65daec fix: strip CJS require('vitest') and require('@jest/globals') in Mocha tests
The AI backend generates vitest/jest-style imports for Mocha projects.
Our sanitize_mocha_imports() stripped ESM `import { ... } from 'vitest'`,
but process_generated_test_strings() runs BEFORE postprocessing and calls
ensure_module_system_compatibility() which converts these to CJS requires.
Result: `const { ... } = require('vitest')` survived sanitization.

Added regexes for the CJS variants of vitest and @jest/globals requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:49:04 -05:00
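Note: a rough sketch of what stripping the CJS variants could look like. The pattern and function name below are assumptions written for illustration. The regexes actually added to `sanitize_mocha_imports()` are not shown in this log and may differ.

```python
import re

# Hypothetical pattern: a destructuring require of vitest or @jest/globals
# occupying a whole line, with an optional trailing semicolon.
_CJS_FRAMEWORK_REQUIRE = re.compile(
    r"""^[ \t]*(?:const|let|var)\s+\{[^}]*\}\s*=\s*require\(\s*['"](?:vitest|@jest/globals)['"]\s*\)[ \t]*;?[ \t]*\r?\n?""",
    re.MULTILINE,
)


def strip_cjs_framework_requires(source: str) -> str:
    """Remove require('vitest') / require('@jest/globals') destructuring lines."""
    return _CJS_FRAMEWORK_REQUIRE.sub("", source)


generated = (
    "const { describe, it, expect } = require('vitest');\n"
    "const assert = require('node:assert/strict');\n"
    "const { jest } = require(\"@jest/globals\")\n"
    "describe('x', () => {});\n"
)
cleaned = strip_cjs_framework_requires(generated)
```

Only the framework requires are removed. Other `require` calls, such as the `node:assert/strict` line, are left in place.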
Sarthak Agarwal
74150c2a7f fix: resolve Vitest benchmarking showing wall-clock time instead of per-function timing
Root cause: Vitest performance tests reported "20.0 seconds over 1 loop"
(JUnit XML wall-clock fallback) instead of actual per-function nanosecond
timing. This was a chain of two issues:

1. **stdout interception**: Vitest's default `threads` pool intercepts
   process.stdout.write() and console.log(), preventing timing markers
   from flowing to the parent process. Fixed by adding `--pool=forks`
   to all Vitest commands and config files. The `forks` pool uses child
   processes where stdout flows directly to the parent.

2. **test name detection**: Even after markers flowed through (43,000+
   found in stdout), the parser couldn't match them to JUnit XML
   testcases because all markers had "unknown" as the test name. This
   happened because Vitest doesn't inject `beforeEach` as a global
   (unlike Jest), so capture.js's Jest-style hook to set
   `currentTestName` never fired.

   Fixed by adding Vitest-specific test name detection in capture.js:
   - Primary: `expect.getState().currentTestName` (full describe path)
   - Fallback: `__vitest_worker__.current.fullTestName`
   - Defense-in-depth: parser fallback matches "unknown" markers to
     the first testcase when no name match is found

Result: cheerio's `isHtml` went from "20.0s / 1 loop" to
"902μs / 20,853 loops" with proper speedup analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:48:59 -05:00
Sarthak Agarwal
b39f2a2e9b fix: support Mocha CJS projects and sanitize incorrect framework imports
Three related fixes for Mocha test generation in CommonJS projects:

1. inject_test_globals() now accepts module_system param — emits
   `require('node:assert/strict')` for CJS instead of ESM import syntax
2. ensure_module_system_compatibility() now converts ESM→CJS even when
   the source has mixed imports (was skipping when both ESM and CJS were
   detected, leaving the ESM import from inject_test_globals unconverted)
3. New sanitize_mocha_imports() strips vitest/jest/@jest/globals imports
   that the AI sometimes generates for Mocha projects — Mocha provides
   describe/it/before*/after* as globals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:48:53 -05:00
Sarthak Agarwal
15fcf4741d add mocha support 2026-03-03 20:48:39 -05:00
Sarthak Agarwal
e6b2f7975b feat: bundle JUnit XML reporter for Jest, replacing external jest-junit dependency
Ship a zero-dependency jest-reporter.js inside the codeflash runtime package
instead of requiring the external jest-junit npm package. This ensures the
reporter is always available when codeflash is installed, fixing Jest-based
projects (Strapi, Moleculer) that failed because jest-junit wasn't installed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:48:34 -05:00
Kevin Turcios
8f0ff84ecf fix: update tests for consolidated CST discovery behavior
Nested functions are now skipped by FunctionVisitor, and
discover_functions no longer swallows parse/IO errors — callers
handle them. Update test expectations accordingly.
2026-03-03 20:42:33 -05:00
Sarthak Agarwal
cb1ea5adc3 fix: use pattern matching for collocated tests in monorepos
When testsRoot overlaps moduleRoot (common in JS/TS monorepos like Ghost
where both point to "src"), the directory-based filter incorrectly
excluded ALL source files. Switch to filename/directory pattern matching
(*.test.*, *.spec.*, __tests__/) when roots overlap, preserving the
existing directory-based filter for standard layouts.
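
A minimal sketch of the pattern-based filter described above, assuming paths are POSIX-style strings. The names `looks_like_test_file` and `filter_source_files` are hypothetical and not taken from the codebase.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns named in the commit message; everything else here is illustrative.
TEST_FILE_PATTERNS = ("*.test.*", "*.spec.*")
TEST_DIR_NAME = "__tests__"


def looks_like_test_file(path: str) -> bool:
    """Return True when the path matches a collocated-test naming convention."""
    p = PurePosixPath(path)
    # Any parent directory called __tests__ marks the file as a test.
    if TEST_DIR_NAME in p.parts[:-1]:
        return True
    return any(fnmatchcase(p.name, pattern) for pattern in TEST_FILE_PATTERNS)


def filter_source_files(paths):
    """Keep only the files that do not look like tests."""
    return [p for p in paths if not looks_like_test_file(p)]
```

With overlapping roots, a filter of this shape drops `src/utils/date.test.ts` and `src/__tests__/date.ts` while keeping `src/utils/date.ts`.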

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:41:45 -05:00
Sarthak Agarwal
095b8b76d0 feat: add skip_confirm and skip_api_key params to JS init for non-interactive mode
Allow init_js_project(), should_modify_package_json_config(), and
collect_js_setup_info() to run without interactive prompts when
skip_confirm=True. Uses auto-detected defaults instead of prompting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:54 -05:00
Sarthak Agarwal
a14ef40d15 fix: raise clear error for unsupported JS test frameworks instead of silent fallback
Add NotImplementedError guard in all 3 test dispatchers (behavioral,
benchmarking, line-profile) for frameworks other than jest and vitest.
Previously, mocha and other frameworks silently fell through to Jest,
causing confusing failures. Now users get a clear error message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:49 -05:00
Sarthak Agarwal
800ebed837 feat: discover object methods exported via CJS module.exports = variable
Resolve `module.exports = varName` where varName is an object literal
containing methods. For patterns like `const utils = { match() {} };
module.exports = utils;`, the individual methods are now recognized as
exported. This fixes function discovery for CJS libraries like Moleculer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:06 -05:00
Sarthak Agarwal
b3d4e225e5 feat: discover const arrow functions exported via named export clauses
Post-process find_functions() to mark functions as exported when they appear
in named export clauses like `export { joinBy }`. This fixes discovery for
TypeScript codebases (e.g., Strapi) that define const arrow functions and
export them via a separate export statement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:01 -05:00
Kevin Turcios
a89dc4b09b test: add regression tests for comment position preservation 2026-03-03 20:36:56 -05:00
Kevin Turcios
b5fab57499 fix: preserve comment position by passing CST module directly to import adder
parse_code_and_prune_cst now returns cst.Module instead of str.
add_needed_imports_from_module accepts cst.Module | str, skipping re-parse
when a Module is passed. This eliminates the string round-trip that caused
comments to migrate from statement leading_lines to Module.header,
resulting in comments appearing above imports instead of at their
original position.
2026-03-03 20:36:52 -05:00
Kevin Turcios
4cf8d31deb perf: cache module scan in _clear_lru_caches and expand test coverage
Cache inspect.getmembers() results per module so repeated loop
iterations skip the expensive rescan. Add tests for get_runtime_from_stdout,
should_stop, _set_nodeid, _get_total_time, _timed_out, logreport, and
setup/teardown hooks.
2026-03-03 20:36:42 -05:00
Kevin Turcios
d2454a250a fix: handle __slots__-only objects in comparator
Objects with __slots__ but no __dict__ (e.g. textual.cache.LRUCache)
fell through all comparator branches, logging "Unknown comparator input
type" and returning False — causing spurious test mismatches.
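
One way such a branch could look, as a sketch only: collect slot names across the MRO and compare attribute by attribute. `slot_names`, `slots_equal`, and the `Point` class are hypothetical; the real comparator may recurse into values rather than using `==`.

```python
_MISSING = object()


def slot_names(obj):
    """Collect __slots__ entries from every class in the object's MRO."""
    names = []
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # __slots__ = "x" is legal shorthand
            slots = (slots,)
        names.extend(slots)
    return names


def slots_equal(a, b):
    """Compare two slotted objects by type and by each slot's value."""
    if type(a) is not type(b):
        return False
    return all(
        getattr(a, name, _MISSING) == getattr(b, name, _MISSING)
        for name in slot_names(a)
    )


class Point:
    """Example class with __slots__ and therefore no __dict__."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y
```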
2026-03-03 20:34:04 -05:00
aseembits93
ed7601206b fix: add types.UnionType support to comparator
The comparator did not recognize `types.UnionType` (Python 3.10+ `X | Y`
syntax), causing it to fall through to "Unknown comparator input type".
Conditionally include it in the equality-checked types tuple.
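
A small sketch of the conditional inclusion, assuming the comparator keeps a tuple of types that are compared with `==`. The tuple contents and the names `EQUALITY_TYPES` and `compared_by_equality` are illustrative.

```python
import types

# Illustrative base tuple; the real comparator's list is likely longer.
EQUALITY_TYPES = (int, float, str, bool, type(None))

# types.UnionType only exists on Python 3.10+, so add it when available.
_union_type = getattr(types, "UnionType", None)
if _union_type is not None:
    EQUALITY_TYPES += (_union_type,)


def compared_by_equality(value):
    """Return True when the value should be compared with a plain ==."""
    return isinstance(value, EQUALITY_TYPES)
```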

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:34:00 -05:00
aseembits93
706e33ed5a fix: support Python 3.10-3.14 in comparator itertools tests
Handle itertools.cycle on Python 3.14 where __reduce__ was removed by
falling back to element-by-element sampling. Add version guards for
pairwise (3.10+) and batched (3.12+) tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:56 -05:00
aseembits93
d7f24ad1ea fix: handle all remaining itertools types in comparator
Add a catch-all handler for itertools iterators (chain, islice, product,
permutations, combinations, starmap, accumulate, compress, dropwhile,
takewhile, filterfalse, zip_longest, groupby, pairwise, batched, tee).
Uses module check (type.__module__ == "itertools") so it automatically
covers any itertools type without version-specific enumeration. groupby
gets special handling to also materialize its group iterators.
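
The module check can be sketched as below. The bounded `islice` sampling and the `limit` parameter are assumptions added so the example terminates on infinite iterators; note that comparing this way consumes both iterators, and groupby's nested group iterators are not handled here.

```python
import itertools


def is_itertools_iterator(obj):
    """True for any iterator type defined in the itertools module."""
    return type(obj).__module__ == "itertools"


def itertools_equal(a, b, limit=1000):
    """Compare two itertools iterators by materializing up to `limit` items."""
    if type(a) is not type(b):
        return False
    return list(itertools.islice(a, limit)) == list(itertools.islice(b, limit))
```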

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:52 -05:00
aseembits93
b13d2e1623 fix: handle itertools.repeat and itertools.cycle in comparator
itertools.repeat uses repr() comparison (same approach as count).
itertools.cycle uses __reduce__() to extract internal state (saved items,
remaining items, and first-pass flag) since repr() only shows a memory
address. The __reduce__ approach is deprecated and scheduled for removal in
Python 3.14, but it is the only way to access cycle state without consuming
elements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:47 -05:00
aseembits93
3041b0580b fix: handle itertools.count in comparator
The comparator had no handler for itertools.count (an infinite iterator),
causing it to fall through all type checks and return False even for
equal objects. Use repr() comparison, which reliably reflects internal
state and does not rely on __reduce__ (scheduled for removal in Python 3.14).
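
A sketch of the repr()-based check, assuming it is applied only to `itertools.count` (and `itertools.repeat`, which an adjacent commit handles the same way). The function name is hypothetical.

```python
import itertools


def count_or_repeat_equal(a, b):
    """Compare count/repeat objects by repr(), which shows their current state."""
    supported = (itertools.count, itertools.repeat)
    if not isinstance(a, supported) or type(a) is not type(b):
        return False
    # e.g. repr(itertools.count(4)) is "count(4)", without consuming anything.
    return repr(a) == repr(b)
```

Because repr() reflects how far a counter has advanced, a `count(3)` that has been stepped once compares equal to a fresh `count(4)`.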

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:43 -05:00
Ubuntu
1b3fa91a5c feat: Java testgen class name fix, remove per-test @Timeout, and wire language_version
- Add class_name and qualified_name to /testgen API payload so the backend
  has explicit access to computed FunctionToOptimize properties
- Add client-side _fix_java_test_class_name() to correct wrong class name
  references in LLM-generated Java test code
- Remove per-test @Timeout annotation from Java instrumentation (causes
  timing instability on CI runners; Maven Surefire handles timeouts)
- Remove redundant default_language_version, use language_version as canonical

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:06:37 +00:00
Mohamed Ashraf
62aaab87ac fix: pre-install multi-module Maven deps to avoid recompilation failures
Multi-module Maven projects like Guava fail on sequential Maven invocations
because compiler plugin 3.15.0's JDK-8318913 workaround patches module-info.class
timestamps, triggering unnecessary recompilation with -am that fails on partial
reactor rebuilds. This pre-installs deps to .m2 once, then drops -am from all
subsequent test commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:37:32 +00:00
Mohamed Ashraf
b9135c5f6a fix: address PR review feedback for Error Prone and build_tools
- Remove redundant condition check in add_codeflash_dependency_to_pom
- Use lookahead-based regex to handle arbitrary XML element order in
  system-scope dependency replacement
- Broaden class declaration pattern to match final/abstract modifiers
- Add 7 unit tests for add_codeflash_dependency_to_pom including
  stale system-scope replacement and reordered XML elements
- Clarify comment about @SuppressWarnings in both modes
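
The lookahead idea can be sketched as follows. The artifact id `codeflash-runtime` and the helper name are assumptions for illustration; the pattern is not the one used in add_codeflash_dependency_to_pom.

```python
import re

ARTIFACT_ID = "codeflash-runtime"  # hypothetical artifact id

# Matches any run of characters that stays inside the current <dependency>.
_INSIDE = r"(?:(?!</dependency>).)*"

# Each lookahead checks for one child element, so their order does not matter.
STALE_DEP_RE = re.compile(
    r"<dependency>"
    rf"(?={_INSIDE}<artifactId>{re.escape(ARTIFACT_ID)}</artifactId>)"
    rf"(?={_INSIDE}<scope>system</scope>)"
    rf"{_INSIDE}</dependency>",
    re.DOTALL,
)


def has_stale_system_dependency(pom_text):
    """True when the pom contains the artifact declared with system scope."""
    return STALE_DEP_RE.search(pom_text) is not None
```

Two independent lookaheads avoid enumerating every possible ordering of `<artifactId>`, `<scope>`, and the other child elements.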

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:00:14 +00:00
misrasaurabh1
6ad0d8b92b fix how tests look when a PR is created 2026-03-03 14:47:25 -08:00
Sarthak Agarwal
bc5e3e878a fix mocha test runner 2026-03-04 03:42:10 +05:30
Mohamed Ashraf
2261e98953 fix: suppress Error Prone CheckReturnValue in instrumented tests and fix stale pom dependency
Add @SuppressWarnings("CheckReturnValue") to all generated instrumented test
classes. Projects using Error Prone (e.g. Guava) enforce CheckReturnValue as a
compiler error, which rejects our performance-only tests that intentionally
discard return values after assertion stripping.

Also fix add_codeflash_dependency_to_pom to detect and replace stale
system-scope dependencies left by previous runs with the correct test scope.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:57 +00:00
mashraf-222
dc7caa3151
Merge pull request #1651 from codeflash-ai/fix/java-comparator-vacuous-equivalence
fix: reject vacuous equivalence and deserialization error false matches in Java comparator
2026-03-03 22:15:21 +02:00
Mohamed Ashraf
e6ca15c36b test: update empty-table test to expect not-equivalent after vacuous guard
The test previously expected empty databases to return equivalent=True,
which was the exact bug being fixed. Updated to assert equivalent=False.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:40:06 +00:00
Mohamed Ashraf
cda56d1389 fix: handle JUnit 4 message-first assertEquals type inference
The type inference for assertEquals always used the first argument, but
JUnit 4's 3-arg overload is assertEquals(message, expected, actual).
When the first arg was a string message, the type was incorrectly inferred
as String instead of the actual expected value's type. Now detects the
message-first pattern and uses the second argument for type inference.
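
A minimal sketch of the message-first detection, assuming the assertion arguments have already been split into source-text strings. Both helper names are hypothetical. A string-literal check like this cannot distinguish JUnit 5's message-last form when the expected value is itself a string, so it is only a heuristic.

```python
def is_string_literal(expr):
    """True when the expression text is a double-quoted Java string literal."""
    text = expr.strip()
    return len(text) >= 2 and text.startswith('"') and text.endswith('"')


def arg_for_type_inference(args):
    """Pick the assertEquals argument whose type should drive inference."""
    # JUnit 4: assertEquals(message, expected, actual) puts the message first.
    if len(args) == 3 and is_string_literal(args[0]):
        return args[1]
    return args[0]
```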

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:29:39 +00:00
Mohamed Ashraf
ecd9267b3f fix: update assertion removal tests for type inference and fix ruff lint
Update 41 test expectations in test_java_assertion_removal.py to match
the return type inference behavior introduced in commit 9e5880f0. Tests
now expect inferred types (int, boolean, String, double) instead of
Object for _cf_result variables.

Fix 2 ruff PLR1714 lint issues in remove_asserts.py by using set
membership tests instead of chained or comparisons.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:53:38 +00:00
Mohamed Ashraf
44fa2f8e16 test: update instrumentation test for assertion type inference
The behavior mode instrumentation test expected `Object _cf_result1`
but after the type inference fix, assertEquals(4, call()) now produces
`int _cf_result1 = (int)_cf_result1_1`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:41:40 +00:00
Mohamed Ashraf
9e5880f032 fix: infer Java return types in assertion transformer instead of using Object
The assertion transformer always declared `Object _cf_resultN = call()` when
replacing assertions, losing the actual return type. This caused compilation
failures when the result was used in a context expecting a primitive type
(e.g., int, boolean).

Now infers the return type from assertion context:
- assertEquals(int_literal, call()) -> int
- assertTrue/assertFalse(call()) -> boolean
- assertEquals("string", call()) -> String
- Falls back to Object when type can't be determined
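
The rules listed above could be expressed roughly as below. This is a sketch over assertion names and literal text only; the function name and regexes are assumptions, and the real transformer presumably handles more literal forms.

```python
import re

INT_LITERAL = re.compile(r"-?\d+")
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"')


def infer_result_type(assert_name, expected=None):
    """Guess the Java type of the call under test from the assertion context."""
    if assert_name in {"assertTrue", "assertFalse"}:
        return "boolean"
    if assert_name == "assertEquals" and expected is not None:
        literal = expected.strip()
        if INT_LITERAL.fullmatch(literal):
            return "int"
        if STRING_LITERAL.fullmatch(literal):
            return "String"
    # Unknown context: keep the previous behaviour.
    return "Object"
```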

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:41:40 +00:00
Sarthak Agarwal
e8a4f96c0b fix: strip CJS require('vitest') and require('@jest/globals') in Mocha tests
The AI backend generates vitest/jest-style imports for Mocha projects.
Our sanitize_mocha_imports() stripped ESM `import { ... } from 'vitest'`,
but process_generated_test_strings() runs BEFORE postprocessing and calls
ensure_module_system_compatibility() which converts these to CJS requires.
Result: `const { ... } = require('vitest')` survived sanitization.

Added regexes for the CJS variants of vitest and @jest/globals requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 16:01:55 +05:30
Sarthak Agarwal
ed2594bfd1 fix: resolve Vitest benchmarking showing wall-clock time instead of per-function timing
Root cause: Vitest performance tests reported "20.0 seconds over 1 loop"
(JUnit XML wall-clock fallback) instead of actual per-function nanosecond
timing. This was a chain of two issues:

1. **stdout interception**: Vitest's default `threads` pool intercepts
   process.stdout.write() and console.log(), preventing timing markers
   from flowing to the parent process. Fixed by adding `--pool=forks`
   to all Vitest commands and config files. The `forks` pool uses child
   processes where stdout flows directly to the parent.

2. **test name detection**: Even after markers flowed through (43,000+
   found in stdout), the parser couldn't match them to JUnit XML
   testcases because all markers had "unknown" as the test name. This
   happened because Vitest doesn't inject `beforeEach` as a global
   (unlike Jest), so capture.js's Jest-style hook to set
   `currentTestName` never fired.

   Fixed by adding Vitest-specific test name detection in capture.js:
   - Primary: `expect.getState().currentTestName` (full describe path)
   - Fallback: `__vitest_worker__.current.fullTestName`
   - Defense-in-depth: parser fallback matches "unknown" markers to
     the first testcase when no name match is found

Result: cheerio's `isHtml` went from "20.0s / 1 loop" to
"902μs / 20,853 loops" with proper speedup analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:31:10 +05:30
Sarthak Agarwal
a78b4cc299 fix: support Mocha CJS projects and sanitize incorrect framework imports
Three related fixes for Mocha test generation in CommonJS projects:

1. inject_test_globals() now accepts module_system param — emits
   `require('node:assert/strict')` for CJS instead of ESM import syntax
2. ensure_module_system_compatibility() now converts ESM→CJS even when
   the source has mixed imports (was skipping when both ESM and CJS were
   detected, leaving the ESM import from inject_test_globals unconverted)
3. New sanitize_mocha_imports() strips vitest/jest/@jest/globals imports
   that the AI sometimes generates for Mocha projects — Mocha provides
   describe/it/before*/after* as globals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:30:22 +05:30
Sarthak Agarwal
f53272c03b fix: use pattern matching for collocated tests in monorepos
When testsRoot overlaps moduleRoot (common in JS/TS monorepos like Ghost
where both point to "src"), the directory-based filter incorrectly
excluded ALL source files. Switch to filename/directory pattern matching
(*.test.*, *.spec.*, __tests__/) when roots overlap, preserving the
existing directory-based filter for standard layouts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:29:05 +05:30
misrasaurabh1
c111635497 fix(java): fix lint error and broken mock in test_filter_validation
- Replace try/except/pass with contextlib.suppress() (ruff SIM105)
- Fix test_run_maven_tests_succeeds_with_valid_filter to mock
  _run_cmd_kill_pg_on_timeout instead of subprocess.run; on Linux
  the function uses Popen not run, so the old mock was never called

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:34:30 +00:00
misrasaurabh1
814ce5388c fix(java): add Windows guard for process-group kill and remove slow timeout tests
- Add sys.platform == "win32" check in _run_cmd_kill_pg_on_timeout so
  Windows machines fall back to plain subprocess.run() (Windows has no
  POSIX process groups / killpg)
- Remove TestRunCmdKillPgOnTimeout test class (5 tests using sleep 60
  commands were adding significant time to the test suite)

Follow-up to the SQLite-locked-error fix merged in #1728.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:34:30 +00:00
misrasaurabh1
7d45efe401 fix(java): exclude apidocs and javadoc directories from file discovery
When running with --all on a Java project, codeflash was discovering .js
files inside apidocs/ and javadoc/ directories (generated Javadoc HTML)
and attempting to optimize them as JavaScript. This caused:
- "Invalid test framework for JavaScript/TypeScript" errors
- Wasted API calls for ~30+ functions from jquery-3.7.1.min.js
- Spurious "NO TESTS GENERATED" warnings for minified jQuery functions

Fix: add "apidocs" and "javadoc" to Java's dir_excludes. Because the
--all mode unions dir_excludes from all languages, these directories are
now skipped in both Java-specific and --all discovery modes.

Adds 5 tests verifying the exclusion works for Java mode and --all mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:24:27 +00:00
misrasaurabh1
16d314bf90 revert: remove WAL/busy_timeout defense-in-depth SQLite changes
The root cause of 'database is locked' is orphaned Surefire JVM processes
after Maven timeout.  The actual fix is killing the entire process group
(_run_cmd_kill_pg_on_timeout in test_runner.py).

The WAL mode / busy_timeout / sqlite3.connect(timeout=30) changes were
treating the symptom rather than the root cause.  Revert them:

- codeflash/languages/java/instrumentation.py: remove PRAGMA journal_mode=WAL
  and PRAGMA busy_timeout=30000 from inline SQLite write code
- codeflash/verification/parse_test_output.py: revert timeout=30 to default
- codeflash/languages/java/resources/CodeflashHelper.java: revert WAL/busy_timeout PRAGMAs
- codeflash-java-runtime/src/main/java/com/codeflash/Comparator.java: revert busy_timeout PRAGMA
- codeflash-java-runtime/src/main/java/com/codeflash/ResultWriter.java: revert WAL/busy_timeout PRAGMAs
- codeflash/languages/java/resources/codeflash-runtime-1.0.0.jar: restored to pre-change JAR
- tests/test_languages/test_java/test_instrumentation.py: remove TestSQLiteLockedFix
  class and revert snapshot strings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:43:20 +00:00
misrasaurabh1
de0fb91cd3 fix(java): kill entire process group on Maven timeout to prevent orphaned JVMs
Root cause of 'database is locked' errors:
- When Maven times out, subprocess.run() only kills the Maven parent process
- On Linux, Maven's forked Surefire JVM children become orphaned (not killed)
- Orphaned JVMs keep the SQLite result file open, causing SQLITE_BUSY when
  Python reads the file immediately after Maven is killed

Fix: Replace subprocess.run() with _run_cmd_kill_pg_on_timeout() which uses
start_new_session=True + os.killpg() to kill the entire process group on
timeout, ensuring no orphaned JVMs are left behind.

Applied to: _compile_tests, _get_test_classpath, _run_tests_direct,
and _run_maven_tests (the main one).

Also adds 5 unit tests verifying:
- Successful commands return correct output
- Failing commands propagate returncode
- Child processes are killed (not orphaned) on timeout
- returncode is -2 on timeout
- Timeout is described in stderr
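
A POSIX-only sketch of the process-group approach described above. The function name here is hypothetical and the real _run_cmd_kill_pg_on_timeout may differ in signature and details; the -2 return code and the stderr note follow the behaviour listed in this message.

```python
import os
import signal
import subprocess
import sys


def run_kill_pg_on_timeout(cmd, timeout):
    """Run cmd in its own session and kill the whole group if it times out."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        out, err = proc.communicate(timeout=timeout)
        return subprocess.CompletedProcess(cmd, proc.returncode, out, err)
    except subprocess.TimeoutExpired:
        try:
            # The group id equals the child's pid because it called setsid().
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group already exited
        out, err = proc.communicate()
        note = f"\nTimed out after {timeout}s; process group killed"
        return subprocess.CompletedProcess(cmd, -2, out, (err or "") + note)
```

Killing the group rather than the single child is what reaches forked grandchildren such as Surefire JVMs.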

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:24:56 +00:00
misrasaurabh1
1fdbd2c04b fix(java): resolve SQLite 'database is locked' errors across the pipeline
Root cause: The instrumented test JVM holds a SQLite connection open while
writing results.  The Python reader and the Java Comparator were trying to
read the same file without a busy_timeout, causing immediate SQLITE_BUSY
failures (~126 occurrences in codeflash_all_3.log).

Fixes applied:

1. instrumentation.py (_generate_sqlite_write_code):
   Emit PRAGMA journal_mode=WAL and PRAGMA busy_timeout=30000 right after
   each inline connection open.  WAL mode lets readers see the last committed
   state while a writer is active; busy_timeout makes lock collisions retry
   instead of immediately failing.

2. parse_test_output.py (parse_sqlite_test_results):
   Add timeout=30 to sqlite3.connect() so Python waits up to 30 s for a
   transient lock to clear (default was 5 s, which was too short for a busy
   Maven/JVM process).

3. Comparator.java (readTestResults):
   Execute PRAGMA busy_timeout=30000 on the same connection before running
   the SELECT, so the Java Comparator also retries instead of failing with
   [SQLITE_BUSY].

4. CodeflashHelper.java (initializeDatabase) and ResultWriter.java (constructor):
   Same WAL + busy_timeout PRAGMAs added after the initial getConnection() call
   for the long-lived database connections used by these helper classes.

5. Updated codeflash-runtime-1.0.0.jar (rebuilt after Comparator/ResultWriter fix).

tests: add TestSQLiteLockedFix with two assertions —
  • _generate_sqlite_write_code emits PRAGMA journal_mode=WAL and
    PRAGMA busy_timeout=30000 before CREATE TABLE
  • parse_sqlite_test_results uses timeout= in sqlite3.connect()
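
For the Python side, the settings described above amount to something like the following sketch. `open_results_db` is a hypothetical helper, not the code in parse_test_output.py.

```python
import sqlite3


def open_results_db(path, timeout_s=30.0):
    """Open a SQLite file with WAL journaling and a generous busy timeout."""
    # timeout= makes Python wait for locks instead of failing immediately.
    conn = sqlite3.connect(path, timeout=timeout_s)
    # WAL lets readers see the last committed state while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    # busy_timeout is in milliseconds.
    conn.execute(f"PRAGMA busy_timeout={int(timeout_s * 1000)}")
    return conn
```

These settings only make lock collisions wait and retry; they do not help if another process holds the file open indefinitely.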

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:04:38 +00:00
Kevin Turcios
881b4e9b35 fix: resolve merge conflicts and fix test failures
- Fix MockTestConfig missing tests_project_rootdir field
- Fix Python line profiler existence check on wrong path (no .lprof suffix)
- Add --release 11 to javac in Java line profiler tests for JDK compat
- Resolve merge conflicts with omni-java (replacement, tests)
- Add replace_function_definitions method to JavaSupport
- Guard against wrong method names in optimized code (standalone + class)
- Add tests for anonymous inner class method hoisting
2026-03-02 19:48:06 -05:00
misrasaurabh1
aae13a8e69 feat: skip inner-class methods in Java discovery; revert replacement-level inner-class workarounds
- parser.py: add `is_class_nested` flag to `JavaMethodNode`; track
  `class_depth` in `_walk_tree_for_methods` (incremented each time a
  type declaration is entered) and set `is_class_nested = True` when
  depth ≥ 2 (method lives inside a nested/inner class)

- discovery.py: add early-exit in `_should_include_method` when
  `method.is_class_nested` is True — inner-class methods cannot be
  reliably instrumented or tested in isolation, so we skip them up-front
  rather than wasting LLM tokens on candidates that will always be
  rejected later

- replacement.py: revert Bug-4 replacement-level workarounds that are
  now obsolete:
  * remove `target_class_name` parameter from `_parse_optimization_source`
  * restore simple first-match `break` in target-method selection
  * remove class_name filter that blocked helpers from "other" classes

- tests: update `TestNestedClasses`, `TestExtractCodeContextWithInnerClasses`
  to reflect the new no-inner-class-discovery contract; remove
  `TestInnerClassHelperFilter` (superseded by discovery filter);
  add `TestInnerClassMethodFilter` in test_discovery.py with four
  scenarios covering static nested, non-static inner, outer-only, and
  deeply-nested classes
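
The depth-tracking rule can be illustrated on a toy tree of dicts. The node shape and `find_methods` are hypothetical stand-ins for the tree-sitter nodes that `_walk_tree_for_methods` actually visits.

```python
# Node types treated as type declarations in this toy model.
CLASS_TYPES = {"class_declaration", "interface_declaration", "enum_declaration"}


def find_methods(node, class_depth=0, out=None):
    """Return (method_name, is_class_nested) pairs for every method in the tree."""
    if out is None:
        out = []
    if node["type"] in CLASS_TYPES:
        class_depth += 1  # entering a type declaration
    elif node["type"] == "method_declaration":
        # Depth 1 is a top-level class; depth >= 2 means nested or inner.
        out.append((node["name"], class_depth >= 2))
    for child in node.get("children", ()):
        find_methods(child, class_depth, out)
    return out
```

A discovery filter can then skip every pair whose second element is True.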

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:47:01 +00:00
Kevin Turcios
dc25d12e3c one more 2026-03-02 19:38:11 -05:00
Kevin Turcios
fe3e2f60c0 11 as per workflow 2026-03-02 19:25:30 -05:00
Kevin Turcios
f79cd9342a release 2026-03-02 19:20:44 -05:00
misrasaurabh1
05a9b61478 fix: replace modified constructors when LLM adds new final fields (Java)
When the LLM optimises a method by introducing a new final field (e.g.
caching Arrays.hashCode in Expression.hashCode, or caching map.values() in
LuaMap.valuesIterator), it also modifies the class constructors to initialise
the field.  Previously codeflash:

  1. Added the new field to the class ✓
  2. Replaced the target method ✓
  3. Did NOT update the constructors ✗

This caused "variable X might not have been initialized" compilation errors.

Changes:
- `JavaAnalyzer.find_constructors` (+ `_walk_tree_for_constructors`,
  `_extract_constructor_info`): new parser methods to locate
  `constructor_declaration` nodes via tree-sitter.
- `JavaMethodNode.formal_parameters_text`: captures the raw parameter list
  text so constructors can be matched by signature.
- `ParsedOptimization.modified_constructors`: new field to carry constructor
  source texts that need to be replaced.
- `_parse_optimization_source`: extract constructors from the same class as
  the target method and store in `modified_constructors`.
- `_replace_constructors`: new helper that replaces constructors in the
  original source by matching on formal parameter signature.
- `replace_function`: call `_replace_constructors` after the main method
  replacement when `modified_constructors` is non-empty.

Fixes regressions observed in codeflash_all_3.log:
  LuaMap.valuesIterator, Expression.hashCode, Bin.hashCode,
  NettyTlsContext.createHandler, Pool.capacity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:03:38 +00:00
Kevin Turcios
626be4d337 use the class parser directly 2026-03-02 19:00:42 -05:00
misrasaurabh1
a880ffc48a fix: skip outer-class methods when target is in a static inner class (Java)
When the optimisation target lives in a static inner class (e.g.
ObjectUnpacker inside Unpacker<T>), the LLM-generated class often wraps the
inner class inside the full outer class.  Previously, methods belonging to the
outer class were extracted as "helpers" and injected into the inner class,
causing compilation errors:

  - "non-static type variable T cannot be referenced from a static context"
  - "non-static variable offset cannot be referenced from a static context"

Two related fixes:

1. When _parse_optimization_source extracts helpers, it now skips any method
   whose class_name differs from the target method's class_name.

2. The function now accepts an optional target_class_name parameter.  When
   there are multiple methods with the same name in the generated code (e.g.
   an abstract outer-class method and the concrete inner-class override), the
   method in the target class is preferred over outer-class methods.

Fixes the Unpacker.ObjectUnpacker.getString regression from codeflash_all_3.log.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:59:31 +00:00
misrasaurabh1
cd02dec79f test: use full string equality in anonymous iterator test
Replace substring-based assertions with a single exact string
comparison in test_anonymous_iterator_methods_not_hoisted_to_class,
matching the convention used elsewhere in the test file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:38:42 +00:00
Kevin Turcios
e67cdb952e fix paths 2026-03-02 17:33:11 -05:00
Kevin Turcios
399441edd2 one more 2026-03-02 17:08:41 -05:00
Kevin Turcios
130953aab1 one more 2026-03-02 17:00:16 -05:00
Kevin Turcios
2a2b42194e fix: resolve remaining Java test failures
- Fix language detection in code_replacer to use lang_support.language
  (was None when function_to_optimize absent, blocking Java class member insertion)
- Update discover_functions calls in test_integration.py to pass source param
- Remove inner_iterations kwarg from test_run_and_parse.py (handled internally)
2026-03-02 16:33:53 -05:00
Kevin Turcios
67b422ffd3 fix: add @property to JavaSupport.function_optimizer_class, prek format fixes 2026-03-02 16:24:10 -05:00
Kevin Turcios
754727c8f2 fix: resolve e2e failures (path, pass_fail_only, Java context, codeflash.toml)
- Use os.path.relpath for main.py path in e2e tests
- Remove pass_fail_only kwarg from JS/Java function optimizers
- Fix Java e2e test to use JavaFunctionOptimizer for code context
- Detect codeflash.toml in e2e test runner (not just pyproject.toml)
2026-03-02 16:08:21 -05:00
Kevin Turcios
61bcc37449 Update test_java_e2e.py 2026-03-02 16:04:03 -05:00
Kevin Turcios
83831ac25e fix: resolve e2e test path and pass_fail_only issues
- Use os.path.relpath for main.py path (works for any cwd depth)
- Remove pass_fail_only kwarg from JS/Java compare_test_results fallback
  (main removed this parameter from equivalence.compare_test_results)
2026-03-02 16:01:46 -05:00
Kevin Turcios
f7fd593de3 fix: resolve remaining test failures after main sync
- Fix min/max_outer_loops → pytest_min/max_loops in Java test_run_and_parse
- Update test_replacement.py for new replace_function_definitions_for_language API
- Update JavaSupport.discover_functions signature to match protocol
- Migrate _get_java_sources_root/_fix_java_test_paths to JavaFunctionOptimizer
- Fix test_java_tests_project_rootdir to use set_current_language
2026-03-02 15:47:23 -05:00
Kevin Turcios
e7687f2448 fix: rename min/max_outer_loops to pytest_min/max_loops and add Java cleanup patterns
Omni-java tests used renamed params that don't match main's API.
Also adds Java instrumented file patterns to leftover cleanup regex.
2026-03-02 15:40:26 -05:00
Kevin Turcios
a14bd09fdc fix: update Java tests for protocol-dispatch refactoring
Updates canary test to check JavaFunctionOptimizer instead of base
function_optimizer (comparison logic moved to subclass). Renames
min/max_outer_loops back to pytest_min/max_loops to match main's API.
2026-03-02 15:36:40 -05:00
Kevin Turcios
f37b37209c fix: update Java test_replacement import for moved code_replacer module 2026-03-02 15:32:57 -05:00
Kevin Turcios
bd3ec8f09d test: sync dual-changed test files from main with omni-java fixes
Updates inject_profiling_into_existing_test calls to include test_string
parameter. Takes main's test refactoring for multi-file code replacement
and codeflash capture.
2026-03-02 15:30:16 -05:00
Kevin Turcios
19bd6e4bad test: sync test files from main (safe, main-only changes)
34 test files updated with main's refactored tests for new language
support protocol, JS/TS improvements, and code context extraction.
2026-03-02 15:25:50 -05:00
Sarthak Agarwal
c53740df2e
Merge branch 'main' into fix/jest-junit-and-misc 2026-03-02 22:46:13 +05:30
Sarthak Agarwal
d5dba8ce71 add mocha support 2026-03-02 22:44:48 +05:30
Sarthak Agarwal
b0afe2ef9c feat: add skip_confirm and skip_api_key params to JS init for non-interactive mode
Allow init_js_project(), should_modify_package_json_config(), and
collect_js_setup_info() to run without interactive prompts when
skip_confirm=True. Uses auto-detected defaults instead of prompting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:03:46 +05:30
Sarthak Agarwal
0490b221f2 fix: raise clear error for unsupported JS test frameworks instead of silent fallback
Add NotImplementedError guard in all 3 test dispatchers (behavioral,
benchmarking, line-profile) for frameworks other than jest and vitest.
Previously, mocha and other frameworks silently fell through to Jest,
causing confusing failures. Now users get a clear error message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:03:13 +05:30
Sarthak Agarwal
4bf8cb7412 feat: discover object methods exported via CJS module.exports = variable
Resolve `module.exports = varName` where varName is an object literal
containing methods. For patterns like `const utils = { match() {} };
module.exports = utils;`, the individual methods are now recognized as
exported. This fixes function discovery for CJS libraries like Moleculer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:01:56 +05:30
Sarthak Agarwal
76d994e87c feat: discover const arrow functions exported via named export clauses
Post-process find_functions() to mark functions as exported when they appear
in named export clauses like `export { joinBy }`. This fixes discovery for
TypeScript codebases (e.g., Strapi) that define const arrow functions and
export them via a separate export statement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:00:33 +05:30
Sarthak Agarwal
3c2a2b3694 feat: bundle JUnit XML reporter for Jest, replacing external jest-junit dependency
Ship a zero-dependency jest-reporter.js inside the codeflash runtime package
instead of requiring the external jest-junit npm package. This ensures the
reporter is always available when codeflash is installed, fixing Jest-based
projects (Strapi, Moleculer) that failed because jest-junit wasn't installed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:59:41 +05:30
Kevin Turcios
04a94f2b03 test: update tests for refactored language support
- Update discover_functions calls to new (source, file_path) signature
- Use language-specific FunctionOptimizer subclasses in tests
- Add explicit utf-8 encoding to read_text()/write_text() for Windows
- Fix pytest fixture in TestTsJestSkipsConversion (was __init__)
- Update nonexistent file tests for source-based discover_functions
- Remove unused imports
2026-03-02 06:09:06 -05:00
Kevin Turcios
b6af185998 fix: split discovery and instrumentation log messages for E2E harnesses
Log "Discovered N existing unit test files" after counting tests, and
"Instrumented N existing unit test files" after injecting profiling.
Python E2E harness matches "Discovered", JS harness matches "Instrumented".
2026-03-02 02:16:50 -05:00
Kevin Turcios
6a916ac83f fix: address review feedback on PythonFunctionOptimizer extraction
- Add clarifying comment on shared replace_function_definitions_in_module import
- Remove misleading alias in test_unused_helper_revert.py, use PythonFunctionOptimizer directly
- Align base line_profiler_step return type to dict[str, Any]
- Fix latent bug: handle non-empty TestResults in line_profiler_step
2026-03-01 23:51:43 -05:00
Kevin Turcios
a55841b978 fix: use PythonFunctionOptimizer in tests that depend on Python-specific hooks 2026-03-01 23:22:19 -05:00
misrasaurabh1
2371540386 fix: guard Java replacement against wrong-method-name candidates and anonymous-class method hoisting
Two bugs in _parse_optimization_source (replacement.py) caused Maven compilation
failures when codeflash optimised aerospike-client-java:

Bug 1 – standalone method with wrong name replaces target
When the LLM generated a standalone method whose name did not match the
optimisation target (e.g. generated `unpackMap` for target `unpackObjectMap`,
or generated `sizeTxn` for target `estimateKeySize`), the function fell back to
using the entire generated snippet as `target_method_source`.  This silently
replaced the target with the wrong method, producing:
  • a duplicate definition of the wrong method
  • removal of the target method (breaking all callers)

Fix: after parsing standalone (class-free) code, verify that at least one
discovered method matches the target name.  If no match is found, set
`target_method_source` to the empty string and log a warning.  A corresponding
guard in `replace_function` returns the original source unchanged when
`target_method_source` is empty.

The same guard is applied to the full-class path: if the generated class does
not contain the target method, the candidate is also rejected.

Bug 2 – anonymous inner-class methods hoisted as top-level helpers
When an optimised method returned an anonymous class (e.g. `keySetIterator`
returning `new Iterator<LuaValue>() { … }`), tree-sitter's recursive walk
found the anonymous class's `hasNext`, `next`, and `remove` method_declaration
nodes and classified them as helpers to be inserted at the outer-class level.
The inserted methods carried `@Override` annotations that matched nothing in the
outer class and referenced local variables (`it`) that were only in scope inside
the optimised method body, producing compilation errors.

Fix: when extracting helpers from the optimised class, skip any method whose
line range is entirely contained within the target method's line range.  Such
methods belong to anonymous/nested classes inside the method body and must not
be hoisted out as standalone class members.

Tests added for both bugs in TestWrongMethodNameGeneration and
TestAnonymousInnerClassMethods.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 03:35:42 +00:00
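The Bug 2 fix in 2371540386 (skip helpers whose line range lies inside the target method) can be sketched as follows. This is a simplified model, not the code in replacement.py: `MethodSpan` and `hoistable_helpers` are hypothetical names, and line ranges are assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodSpan:
    """Hypothetical record for a method found by the parser."""
    name: str
    start_line: int
    end_line: int


def hoistable_helpers(methods, target):
    """Return methods that may be inserted as class-level helpers.

    Anything fully contained in the target's line range belongs to an
    anonymous or nested class inside the method body and is skipped.
    The target itself is contained in its own range, so it is skipped too.
    """
    return [
        m for m in methods
        if not (target.start_line <= m.start_line and m.end_line <= target.end_line)
    ]
```

For the `keySetIterator` example in the commit, `hasNext`, `next`, and `remove` would fall inside the target's range and stay where they are, while a genuinely separate helper elsewhere in the class would still be returned.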
Kevin Turcios
64de536471 fix: restore base replace_function_and_helpers, fix test imports, move ast to TYPE_CHECKING
- Base class keeps the language-routing replacement logic (used by both
  Python and JS); Python subclass adds unused-helper revert on top via super()
- Tests that exercise Python-specific replace+revert use PythonFunctionOptimizer
- Move `ast` to TYPE_CHECKING in optimizer.py (fixes prek)
2026-03-01 21:30:19 -05:00
misrasaurabh1
e1fb4b81e8 fix runtime calculation for Java 2026-02-28 21:35:45 -08:00
Saurabh Misra
86202d40e5 Merge pull request #1690 from codeflash-ai/fix/comparator-itertools-count
fix: handle itertools types in comparator with Python 3.9-3.14 support
2026-02-27 15:21:59 -08:00
aseembits93
3a33fe43a4 fix: support Python 3.10-3.14 in comparator itertools tests
Handle itertools.cycle on Python 3.14 where __reduce__ was removed by
falling back to element-by-element sampling. Add version guards for
pairwise (3.10+) and batched (3.12+) tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:05:36 +05:30
aseembits93
eeda6c2d32 fix: handle all remaining itertools types in comparator
Add a catch-all handler for itertools iterators (chain, islice, product,
permutations, combinations, starmap, accumulate, compress, dropwhile,
takewhile, filterfalse, zip_longest, groupby, pairwise, batched, tee).
Uses module check (type.__module__ == "itertools") so it automatically
covers any itertools type without version-specific enumeration. groupby
gets special handling to also materialize its group iterators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:14:59 +05:30
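The module check from eeda6c2d32 can be illustrated with a small sketch. The function names and the `limit` parameter are assumptions for the example; the commit does not say how elements are compared, so this version simply materializes a bounded prefix of each iterator (which consumes them) and does not reproduce the special handling of groupby.

```python
import itertools


def is_itertools_iterator(obj) -> bool:
    """True for any iterator type defined in the itertools module."""
    return type(obj).__module__ == "itertools"


def itertools_equal(a, b, limit: int = 1000) -> bool:
    """Compare two itertools iterators by a bounded prefix of their items.

    Note: this consumes up to `limit` items from both iterators.
    """
    if not (is_itertools_iterator(a) and is_itertools_iterator(b)):
        return False
    if type(a) is not type(b):
        return False
    return list(itertools.islice(a, limit)) == list(itertools.islice(b, limit))
```

Because the check keys on `type(obj).__module__`, newer types such as `pairwise` or `batched` are covered on the Python versions that provide them without listing them by name.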
aseembits93
456a18837b fix: handle itertools.repeat and itertools.cycle in comparator
itertools.repeat uses repr() comparison (same approach as count).
itertools.cycle uses __reduce__() to extract internal state (saved items,
remaining items, and first-pass flag) since repr() only shows a memory
address. __reduce__ on itertools objects is deprecated since Python 3.12
(removal in 3.14), but it is the only way to access cycle state without
consuming elements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:06:46 +05:30
aseembits93
12e8f7009c fix: handle itertools.count in comparator
The comparator had no handler for itertools.count (an infinite iterator),
causing it to fall through all type checks and return False even for
equal objects. Use repr() comparison which reliably reflects internal
state and avoids __reduce__, which is deprecated and removed in Python 3.14.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:01:06 +05:30
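The repr()-based comparison from 12e8f7009c works because the repr of an `itertools.count` shows its current value and step. A minimal sketch, with `counts_equal` as a hypothetical helper name:

```python
import itertools


def counts_equal(a, b) -> bool:
    """Compare two itertools.count objects without consuming them."""
    if not (isinstance(a, itertools.count) and isinstance(b, itertools.count)):
        return False
    # repr() looks like "count(5)" or "count(5, 2)" and tracks the current value.
    return repr(a) == repr(b)
```

Advancing one of two equal counters with `next()` changes its repr, so the comparison reflects iterator position as well as the starting arguments.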
Kevin Turcios
91cf6ea2aa fix: update tests for consolidated CST discovery behavior
Nested functions are now skipped by FunctionVisitor, and
discover_functions no longer swallows parse/IO errors — callers
handle them. Update test expectations accordingly.
2026-02-27 12:08:42 -05:00
Kevin Turcios
94755489e7 should be tests 2026-02-27 09:23:11 -05:00
Kevin Turcios
5cee1b5b48 feat: improve test generation context for external library types
Extend extract_parameter_type_constructors to scan function bodies for
isinstance/type() patterns and collect base class names from enclosing
classes. Add one-level transitive stub extraction so the LLM also sees
constructor signatures for types referenced in __init__ parameters.

In enrich_testgen_context, branch on source: project classes get full
definitions, third-party (site-packages) classes get compact __init__
stubs to avoid blowing token limits.
2026-02-26 09:40:04 -05:00
HeshamHM28
9b6d645ab6 fix: update profiling logic and improve loop index handling in tests 2026-02-25 07:43:14 +02:00
HeshamHM28
24d38b6fae Merge branch 'omni-java' into fix/java/line-profiler 2026-02-25 07:09:00 +02:00
aseembits93
5c829cd4de test: compare final_content to complete expected output string
Replace substring assertions with exact equality check against the full
expected output (EXPECTED_OUTPUT constant). Extract shared setup into a
run_replacement helper.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:16:59 +05:30
aseembits93
14bdc3c1cf fix: detect attribute-referenced methods as used in unused helper detection
detect_unused_helper_functions only walked ast.Call nodes, missing methods
referenced via attribute assignment (e.g., self._parse1 = self._parse_literal).
This caused optimized helper methods used as callbacks to be incorrectly
reverted to their original code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:09:51 +05:30
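The gap described in 14bdc3c1cf can be reproduced with the standard `ast` module. The two functions below are illustrative stand-ins, not the body of detect_unused_helper_functions: one only looks at call sites, the other also counts names that are read without being called.

```python
import ast


def called_names(source: str) -> set:
    """Names that appear as the callee of an ast.Call node."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Attribute):
                names.add(func.attr)
            elif isinstance(func, ast.Name):
                names.add(func.id)
    return names


def referenced_names(source: str) -> set:
    """Names that are read anywhere, whether called or merely referenced."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and isinstance(node.ctx, ast.Load):
            names.add(node.attr)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            names.add(node.id)
    return names
```

For `self._parse1 = self._parse_literal`, the call-only walker never sees `_parse_literal`, while the reference walker does. The assignment target `_parse1` is in store context and is not counted as a use.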
Kevin Turcios
f747b66252 missed there 2026-02-23 05:52:43 -05:00
Kevin Turcios
94a99a980f fix failing test 2026-02-23 05:43:16 -05:00
Kevin Turcios
d8582c328a fix: handle __slots__-only objects in comparator
Objects with __slots__ but no __dict__ (e.g. textual.cache.LRUCache)
fell through all comparator branches, logging "Unknown comparator input
type" and returning False — causing spurious test mismatches.
2026-02-23 05:29:40 -05:00
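One way to compare objects that have `__slots__` but no `__dict__`, as in d8582c328a, is to collect slot names along the MRO and compare the values. This is a sketch under that assumption; `slot_names` and `slots_equal` are hypothetical names, and name-mangled private slots are not handled.

```python
_MISSING = object()  # marks a slot that was declared but never assigned


def slot_names(obj) -> list:
    """Collect slot names declared by every class in the object's MRO."""
    names = []
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # a single slot may be given as a bare string
            slots = (slots,)
        names.extend(slots)
    return names


def slots_equal(a, b) -> bool:
    """Compare two slots-only objects of the same type attribute by attribute."""
    if type(a) is not type(b) or hasattr(a, "__dict__"):
        return False
    return all(
        getattr(a, name, _MISSING) == getattr(b, name, _MISSING)
        for name in slot_names(a)
    )
```

Walking the MRO matters because each class in a hierarchy declares only its own slots.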
misrasaurabh1
62db2360b5 fix: use correct iteration_id format for Java performance mode
Changed iteration_id in performance mode markers to properly encode
inner loop iterations for test case grouping:

- Single call: iteration_id = innerIteration (0, 1, 2...)
- Multiple calls: iteration_id = callId_innerIteration (1_0, 1_1, 2_0, 2_1...)

This allows test results to be properly grouped by InvocationId, where
each unique (call, inner_iteration) pair gets its own group for
calculating minimum runtimes across outer loops.

Fixed test expectations to match the new format.

All 43 Java performance tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:42:26 -08:00
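The iteration_id format from 62db2360b5, written out as a small Python sketch. The function name and the `total_calls` parameter are assumptions; the actual markers are emitted by instrumented Java code.

```python
def make_iteration_id(call_id: int, inner_iteration: int, total_calls: int) -> str:
    """Build the iteration_id used to group performance results.

    Single call site:    "0", "1", "2", ...
    Multiple call sites: "1_0", "1_1", "2_0", "2_1", ...
    """
    if total_calls == 1:
        return str(inner_iteration)
    return f"{call_id}_{inner_iteration}"
```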
Kevin Turcios
a2764084bd test: add regression tests for comment position preservation 2026-02-23 01:12:49 -05:00
Kevin Turcios
6c4378db51 fix: preserve comment position by passing CST module directly to import adder
parse_code_and_prune_cst now returns cst.Module instead of str.
add_needed_imports_from_module accepts cst.Module | str, skipping re-parse
when a Module is passed. This eliminates the string round-trip that caused
comments to migrate from statement leading_lines to Module.header,
resulting in comments appearing above imports instead of at their
original position.
2026-02-23 01:08:39 -05:00
misrasaurabh1
67c4d34813 fix: Java loop ID calculation and assertion transformer bug
Implemented CUDA-style loop ID calculation for performance mode:
- loopId = outerLoop * maxInnerIterations + innerIteration
- Behavior mode uses simple loop index (no inner iterations)
- Invocation ID simplified to call counter only
- Default CODEFLASH_INNER_ITERATIONS set to 10

Fixed critical bug in JavaAssertTransformer:
- Removed duplicate _special_re assignment that was missing parentheses
- Combined patterns into single regex: [\"'{}()]
- This fixes _find_balanced_parens and enables assertion transformation

Updated test expectations to match new marker format and loop ID calculation.

All 41 Java instrumentation tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 21:04:15 -08:00
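The loop ID formula from 67c4d34813, shown as a Python sketch with its inverse. The function names are hypothetical, and zero-based inner iterations are an assumption; the default of 10 mirrors the CODEFLASH_INNER_ITERATIONS default named in the commit.

```python
def loop_id(outer_loop: int, inner_iteration: int, max_inner_iterations: int = 10) -> int:
    """Flatten (outer loop, inner iteration) into one ID, like a CUDA thread index."""
    if not 0 <= inner_iteration < max_inner_iterations:
        raise ValueError("inner_iteration must be in [0, max_inner_iterations)")
    return outer_loop * max_inner_iterations + inner_iteration


def split_loop_id(flat_id: int, max_inner_iterations: int = 10) -> tuple:
    """Recover (outer_loop, inner_iteration) from a flattened loop ID."""
    return divmod(flat_id, max_inner_iterations)
```

As long as the inner index stays below `max_inner_iterations`, every (outer, inner) pair maps to a distinct ID, which is what makes the flattening reversible.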
Kevin Turcios
1689a7bbb5 perf: cache module scan in _clear_lru_caches and expand test coverage
Cache inspect.getmembers() results per module so repeated loop
iterations skip the expensive rescan. Add tests for get_runtime_from_stdout,
should_stop, _set_nodeid, _get_total_time, _timed_out, logreport, and
setup/teardown hooks.
2026-02-22 01:17:05 -05:00
Kevin Turcios
5a73fe2101 Merge pull request #1617 from codeflash-ai/comparator-uniontype
fix: add types.UnionType support to comparator
2026-02-22 05:09:01 +00:00
Kevin Turcios
c6fbdfa535 chore: merge main into fixes-for-core-unstructured-experimental 2026-02-21 00:57:33 -05:00
Kevin Turcios
c1703a2d71 Revert "commit"
This reverts commit 2966e15775.
2026-02-21 00:50:31 -05:00
Kevin Turcios
2966e15775 commit
feat: extend testgen type context to include function body references

Extract types referenced in the function body (constructor calls, attribute
access, isinstance/issubclass args) in addition to parameter annotations.
Use full class extraction instead of init-stub-only, with instance resolution
fallback and project/site-packages filtering.
2026-02-21 00:50:04 -05:00
misrasaurabh1
7fa7eeabfe instrumentation bugs with multiple function calls 2026-02-20 21:16:07 -08:00
HeshamHM28
49f89fbe2c merge to main 2026-02-21 01:49:31 +02:00
Mohamed Ashraf
f06acba354 fix: add test method name to Java stdout markers for unique identification
Java stdout markers now include the test method name in the class field
(e.g., "TestClass.testMethod") matching the Python marker format. The
parser extracts the test method name from this combined field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 20:13:04 +00:00
Mohamed Ashraf
0001fb5921 fix: store actual test method name in SQLite for Java behavior tests
The instrumented Java test code was storing "{class_name}Test" as the
test_function_name in SQLite instead of the actual test method name
(e.g., "testAdd"). This fixes parity with Python instrumentation.

- Add _extract_test_method_name() with compiled regex patterns
- Inject _cf_test variable with actual method name in behavior code
- Fix setString(3, ...) to use _cf_test instead of hardcoded class name
- Optimize _byte_to_line_index() with bisect.bisect_right()
- Update all behavior mode test expectations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 20:12:58 +00:00
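The bisect-based lookup mentioned for `_byte_to_line_index()` in 0001fb5921 can be illustrated as follows. The signatures here are assumptions; only the use of `bisect.bisect_right` comes from the commit.

```python
from bisect import bisect_right


def line_start_offsets(source: bytes) -> list:
    """Byte offset at which each line starts (line 0 starts at offset 0)."""
    starts = [0]
    for index, byte in enumerate(source):
        if byte == 0x0A:  # "\n": the next line starts right after it
            starts.append(index + 1)
    return starts


def byte_to_line_index(starts: list, offset: int) -> int:
    """Zero-based line containing `offset`, found by binary search."""
    return bisect_right(starts, offset) - 1
```

Building `starts` once and bisecting per lookup replaces a linear scan with an O(log n) search over the number of lines.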
aseembits93
002189582b fix: add types.UnionType support to comparator
The comparator did not recognize `types.UnionType` (Python 3.10+ `X | Y`
syntax), causing it to fall through to "Unknown comparator input type".
Conditionally include it in the equality-checked types tuple.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 00:30:27 +05:30
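Conditionally including `types.UnionType`, as 002189582b describes, might look like this sketch. `_EQUALITY_TYPES` and `compare_by_equality` are hypothetical names and the base tuple is invented for the example; only the conditional inclusion reflects the commit.

```python
import types

# Hypothetical tuple of types that are compared with plain ==.
_EQUALITY_TYPES = (int, str, bool, type(None), type)

# types.UnionType exists only on Python 3.10+, so look it up defensively.
_union_type = getattr(types, "UnionType", None)
if _union_type is not None:
    _EQUALITY_TYPES = (*_EQUALITY_TYPES, _union_type)


def compare_by_equality(a, b) -> bool:
    """Compare values whose types are known to support a meaningful ==."""
    if isinstance(a, _EQUALITY_TYPES) and isinstance(b, _EQUALITY_TYPES):
        return a == b
    raise TypeError("Unknown comparator input type")
```

Using `getattr` with a default keeps the module importable on Python 3.9, where `types.UnionType` does not exist.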
Kevin Turcios
0dbb3a47a2 not used 2026-02-20 08:47:27 -05:00
misrasaurabh1
eb7c1f00d5 more lenient testing 2026-02-20 00:26:48 -08:00
misrasaurabh1
c9cb60a21d test: relax Java timing tolerances to account for JIT warmup
Increase tolerance for individual timing measurements from ±2% to ±5%
to accommodate JIT warmup effects where first iterations run slower
than subsequent optimized runs. Maintain ±2% tolerance for
total_passed_runtime since it uses minimums that filter out cold starts.

- CV threshold: 0.02 → 0.05 (5%)
- Mean runtime: ±2% → ±5%
- total_passed_runtime: ±2% (unchanged, using filtered minimums)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 21:17:48 -08:00
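The two checks whose thresholds change in c9cb60a21d, a coefficient-of-variation limit and a relative tolerance on the mean, can be sketched as below. The helper names are hypothetical, and the use of population standard deviation is an assumption; the commit does not say which estimator the tests use.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples) -> float:
    """Standard deviation divided by the mean (population form assumed)."""
    return pstdev(samples) / mean(samples)


def within_tolerance(measured: float, expected: float, rel_tol: float) -> bool:
    """True when `measured` is within +/- rel_tol of `expected`."""
    return abs(measured - expected) <= rel_tol * expected
```

With samples such as `[100, 102, 98, 101, 99]` the CV is about 0.014, which passes both the old 0.02 and the new 0.05 threshold. A single run at 104 against an expected 100 passes the 5% tolerance but would have failed the 2% one.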
HeshamHM28
5b00ab370b Refactor Java line profiler integration tests
- Rename test class to TestLineProfilerInstrumentation for clarity.
- Add tests for instrumenting Java classes with and without package declarations.
- Enhance instrumentation tests to verify that source files remain unmodified.
- Implement checks for generated configuration files, ensuring correct content and structure.
- Introduce tests for deeply nested packages and verify line contents extraction.
- Add end-to-end tests for spin-timer profiling, validating timing accuracy and hit counts.
2026-02-20 06:59:26 +02:00
misrasaurabh1
2353fb2b86 test: add comprehensive Java run-and-parse integration tests
Add end-to-end tests for Java test instrumentation, execution, and result
parsing, covering both behavior and performance testing modes.

Key additions:
- PreciseWaiter: monotonic timing implementation with <2% variance
- 3 behavior tests: single/multiple test methods, return value validation
- 2 performance tests: timing accuracy, inner/outer loop counts
- Validation of total_passed_runtime() aggregation

Infrastructure improvements:
- Add inner_iterations parameter to benchmarking call chain
- Rename pytest_* parameters to language-agnostic names:
  - pytest_min_loops → min_outer_loops
  - pytest_max_loops → max_outer_loops
  - pytest_inner_iterations → inner_iterations
- Pass inner_iterations from tests through function_optimizer → test_runner → language_support

All tests validate timing accuracy (±2%), variance (<2% CV), and correct
result grouping by test case including iteration_id.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 20:57:29 -08:00
claude[bot]
f321f836f5 fix: add from __future__ import annotations for Python 3.9 compat
The `list[X] | None` union syntax (PEP 604) requires Python 3.10+ at
runtime. Adding the future annotations import defers evaluation and
fixes the import error on Python 3.9.

Co-authored-by: Saurabh Misra <misrasaurabh1@users.noreply.github.com>
2026-02-20 03:32:33 +00:00