Commit graph

1611 commits

Author SHA1 Message Date
Kevin Turcios
e01236eb29 Merge branch 'main' into testgen-review 2026-03-06 01:02:06 -05:00
Kevin Turcios
5d872e845d
Merge pull request #1650 from codeflash-ai/fix/unused-helper-attribute-refs
fix: detect attribute-referenced methods as used in unused helper detection
2026-03-05 23:55:33 +00:00
Mohamed Ashraf
50957395a9 feat: centralize JAR version, cache runtime setup, add pom.xml backup/restore
- Extract CODEFLASH_RUNTIME_VERSION and CODEFLASH_RUNTIME_JAR_NAME constants
  in build_tools.py, replacing 15+ hardcoded "1.0.0" references across
  test_runner.py, comparator.py, and line_profiler.py
- Cache _ensure_codeflash_runtime() results so it runs once per optimization
  instead of 3 times (behavioral, benchmarking, line profiling phases)
- Add backup_pom/restore_pom/restore_all_pom_backups to build_tools.py so
  pom.xml modifications (codeflash-runtime dependency, JaCoCo plugin) are
  always reverted after optimization completes, even on crashes
- Call restore_all_pom_backups() in function_optimizer.py's finally block

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 23:17:09 +00:00
Kevin Turcios
6330005999 test: update expected markdown ordering to match target-file-first change 2026-03-05 07:04:03 -05:00
claude[bot]
6ad7ea49f6 fix: add missing -> None return type annotations to test functions
Co-authored-by: Kevin Turcios <undefined@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 11:18:54 +00:00
Kevin Turcios
1bcea313a0 test: add unit tests for get_optimized_code_for_module fallback logic 2026-03-05 06:06:21 -05:00
Sarthak Agarwal
ae6b1a3b5f
Merge branch 'main' into fix/js-vitest-benchmarking-and-mocha-cjs 2026-03-04 16:49:43 +05:30
Sarthak Agarwal
f0e4e5326d max loop count in test support 2026-03-04 16:48:38 +05:30
Kevin Turcios
f9e7f2a82a fix: skip codeflash_capture instrumentation for dataclasses without explicit __init__
Dataclass __init__ is auto-generated at class creation time and not
present in the AST. The instrumentor was injecting a synthetic __init__
with super().__init__(*args, **kwargs) which calls object.__init__()
and fails because dataclass fields are passed as kwargs.

Now only skips when the class is a @dataclass AND has no explicit
__init__. Dataclasses with custom __init__ are still instrumented.
2026-03-04 04:47:30 -05:00
misrasaurabh1
df57235d25 test: use full string equality in Java runtime comments tests
Replace substring assertions (e.g. `"// 2.89ms ->" in lines[7]`) with
exact full-output comparisons for better regression detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:02:24 -08:00
misrasaurabh1
9367a86771 fix: remove dead Java path-fixing code from base FunctionOptimizer
The base class had duplicate _get_java_sources_root and _fix_java_test_paths
methods that were overridden by JavaFunctionOptimizer. The base class also
had an is_java() block in generate_and_instrument_tests that used undefined
variables (used_behavior_paths, is_java). Removed all dead code since
JavaFunctionOptimizer.fixup_generated_tests handles this properly.

Also updated JavaFunctionOptimizer._fix_java_test_paths to accept
display_source parameter and use whole-word rename for collision handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:02:24 -08:00
misrasaurabh1
6ec4db446c Merge remote-tracking branch 'origin/omni-java' into generated-tests-in-pr
# Conflicts:
#	codeflash/languages/java/instrumentation.py
#	codeflash/optimization/function_optimizer.py
#	codeflash/verification/verifier.py
#	codeflash/version.py
#	tests/test_languages/test_java/test_instrumentation.py
#	tests/test_languages/test_java/test_java_test_paths.py
2026-03-04 00:26:46 -08:00
claude[bot]
77b705d2c7 fix: use forward slashes in jest-reporter test paths for Windows compatibility
Windows backslashes in paths embedded into JavaScript strings are
interpreted as escape sequences by Node.js, corrupting the module path.
Use .as_posix() to emit forward slashes which Node accepts on all platforms.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
2026-03-04 08:06:15 +00:00
claude[bot]
887e6384b6 fix: apply consistent conn safety pattern in trace benchmark tests
Initialize conn to None before try blocks and guard finally with
if conn is not None to prevent NameError if sqlite3.connect() raises.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
2026-03-04 07:37:03 +00:00
Kevin Turcios
2faef8ade6 fix: update JS max_loops test assertion to match constant (1_000)
PR #1764 changed JS_BENCHMARKING_MAX_LOOPS from 100_000 to 1_000 but
the test was updated to assert 5_000 instead of 1_000.
2026-03-04 02:26:53 -05:00
claude[bot]
59ec38eec4 fix: resolve mypy type error for conn variable in test_trace_benchmarks
Initialize conn as Optional before try block to allow None assignment.

Co-authored-by: Kevin Turcios <undefined@users.noreply.github.com>
2026-03-04 07:15:22 +00:00
Kevin Turcios
17730663ec fix: close SQLite connections in finally blocks for Windows compatibility
Ensures SQLite connections are always closed before file cleanup to
prevent PermissionError on Windows where open handles block file deletion.
2026-03-04 02:12:18 -05:00
Kevin Turcios
eceac13fc3 Merge remote-tracking branch 'origin/main' into omni-java
# Conflicts:
#	.claude/rules/architecture.md
#	.claude/rules/code-style.md
#	.github/workflows/claude.yml
#	.github/workflows/duplicate-code-detector.yml
#	codeflash/api/aiservice.py
#	codeflash/cli_cmds/console.py
#	codeflash/cli_cmds/logging_config.py
#	codeflash/code_utils/deduplicate_code.py
#	codeflash/discovery/discover_unit_tests.py
#	codeflash/languages/base.py
#	codeflash/languages/code_replacer.py
#	codeflash/languages/javascript/mocha_runner.py
#	codeflash/languages/javascript/support.py
#	codeflash/languages/python/support.py
#	codeflash/optimization/function_optimizer.py
#	codeflash/verification/parse_test_output.py
#	codeflash/verification/verification_utils.py
#	codeflash/verification/verifier.py
#	packages/codeflash/package-lock.json
#	packages/codeflash/package.json
#	tests/languages/javascript/test_support_dispatch.py
#	tests/test_codeflash_capture.py
#	tests/test_languages/test_javascript_test_runner.py
#	tests/test_multi_file_code_replacement.py
2026-03-04 01:52:32 -05:00
Kevin Turcios
dbc04df9df Update test_support_dispatch.py 2026-03-04 01:32:35 -05:00
Kevin Turcios
2fb0145895 Merge remote-tracking branch 'origin/omni-java' into merge/misc-fixes-into-omni-java
# Conflicts:
#	codeflash/api/aiservice.py
#	codeflash/languages/base.py
#	codeflash/languages/java/support.py
#	codeflash/languages/javascript/support.py
#	codeflash/languages/python/support.py
#	codeflash/verification/verifier.py
2026-03-04 01:23:39 -05:00
HeshamHM28
8287f96f05
Merge pull request #1680 from codeflash-ai/feat/java/wire-language-version
feat: add language version support across multiple language implement…
2026-03-03 22:12:54 -08:00
Kevin Turcios
b43b37b6dd fix: update nested class replacement test to match PR #1726 design
Inner-class methods are intentionally skipped by Java discovery
(PR #1726) since instrumentation is name-only and not class-aware.
Update test to expect False from replacement.
2026-03-04 00:29:46 -05:00
Kevin Turcios
ca149fa2d0 fix: use relpath for main.py in E2E test utilities
Take omni-main-java's fix for E2E test runner path resolution —
uses os.path.relpath from __file__ instead of hardcoded relative path.
Also adds codeflash.toml detection for Java projects.
2026-03-04 00:19:02 -05:00
Sarthak Agarwal
12294cafb6 fix looping with JS/TS 2026-03-04 10:46:44 +05:30
Kevin Turcios
92ab600edc fix: resolve remaining test failures from main sync
- Update test_inject_profiling_used_frameworks, test_async_run_and_parse,
  test_pickle_patcher to use new inject_profiling_into_existing_test API
  (test_string param removed)
- Add parse_line_profile_results function to parse_line_profile_test_output
  module (imported by main's PythonFunctionOptimizer and test_instrument_tests)
2026-03-04 00:13:27 -05:00
Kevin Turcios
af7ce7fce2 fix: cherry-pick main improvements into omni-java branch
- Take main's JS improvements: Mocha CJS support, ESM/CJS handling,
  sanitize_mocha_imports, vitest benchmarking fixes
- Update instrument_existing_test API: remove test_string param, read from
  file internally (aligned across Python, JS, Java support classes)
- Take main's equivalence.py with pass_fail_only parameter
- Take main's models.py, critic.py, env_utils.py, replay_test.py fixes
- Take main's PythonFunctionOptimizer parse_line_profile fix
- Skip files where our branch has Java-specific additions main doesn't
  have (create_pr, discover_unit_tests, parse_test_output, optimizer,
  verification_utils, config_parser, cmd_init, detector, support.py
  protocol methods)
2026-03-03 23:59:26 -05:00
Kevin Turcios
bccc02aade merge: incorporate omni-main-java sync work
Merges the omni-main-java branch which synced main into omni-java,
including JavaFunctionOptimizer, removal of is_java()/is_python() guards,
protocol dispatch for parse_test_xml, and deletion of concolic_testing.py.
2026-03-03 23:42:39 -05:00
Kevin Turcios
c550bf1561 fix: restore QName test to match omni-java's enrich_testgen_context behavior
The cherry-picked test fix from main assumed stdlib classes are extracted,
but omni-java's implementation still skips them.
2026-03-03 22:27:44 -05:00
Kevin Turcios
a0249afc7e fix: use PythonFunctionOptimizer in tests that depend on Python-specific hooks 2026-03-03 22:22:14 -05:00
Kevin Turcios
c11f52321e fix: correct pre-existing test failures in test_code_context_extractor
Fix 10 failing tests: remove wrong assertions expecting import statements
inside extracted class code, use substring matching for UserDict class
signature, and rewrite click-dependent tests as project-local equivalents.
Add tests for resolve_instance_class_name, enhanced extract_init_stub_from_class,
and enrich_testgen_context instance resolution.
2026-03-03 22:22:00 -05:00
mashraf-222
82e4627b03
Merge pull request #1740 from codeflash-ai/fix/java-comparator-and-errorprone
fix: suppress Error Prone CheckReturnValue in instrumented tests and fix stale pom dependency
2026-03-04 05:21:24 +02:00
misrasaurabh1
26f2c502fb feat: add inline runtime comments and unique _cf_mod/_cf_cls markers for Java tests
- Use instrumented class name in _cf_mod/_cf_cls markers to disambiguate
  existing vs generated tests sharing the same original class name
- Encode line number in invocation IDs (L{line}_{counter}) for deterministic
  call-site identification in inline runtime comments
- Rewrite add_runtime_comments() to annotate each call line with inline
  performance data instead of a summary block at top
- Strip assertions before instrumenting so both modes share the same base source
- Update test expected strings for new marker format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 18:58:20 -08:00
Kevin Turcios
ec73c2c692 revert: undo Vitest benchmarking and capture.js changes
Reverts ed2594bf which added --pool=forks to Vitest commands and
changed capture.js to use process.stdout.write and Vitest worker API.
These changes broke JS E2E tests (CJS, ESM, TS class) by altering
how all JS tests run, not just Vitest benchmarking.
2026-03-03 21:34:59 -05:00
Kevin Turcios
b2b8201541 fix: update test_get_helper_code expectation for blank line after imports 2026-03-03 21:10:35 -05:00
Kevin Turcios
fb6a4ab587 fix: update test expectations for extra blank line after imports
The CST comment position fix changed how blank lines are preserved
after import blocks, adding a PEP 8-compliant double blank line.
2026-03-03 20:58:32 -05:00
Sarthak Agarwal
f5d48841f0 fix mocha test runner 2026-03-03 20:50:50 -05:00
Sarthak Agarwal
4f2c65daec fix: strip CJS require('vitest') and require('@jest/globals') in Mocha tests
The AI backend generates vitest/jest-style imports for Mocha projects.
Our sanitize_mocha_imports() stripped ESM `import { ... } from 'vitest'`,
but process_generated_test_strings() runs BEFORE postprocessing and calls
ensure_module_system_compatibility() which converts these to CJS requires.
Result: `const { ... } = require('vitest')` survived sanitization.

Added regexes for the CJS variants of vitest and @jest/globals requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:49:04 -05:00
Sarthak Agarwal
74150c2a7f fix: resolve Vitest benchmarking showing wall-clock time instead of per-function timing
Root cause: Vitest performance tests reported "20.0 seconds over 1 loop"
(JUnit XML wall-clock fallback) instead of actual per-function nanosecond
timing. This was a chain of two issues:

1. **stdout interception**: Vitest's default `threads` pool intercepts
   process.stdout.write() and console.log(), preventing timing markers
   from flowing to the parent process. Fixed by adding `--pool=forks`
   to all Vitest commands and config files. The `forks` pool uses child
   processes where stdout flows directly to the parent.

2. **test name detection**: Even after markers flowed through (43,000+
   found in stdout), the parser couldn't match them to JUnit XML
   testcases because all markers had "unknown" as the test name. This
   happened because Vitest doesn't inject `beforeEach` as a global
   (unlike Jest), so capture.js's Jest-style hook to set
   `currentTestName` never fired.

   Fixed by adding Vitest-specific test name detection in capture.js:
   - Primary: `expect.getState().currentTestName` (full describe path)
   - Fallback: `__vitest_worker__.current.fullTestName`
   - Defense-in-depth: parser fallback matches "unknown" markers to
     the first testcase when no name match is found

Result: cheerio's `isHtml` went from "20.0s / 1 loop" to
"902μs / 20,853 loops" with proper speedup analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:48:59 -05:00
Sarthak Agarwal
b39f2a2e9b fix: support Mocha CJS projects and sanitize incorrect framework imports
Three related fixes for Mocha test generation in CommonJS projects:

1. inject_test_globals() now accepts module_system param — emits
   `require('node:assert/strict')` for CJS instead of ESM import syntax
2. ensure_module_system_compatibility() now converts ESM→CJS even when
   the source has mixed imports (was skipping when both ESM and CJS were
   detected, leaving the ESM import from inject_test_globals unconverted)
3. New sanitize_mocha_imports() strips vitest/jest/@jest/globals imports
   that the AI sometimes generates for Mocha projects — Mocha provides
   describe/it/before*/after* as globals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:48:53 -05:00
Sarthak Agarwal
15fcf4741d add mocha support 2026-03-03 20:48:39 -05:00
Sarthak Agarwal
e6b2f7975b feat: bundle JUnit XML reporter for Jest, replacing external jest-junit dependency
Ship a zero-dependency jest-reporter.js inside the codeflash runtime package
instead of requiring the external jest-junit npm package. This ensures the
reporter is always available when codeflash is installed, fixing Jest-based
projects (Strapi, Moleculer) that failed because jest-junit wasn't installed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:48:34 -05:00
Kevin Turcios
8f0ff84ecf fix: update tests for consolidated CST discovery behavior
Nested functions are now skipped by FunctionVisitor, and
discover_functions no longer swallows parse/IO errors — callers
handle them. Update test expectations accordingly.
2026-03-03 20:42:33 -05:00
Sarthak Agarwal
cb1ea5adc3 fix: use pattern matching for collocated tests in monorepos
When testsRoot overlaps moduleRoot (common in JS/TS monorepos like Ghost
where both point to "src"), the directory-based filter incorrectly
excluded ALL source files. Switch to filename/directory pattern matching
(*.test.*, *.spec.*, __tests__/) when roots overlap, preserving the
existing directory-based filter for standard layouts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:41:45 -05:00
Sarthak Agarwal
095b8b76d0 feat: add skip_confirm and skip_api_key params to JS init for non-interactive mode
Allow init_js_project(), should_modify_package_json_config(), and
collect_js_setup_info() to run without interactive prompts when
skip_confirm=True. Uses auto-detected defaults instead of prompting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:54 -05:00
Sarthak Agarwal
a14ef40d15 fix: raise clear error for unsupported JS test frameworks instead of silent fallback
Add NotImplementedError guard in all 3 test dispatchers (behavioral,
benchmarking, line-profile) for frameworks other than jest and vitest.
Previously, mocha and other frameworks silently fell through to Jest,
causing confusing failures. Now users get a clear error message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:49 -05:00
Sarthak Agarwal
800ebed837 feat: discover object methods exported via CJS module.exports = variable
Resolve `module.exports = varName` where varName is an object literal
containing methods. For patterns like `const utils = { match() {} };
module.exports = utils;`, the individual methods are now recognized as
exported. This fixes function discovery for CJS libraries like Moleculer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:06 -05:00
Sarthak Agarwal
b3d4e225e5 feat: discover const arrow functions exported via named export clauses
Post-process find_functions() to mark functions as exported when they appear
in named export clauses like `export { joinBy }`. This fixes discovery for
TypeScript codebases (e.g., Strapi) that define const arrow functions and
export them via a separate export statement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:40:01 -05:00
Kevin Turcios
a89dc4b09b test: add regression tests for comment position preservation 2026-03-03 20:36:56 -05:00
Kevin Turcios
b5fab57499 fix: preserve comment position by passing CST module directly to import adder
parse_code_and_prune_cst now returns cst.Module instead of str.
add_needed_imports_from_module accepts cst.Module | str, skipping re-parse
when a Module is passed. This eliminates the string round-trip that caused
comments to migrate from statement leading_lines to Module.header,
resulting in comments appearing above imports instead of at their
original position.
2026-03-03 20:36:52 -05:00
Kevin Turcios
4cf8d31deb perf: cache module scan in _clear_lru_caches and expand test coverage
Cache inspect.getmembers() results per module so repeated loop
iterations skip the expensive rescan. Add tests for get_runtime_from_stdout,
should_stop, _set_nodeid, _get_total_time, _timed_out, logreport, and
setup/teardown hooks.
2026-03-03 20:36:42 -05:00
Kevin Turcios
d2454a250a fix: handle __slots__-only objects in comparator
Objects with __slots__ but no __dict__ (e.g. textual.cache.LRUCache)
fell through all comparator branches, logging "Unknown comparator input
type" and returning False — causing spurious test mismatches.
2026-03-03 20:34:04 -05:00
aseembits93
ed7601206b fix: add types.UnionType support to comparator
The comparator did not recognize `types.UnionType` (Python 3.10+ `X | Y`
syntax), causing it to fall through to "Unknown comparator input type".
Conditionally include it in the equality-checked types tuple.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:34:00 -05:00
aseembits93
706e33ed5a fix: support Python 3.10-3.14 in comparator itertools tests
Handle itertools.cycle on Python 3.14 where __reduce__ was removed by
falling back to element-by-element sampling. Add version guards for
pairwise (3.10+) and batched (3.12+) tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:56 -05:00
aseembits93
d7f24ad1ea fix: handle all remaining itertools types in comparator
Add a catch-all handler for itertools iterators (chain, islice, product,
permutations, combinations, starmap, accumulate, compress, dropwhile,
takewhile, filterfalse, zip_longest, groupby, pairwise, batched, tee).
Uses module check (type.__module__ == "itertools") so it automatically
covers any itertools type without version-specific enumeration. groupby
gets special handling to also materialize its group iterators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:52 -05:00
aseembits93
b13d2e1623 fix: handle itertools.repeat and itertools.cycle in comparator
itertools.repeat uses repr() comparison (same approach as count).
itertools.cycle uses __reduce__() to extract internal state (saved items,
remaining items, and first-pass flag) since repr() only shows a memory
address. The __reduce__ approach is deprecated in 3.14 but is the only
way to access cycle state without consuming elements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:47 -05:00
aseembits93
3041b0580b fix: handle itertools.count in comparator
The comparator had no handler for itertools.count (an infinite iterator),
causing it to fall through all type checks and return False even for
equal objects. Use repr() comparison which reliably reflects internal
state and avoids the __reduce__ deprecation coming in Python 3.14.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:43 -05:00
Ubuntu
1b3fa91a5c feat: Java testgen class name fix, remove per-test @Timeout, and wire language_version
- Add class_name and qualified_name to /testgen API payload so the backend
  has explicit access to computed FunctionToOptimize properties
- Add client-side _fix_java_test_class_name() to correct wrong class name
  references in LLM-generated Java test code
- Remove per-test @Timeout annotation from Java instrumentation (causes
  timing instability on CI runners; Maven Surefire handles timeouts)
- Remove redundant default_language_version, use language_version as canonical

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:06:37 +00:00
Mohamed Ashraf
62aaab87ac fix: pre-install multi-module Maven deps to avoid recompilation failures
Multi-module Maven projects like Guava fail on sequential Maven invocations
because compiler plugin 3.15.0's JDK-8318913 workaround patches module-info.class
timestamps, triggering unnecessary recompilation with -am that fails on partial
reactor rebuilds. This pre-installs deps to .m2 once, then drops -am from all
subsequent test commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:37:32 +00:00
Mohamed Ashraf
b9135c5f6a fix: address PR review feedback for Error Prone and build_tools
- Remove redundant condition check in add_codeflash_dependency_to_pom
- Use lookahead-based regex to handle arbitrary XML element order in
  system-scope dependency replacement
- Broaden class declaration pattern to match final/abstract modifiers
- Add 7 unit tests for add_codeflash_dependency_to_pom including
  stale system-scope replacement and reordered XML elements
- Clarify comment about @SuppressWarnings in both modes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:00:14 +00:00
misrasaurabh1
6ad0d8b92b fix how tests looks like when PR is created 2026-03-03 14:47:25 -08:00
Sarthak Agarwal
bc5e3e878a fix mocha test runner 2026-03-04 03:42:10 +05:30
Mohamed Ashraf
2261e98953 fix: suppress Error Prone CheckReturnValue in instrumented tests and fix stale pom dependency
Add @SuppressWarnings("CheckReturnValue") to all generated instrumented test
classes. Projects using Error Prone (e.g. Guava) enforce CheckReturnValue as a
compiler error, which rejects our performance-only tests that intentionally
discard return values after assertion stripping.

Also fix add_codeflash_dependency_to_pom to detect and replace stale
system-scope dependencies left by previous runs with the correct test scope.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:33:57 +00:00
mashraf-222
dc7caa3151
Merge pull request #1651 from codeflash-ai/fix/java-comparator-vacuous-equivalence
fix: reject vacuous equivalence and deserialization error false matches in Java comparator
2026-03-03 22:15:21 +02:00
Mohamed Ashraf
e6ca15c36b test: update empty-table test to expect not-equivalent after vacuous guard
The test previously expected empty databases to return equivalent=True,
which was the exact bug being fixed. Updated to assert equivalent=False.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:40:06 +00:00
Mohamed Ashraf
cda56d1389 fix: handle JUnit 4 message-first assertEquals type inference
The type inference for assertEquals always used the first argument, but
JUnit 4's 3-arg overload is assertEquals(message, expected, actual).
When the first arg was a string message, the type was incorrectly inferred
as String instead of the actual expected value's type. Now detects the
message-first pattern and uses the second argument for type inference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:29:39 +00:00
Mohamed Ashraf
ecd9267b3f fix: update assertion removal tests for type inference and fix ruff lint
Update 41 test expectations in test_java_assertion_removal.py to match
the return type inference behavior introduced in commit 9e5880f0. Tests
now expect inferred types (int, boolean, String, double) instead of
Object for _cf_result variables.

Fix 2 ruff PLR1714 lint issues in remove_asserts.py by using set
membership tests instead of chained or comparisons.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:53:38 +00:00
Mohamed Ashraf
44fa2f8e16 test: update instrumentation test for assertion type inference
The behavior mode instrumentation test expected `Object _cf_result1`
but after the type inference fix, assertEquals(4, call()) now produces
`int _cf_result1 = (int)_cf_result1_1`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:41:40 +00:00
Mohamed Ashraf
9e5880f032 fix: infer Java return types in assertion transformer instead of using Object
The assertion transformer always declared `Object _cf_resultN = call()` when
replacing assertions, losing the actual return type. This caused compilation
failures when the result was used in a context expecting a primitive type
(e.g., int, boolean).

Now infers the return type from assertion context:
- assertEquals(int_literal, call()) -> int
- assertTrue/assertFalse(call()) -> boolean
- assertEquals("string", call()) -> String
- Falls back to Object when type can't be determined

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:41:40 +00:00
Sarthak Agarwal
e8a4f96c0b fix: strip CJS require('vitest') and require('@jest/globals') in Mocha tests
The AI backend generates vitest/jest-style imports for Mocha projects.
Our sanitize_mocha_imports() stripped ESM `import { ... } from 'vitest'`,
but process_generated_test_strings() runs BEFORE postprocessing and calls
ensure_module_system_compatibility() which converts these to CJS requires.
Result: `const { ... } = require('vitest')` survived sanitization.

Added regexes for the CJS variants of vitest and @jest/globals requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 16:01:55 +05:30
Sarthak Agarwal
ed2594bfd1 fix: resolve Vitest benchmarking showing wall-clock time instead of per-function timing
Root cause: Vitest performance tests reported "20.0 seconds over 1 loop"
(JUnit XML wall-clock fallback) instead of actual per-function nanosecond
timing. This was a chain of two issues:

1. **stdout interception**: Vitest's default `threads` pool intercepts
   process.stdout.write() and console.log(), preventing timing markers
   from flowing to the parent process. Fixed by adding `--pool=forks`
   to all Vitest commands and config files. The `forks` pool uses child
   processes where stdout flows directly to the parent.

2. **test name detection**: Even after markers flowed through (43,000+
   found in stdout), the parser couldn't match them to JUnit XML
   testcases because all markers had "unknown" as the test name. This
   happened because Vitest doesn't inject `beforeEach` as a global
   (unlike Jest), so capture.js's Jest-style hook to set
   `currentTestName` never fired.

   Fixed by adding Vitest-specific test name detection in capture.js:
   - Primary: `expect.getState().currentTestName` (full describe path)
   - Fallback: `__vitest_worker__.current.fullTestName`
   - Defense-in-depth: parser fallback matches "unknown" markers to
     the first testcase when no name match is found

Result: cheerio's `isHtml` went from "20.0s / 1 loop" to
"902μs / 20,853 loops" with proper speedup analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:31:10 +05:30
Sarthak Agarwal
a78b4cc299 fix: support Mocha CJS projects and sanitize incorrect framework imports
Three related fixes for Mocha test generation in CommonJS projects:

1. inject_test_globals() now accepts module_system param — emits
   `require('node:assert/strict')` for CJS instead of ESM import syntax
2. ensure_module_system_compatibility() now converts ESM→CJS even when
   the source has mixed imports (was skipping when both ESM and CJS were
   detected, leaving the ESM import from inject_test_globals unconverted)
3. New sanitize_mocha_imports() strips vitest/jest/@jest/globals imports
   that the AI sometimes generates for Mocha projects — Mocha provides
   describe/it/before*/after* as globals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:30:22 +05:30
Sarthak Agarwal
f53272c03b fix: use pattern matching for collocated tests in monorepos
When testsRoot overlaps moduleRoot (common in JS/TS monorepos like Ghost
where both point to "src"), the directory-based filter incorrectly
excluded ALL source files. Switch to filename/directory pattern matching
(*.test.*, *.spec.*, __tests__/) when roots overlap, preserving the
existing directory-based filter for standard layouts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 14:29:05 +05:30
misrasaurabh1
c111635497 fix(java): fix lint error and broken mock in test_filter_validation
- Replace try/except/pass with contextlib.suppress() (ruff SIM105)
- Fix test_run_maven_tests_succeeds_with_valid_filter to mock
  _run_cmd_kill_pg_on_timeout instead of subprocess.run; on Linux
  the function uses Popen not run, so the old mock was never called

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:34:30 +00:00
misrasaurabh1
814ce5388c fix(java): add Windows guard for process-group kill and remove slow timeout tests
- Add sys.platform == "win32" check in _run_cmd_kill_pg_on_timeout so
  Windows machines fall back to plain subprocess.run() (Windows has no
  POSIX process groups / killpg)
- Remove TestRunCmdKillPgOnTimeout test class (5 tests using sleep 60
  commands were adding significant time to the test suite)

Follow-up to the SQLite-locked-error fix merged in #1728.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:34:30 +00:00
misrasaurabh1
7d45efe401 fix(java): exclude apidocs and javadoc directories from file discovery
When running with --all on a Java project, codeflash was discovering .js
files inside apidocs/ and javadoc/ directories (generated Javadoc HTML)
and attempting to optimize them as JavaScript. This caused:
- "Invalid test framework for JavaScript/TypeScript" errors
- Wasted API calls for ~30+ functions from jquery-3.7.1.min.js
- Spurious "NO TESTS GENERATED" warnings for minified jQuery functions

Fix: add "apidocs" and "javadoc" to Java's dir_excludes. Because the
--all mode unions dir_excludes from all languages, these directories are
now skipped in both Java-specific and --all discovery modes.

Adds 5 tests verifying the exclusion works for Java mode and --all mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:24:27 +00:00
misrasaurabh1
16d314bf90 revert: remove WAL/busy_timeout defense-in-depth SQLite changes
The root cause of 'database is locked' is orphaned Surefire JVM processes
after Maven timeout.  The actual fix is killing the entire process group
(_run_cmd_kill_pg_on_timeout in test_runner.py).

The WAL mode / busy_timeout / sqlite3.connect(timeout=30) changes were
treating the symptom rather than the root cause.  Revert them:

- codeflash/languages/java/instrumentation.py: remove PRAGMA journal_mode=WAL
  and PRAGMA busy_timeout=30000 from inline SQLite write code
- codeflash/verification/parse_test_output.py: revert timeout=30 to default
- codeflash/languages/java/resources/CodeflashHelper.java: revert WAL/busy_timeout PRAGMAs
- codeflash-java-runtime/src/main/java/com/codeflash/Comparator.java: revert busy_timeout PRAGMA
- codeflash-java-runtime/src/main/java/com/codeflash/ResultWriter.java: revert WAL/busy_timeout PRAGMAs
- codeflash/languages/java/resources/codeflash-runtime-1.0.0.jar: restored to pre-change JAR
- tests/test_languages/test_java/test_instrumentation.py: remove TestSQLiteLockedFix
  class and revert snapshot strings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:43:20 +00:00
misrasaurabh1
de0fb91cd3 fix(java): kill entire process group on Maven timeout to prevent orphaned JVMs
Root cause of 'database is locked' errors:
- When Maven times out, subprocess.run() only kills the Maven parent process
- On Linux, Maven's forked Surefire JVM children become orphaned (not killed)
- Orphaned JVMs keep the SQLite result file open, causing SQLITE_BUSY when
  Python reads the file immediately after Maven is killed

Fix: Replace subprocess.run() with _run_cmd_kill_pg_on_timeout() which uses
start_new_session=True + os.killpg() to kill the entire process group on
timeout, ensuring no orphaned JVMs are left behind.

Applied to: _compile_tests, _get_test_classpath, _run_tests_direct,
and _run_maven_tests (the main one).

Also adds 5 unit tests verifying:
- Successful commands return correct output
- Failing commands propagate returncode
- Child processes are killed (not orphaned) on timeout
- returncode is -2 on timeout
- Timeout is described in stderr

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:24:56 +00:00
misrasaurabh1
1fdbd2c04b fix(java): resolve SQLite 'database is locked' errors across the pipeline
Root cause: The instrumented test JVM holds a SQLite connection open while
writing results.  The Python reader and the Java Comparator were trying to
read the same file without a busy_timeout, causing immediate SQLITE_BUSY
failures (~126 occurrences in codeflash_all_3.log).

Fixes applied:

1. instrumentation.py (_generate_sqlite_write_code):
   Emit PRAGMA journal_mode=WAL and PRAGMA busy_timeout=30000 right after
   each inline connection open.  WAL mode lets readers see the last committed
   state while a writer is active; busy_timeout makes lock collisions retry
   instead of immediately failing.

2. parse_test_output.py (parse_sqlite_test_results):
   Add timeout=30 to sqlite3.connect() so Python waits up to 30 s for a
   transient lock to clear (default was 5 s, which was too short for a busy
   Maven/JVM process).

3. Comparator.java (readTestResults):
   Execute PRAGMA busy_timeout=30000 on the same connection before running
   the SELECT, so the Java Comparator also retries instead of failing with
   [SQLITE_BUSY].

4. CodeflashHelper.java (initializeDatabase) and ResultWriter.java (constructor):
   Same WAL + busy_timeout PRAGMAs added after the initial getConnection() call
   for the long-lived database connections used by these helper classes.

5. Updated codeflash-runtime-1.0.0.jar (rebuilt after Comparator/ResultWriter fix).

tests: add TestSQLiteLockedFix with two assertions —
  • _generate_sqlite_write_code emits PRAGMA journal_mode=WAL and
    PRAGMA busy_timeout=30000 before CREATE TABLE
  • parse_sqlite_test_results uses timeout= in sqlite3.connect()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:04:38 +00:00
Kevin Turcios
881b4e9b35 fix: resolve merge conflicts and fix test failures
- Fix MockTestConfig missing tests_project_rootdir field
- Fix Python line profiler existence check on wrong path (no .lprof suffix)
- Add --release 11 to javac in Java line profiler tests for JDK compat
- Resolve merge conflicts with omni-java (replacement, tests)
- Add replace_function_definitions method to JavaSupport
- Guard against wrong method names in optimized code (standalone + class)
- Add tests for anonymous inner class method hoisting
2026-03-02 19:48:06 -05:00
misrasaurabh1
aae13a8e69 feat: skip inner-class methods in Java discovery; revert replacement-level inner-class workarounds
- parser.py: add `is_class_nested` flag to `JavaMethodNode`; track
  `class_depth` in `_walk_tree_for_methods` (incremented each time a
  type declaration is entered) and set `is_class_nested = True` when
  depth ≥ 2 (method lives inside a nested/inner class)

- discovery.py: add early-exit in `_should_include_method` when
  `method.is_class_nested` is True — inner-class methods cannot be
  reliably instrumented or tested in isolation, so we skip them up-front
  rather than wasting LLM tokens on candidates that will always be
  rejected later

- replacement.py: revert Bug-4 replacement-level workarounds that are
  now obsolete:
  * remove `target_class_name` parameter from `_parse_optimization_source`
  * restore simple first-match `break` in target-method selection
  * remove class_name filter that blocked helpers from "other" classes

- tests: update `TestNestedClasses`, `TestExtractCodeContextWithInnerClasses`
  to reflect the new no-inner-class-discovery contract; remove
  `TestInnerClassHelperFilter` (superseded by discovery filter);
  add `TestInnerClassMethodFilter` in test_discovery.py with four
  scenarios covering static nested, non-static inner, outer-only, and
  deeply-nested classes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:47:01 +00:00
Kevin Turcios
dc25d12e3c one more 2026-03-02 19:38:11 -05:00
Kevin Turcios
fe3e2f60c0 11 as per workflow 2026-03-02 19:25:30 -05:00
Kevin Turcios
f79cd9342a release 2026-03-02 19:20:44 -05:00
misrasaurabh1
05a9b61478 fix: replace modified constructors when LLM adds new final fields (Java)
When the LLM optimises a method by introducing a new final field (e.g.
caching Arrays.hashCode in Expression.hashCode, or caching map.values() in
LuaMap.valuesIterator), it also modifies the class constructors to initialise
the field.  Previously codeflash:

  1. Added the new field to the class ✓
  2. Replaced the target method ✓
  3. Did NOT update the constructors ✗

This caused "variable X might not have been initialized" compilation errors.

Changes:
- `JavaAnalyzer.find_constructors` (+ `_walk_tree_for_constructors`,
  `_extract_constructor_info`): new parser methods to locate
  `constructor_declaration` nodes via tree-sitter.
- `JavaMethodNode.formal_parameters_text`: captures the raw parameter list
  text so constructors can be matched by signature.
- `ParsedOptimization.modified_constructors`: new field to carry constructor
  source texts that need to be replaced.
- `_parse_optimization_source`: extract constructors from the same class as
  the target method and store in `modified_constructors`.
- `_replace_constructors`: new helper that replaces constructors in the
  original source by matching on formal parameter signature.
- `replace_function`: call `_replace_constructors` after the main method
  replacement when `modified_constructors` is non-empty.

Fixes regressions observed in codeflash_all_3.log:
  LuaMap.valuesIterator, Expression.hashCode, Bin.hashCode,
  NettyTlsContext.createHandler, Pool.capacity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:03:38 +00:00
Kevin Turcios
626be4d337 use the class parser directly 2026-03-02 19:00:42 -05:00
misrasaurabh1
a880ffc48a fix: skip outer-class methods when target is in a static inner class (Java)
When the optimisation target lives in a static inner class (e.g.
ObjectUnpacker inside Unpacker<T>), the LLM-generated class often wraps the
inner class inside the full outer class.  Previously, methods belonging to the
outer class were extracted as "helpers" and injected into the inner class,
causing compilation errors:

  - "non-static type variable T cannot be referenced from a static context"
  - "non-static variable offset cannot be referenced from a static context"

Two related fixes:

1. When _parse_optimization_source extracts helpers, it now skips any method
   whose class_name differs from the target method's class_name.

2. The function now accepts an optional target_class_name parameter.  When
   there are multiple methods with the same name in the generated code (e.g.
   an abstract outer-class method and the concrete inner-class override), the
   method in the target class is preferred over outer-class methods.

Fixes the Unpacker.ObjectUnpacker.getString regression from codeflash_all_3.log.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:59:31 +00:00
misrasaurabh1
cd02dec79f test: use full string equality in anonymous iterator test
Replace substring-based assertions with a single exact string
comparison in test_anonymous_iterator_methods_not_hoisted_to_class,
matching the convention used elsewhere in the test file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:38:42 +00:00
Kevin Turcios
e67cdb952e fix paths 2026-03-02 17:33:11 -05:00
Kevin Turcios
399441edd2 one more 2026-03-02 17:08:41 -05:00
Kevin Turcios
130953aab1 one more 2026-03-02 17:00:16 -05:00
Kevin Turcios
2a2b42194e fix: resolve remaining Java test failures
- Fix language detection in code_replacer to use lang_support.language
  (was None when function_to_optimize absent, blocking Java class member insertion)
- Update discover_functions calls in test_integration.py to pass source param
- Remove inner_iterations kwarg from test_run_and_parse.py (handled internally)
2026-03-02 16:33:53 -05:00
Kevin Turcios
67b422ffd3 fix: add @property to JavaSupport.function_optimizer_class, prek format fixes 2026-03-02 16:24:10 -05:00
Kevin Turcios
754727c8f2 fix: resolve e2e failures (path, pass_fail_only, Java context, codeflash.toml)
- Use os.path.relpath for main.py path in e2e tests
- Remove pass_fail_only kwarg from JS/Java function optimizers
- Fix Java e2e test to use JavaFunctionOptimizer for code context
- Detect codeflash.toml in e2e test runner (not just pyproject.toml)
2026-03-02 16:08:21 -05:00
Kevin Turcios
61bcc37449 Update test_java_e2e.py 2026-03-02 16:04:03 -05:00
Kevin Turcios
83831ac25e fix: resolve e2e test path and pass_fail_only issues
- Use os.path.relpath for main.py path (works for any cwd depth)
- Remove pass_fail_only kwarg from JS/Java compare_test_results fallback
  (main removed this parameter from equivalence.compare_test_results)
2026-03-02 16:01:46 -05:00
Kevin Turcios
f7fd593de3 fix: resolve remaining test failures after main sync
- Fix min/max_outer_loops → pytest_min/max_loops in Java test_run_and_parse
- Update test_replacement.py for new replace_function_definitions_for_language API
- Update JavaSupport.discover_functions signature to match protocol
- Migrate _get_java_sources_root/_fix_java_test_paths to JavaFunctionOptimizer
- Fix test_java_tests_project_rootdir to use set_current_language
2026-03-02 15:47:23 -05:00
Kevin Turcios
e7687f2448 fix: rename min/max_outer_loops to pytest_min/max_loops and add Java cleanup patterns
Omni-java tests used renamed params that don't match main's API.
Also adds Java instrumented file patterns to leftover cleanup regex.
2026-03-02 15:40:26 -05:00
Kevin Turcios
a14bd09fdc fix: update Java tests for protocol-dispatch refactoring
Updates canary test to check JavaFunctionOptimizer instead of base
function_optimizer (comparison logic moved to subclass). Renames
min/max_outer_loops back to pytest_min/max_loops to match main's API.
2026-03-02 15:36:40 -05:00
Kevin Turcios
f37b37209c fix: update Java test_replacement import for moved code_replacer module 2026-03-02 15:32:57 -05:00
Kevin Turcios
bd3ec8f09d test: sync dual-changed test files from main with omni-java fixes
Updates inject_profiling_into_existing_test calls to include test_string
parameter. Takes main's test refactoring for multi-file code replacement
and codeflash capture.
2026-03-02 15:30:16 -05:00
Kevin Turcios
19bd6e4bad test: sync test files from main (safe, main-only changes)
34 test files updated with main's refactored tests for new language
support protocol, JS/TS improvements, and code context extraction.
2026-03-02 15:25:50 -05:00
Sarthak Agarwal
c53740df2e
Merge branch 'main' into fix/jest-junit-and-misc 2026-03-02 22:46:13 +05:30
Sarthak Agarwal
d5dba8ce71 add mocha support 2026-03-02 22:44:48 +05:30
Sarthak Agarwal
b0afe2ef9c feat: add skip_confirm and skip_api_key params to JS init for non-interactive mode
Allow init_js_project(), should_modify_package_json_config(), and
collect_js_setup_info() to run without interactive prompts when
skip_confirm=True. Uses auto-detected defaults instead of prompting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:03:46 +05:30
Sarthak Agarwal
0490b221f2 fix: raise clear error for unsupported JS test frameworks instead of silent fallback
Add NotImplementedError guard in all 3 test dispatchers (behavioral,
benchmarking, line-profile) for frameworks other than jest and vitest.
Previously, mocha and other frameworks silently fell through to Jest,
causing confusing failures. Now users get a clear error message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:03:13 +05:30
Sarthak Agarwal
4bf8cb7412 feat: discover object methods exported via CJS module.exports = variable
Resolve `module.exports = varName` where varName is an object literal
containing methods. For patterns like `const utils = { match() {} };
module.exports = utils;`, the individual methods are now recognized as
exported. This fixes function discovery for CJS libraries like Moleculer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:01:56 +05:30
Sarthak Agarwal
76d994e87c feat: discover const arrow functions exported via named export clauses
Post-process find_functions() to mark functions as exported when they appear
in named export clauses like `export { joinBy }`. This fixes discovery for
TypeScript codebases (e.g., Strapi) that define const arrow functions and
export them via a separate export statement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:00:33 +05:30
Sarthak Agarwal
3c2a2b3694 feat: bundle JUnit XML reporter for Jest, replacing external jest-junit dependency
Ship a zero-dependency jest-reporter.js inside the codeflash runtime package
instead of requiring the external jest-junit npm package. This ensures the
reporter is always available when codeflash is installed, fixing Jest-based
projects (Strapi, Moleculer) that failed because jest-junit wasn't installed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:59:41 +05:30
Kevin Turcios
04a94f2b03 test: update tests for refactored language support
- Update discover_functions calls to new (source, file_path) signature
- Use language-specific FunctionOptimizer subclasses in tests
- Add explicit utf-8 encoding to read_text()/write_text() for Windows
- Fix pytest fixture in TestTsJestSkipsConversion (was __init__)
- Update nonexistent file tests for source-based discover_functions
- Remove unused imports
2026-03-02 06:09:06 -05:00
Kevin Turcios
b6af185998 fix: split discovery and instrumentation log messages for E2E harnesses
Log "Discovered N existing unit test files" after counting tests, and
"Instrumented N existing unit test files" after injecting profiling.
Python E2E harness matches "Discovered", JS harness matches "Instrumented".
2026-03-02 02:16:50 -05:00
Kevin Turcios
6a916ac83f fix: address review feedback on PythonFunctionOptimizer extraction
- Add clarifying comment on shared replace_function_definitions_in_module import
- Remove misleading alias in test_unused_helper_revert.py, use PythonFunctionOptimizer directly
- Align base line_profiler_step return type to dict[str, Any]
- Fix latent bug: handle non-empty TestResults in line_profiler_step
2026-03-01 23:51:43 -05:00
Kevin Turcios
a55841b978 fix: use PythonFunctionOptimizer in tests that depend on Python-specific hooks 2026-03-01 23:22:19 -05:00
misrasaurabh1
2371540386 fix: guard Java replacement against wrong-method-name candidates and anonymous-class method hoisting
Two bugs in _parse_optimization_source (replacement.py) caused Maven compilation
failures when codeflash optimised aerospike-client-java:

Bug 1 – standalone method with wrong name replaces target
When the LLM generated a standalone method whose name did not match the
optimisation target (e.g. generated `unpackMap` for target `unpackObjectMap`,
or generated `sizeTxn` for target `estimateKeySize`), the function fell back to
using the entire generated snippet as `target_method_source`.  This silently
replaced the target with the wrong method, producing:
  • a duplicate definition of the wrong method
  • removal of the target method (breaking all callers)

Fix: after parsing standalone (class-free) code, verify that at least one
discovered method matches the target name.  If no match is found, set
`target_method_source` to the empty string and log a warning.  A corresponding
guard in `replace_function` returns the original source unchanged when
`target_method_source` is empty.

The same guard is applied to the full-class path: if the generated class does
not contain the target method, the candidate is also rejected.

Bug 2 – anonymous inner-class methods hoisted as top-level helpers
When an optimised method returned an anonymous class (e.g. `keySetIterator`
returning `new Iterator<LuaValue>() { … }`), tree-sitter's recursive walk
found the anonymous class's `hasNext`, `next`, and `remove` method_declaration
nodes and classified them as helpers to be inserted at the outer-class level.
The inserted methods carried `@Override` annotations that matched nothing in the
outer class and referenced local variables (`it`) that were only in scope inside
the optimised method body, producing compilation errors.

Fix: when extracting helpers from the optimised class, skip any method whose
line range is entirely contained within the target method's line range.  Such
methods belong to anonymous/nested classes inside the method body and must not
be hoisted out as standalone class members.

Tests added for both bugs in TestWrongMethodNameGeneration and
TestAnonymousInnerClassMethods.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 03:35:42 +00:00
Kevin Turcios
64de536471 fix: restore base replace_function_and_helpers, fix test imports, move ast to TYPE_CHECKING
- Base class keeps the language-routing replacement logic (used by both
  Python and JS); Python subclass adds unused-helper revert on top via super()
- Tests that exercise Python-specific replace+revert use PythonFunctionOptimizer
- Move `ast` to TYPE_CHECKING in optimizer.py (fixes prek)
2026-03-01 21:30:19 -05:00
misrasaurabh1
e1fb4b81e8 fix runtime calculation for java 2026-02-28 21:35:45 -08:00
Saurabh Misra
86202d40e5
Merge pull request #1690 from codeflash-ai/fix/comparator-itertools-count
fix: handle itertools types in comparator with Python 3.9-3.14 support
2026-02-27 15:21:59 -08:00
aseembits93
3a33fe43a4 fix: support Python 3.10-3.14 in comparator itertools tests
Handle itertools.cycle on Python 3.14 where __reduce__ was removed by
falling back to element-by-element sampling. Add version guards for
pairwise (3.10+) and batched (3.12+) tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:05:36 +05:30
aseembits93
eeda6c2d32 fix: handle all remaining itertools types in comparator
Add a catch-all handler for itertools iterators (chain, islice, product,
permutations, combinations, starmap, accumulate, compress, dropwhile,
takewhile, filterfalse, zip_longest, groupby, pairwise, batched, tee).
Uses module check (type.__module__ == "itertools") so it automatically
covers any itertools type without version-specific enumeration. groupby
gets special handling to also materialize its group iterators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:14:59 +05:30
aseembits93
456a18837b fix: handle itertools.repeat and itertools.cycle in comparator
itertools.repeat uses repr() comparison (same approach as count).
itertools.cycle uses __reduce__() to extract internal state (saved items,
remaining items, and first-pass flag) since repr() only shows a memory
address. The __reduce__ approach is deprecated in 3.14 but is the only
way to access cycle state without consuming elements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:06:46 +05:30
aseembits93
12e8f7009c fix: handle itertools.count in comparator
The comparator had no handler for itertools.count (an infinite iterator),
causing it to fall through all type checks and return False even for
equal objects. Use repr() comparison which reliably reflects internal
state and avoids the __reduce__ deprecation coming in Python 3.14.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:01:06 +05:30
Kevin Turcios
91cf6ea2aa fix: update tests for consolidated CST discovery behavior
Nested functions are now skipped by FunctionVisitor, and
discover_functions no longer swallows parse/IO errors — callers
handle them. Update test expectations accordingly.
2026-02-27 12:08:42 -05:00
Kevin Turcios
94755489e7 should be tests 2026-02-27 09:23:11 -05:00
Kevin Turcios
5cee1b5b48 feat: improve test generation context for external library types
Extend extract_parameter_type_constructors to scan function bodies for
isinstance/type() patterns and collect base class names from enclosing
classes. Add one-level transitive stub extraction so the LLM also sees
constructor signatures for types referenced in __init__ parameters.

In enrich_testgen_context, branch on source: project classes get full
definitions, third-party (site-packages) classes get compact __init__
stubs to avoid blowing token limits.
2026-02-26 09:40:04 -05:00
HeshamHM28
9b6d645ab6 fix: update profiling logic and improve loop index handling in tests 2026-02-25 07:43:14 +02:00
HeshamHM28
24d38b6fae Merge branch 'omni-java' into fix/java/line-profiler 2026-02-25 07:09:00 +02:00
aseembits93
5c829cd4de test: compare final_content to complete expected output string
Replace substring assertions with exact equality check against the full
expected output (EXPECTED_OUTPUT constant). Extract shared setup into a
run_replacement helper.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:16:59 +05:30
aseembits93
14bdc3c1cf fix: detect attribute-referenced methods as used in unused helper detection
detect_unused_helper_functions only walked ast.Call nodes, missing methods
referenced via attribute assignment (e.g., self._parse1 = self._parse_literal).
This caused optimized helper methods used as callbacks to be incorrectly
reverted to their original code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:09:51 +05:30
Kevin Turcios
f747b66252 missed there 2026-02-23 05:52:43 -05:00
Kevin Turcios
94a99a980f fix failing test 2026-02-23 05:43:16 -05:00
Kevin Turcios
d8582c328a fix: handle __slots__-only objects in comparator
Objects with __slots__ but no __dict__ (e.g. textual.cache.LRUCache)
fell through all comparator branches, logging "Unknown comparator input
type" and returning False — causing spurious test mismatches.
2026-02-23 05:29:40 -05:00
misrasaurabh1
62db2360b5 fix: use correct iteration_id format for Java performance mode
Changed iteration_id in performance mode markers to properly encode
inner loop iterations for test case grouping:

- Single call: iteration_id = innerIteration (0, 1, 2...)
- Multiple calls: iteration_id = callId_innerIteration (1_0, 1_1, 2_0, 2_1...)

This allows test results to be properly grouped by InvocationId, where
each unique (call, inner_iteration) pair gets its own group for
calculating minimum runtimes across outer loops.

Fixed test expectations to match the new format.

All 43 Java performance tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:42:26 -08:00
Kevin Turcios
a2764084bd test: add regression tests for comment position preservation 2026-02-23 01:12:49 -05:00
Kevin Turcios
6c4378db51 fix: preserve comment position by passing CST module directly to import adder
parse_code_and_prune_cst now returns cst.Module instead of str.
add_needed_imports_from_module accepts cst.Module | str, skipping re-parse
when a Module is passed. This eliminates the string round-trip that caused
comments to migrate from statement leading_lines to Module.header,
resulting in comments appearing above imports instead of at their
original position.
2026-02-23 01:08:39 -05:00
misrasaurabh1
67c4d34813 fix: Java loop ID calculation and assertion transformer bug
Implemented CUDA-style loop ID calculation for performance mode:
- loopId = outerLoop * maxInnerIterations + innerIteration
- Behavior mode uses simple loop index (no inner iterations)
- Invocation ID simplified to call counter only
- Default CODEFLASH_INNER_ITERATIONS set to 10

Fixed critical bug in JavaAssertTransformer:
- Removed duplicate _special_re assignment that was missing parentheses
- Combined patterns into single regex: [\"'{}()]
- This fixes _find_balanced_parens and enables assertion transformation

Updated test expectations to match new marker format and loop ID calculation.

All 41 Java instrumentation tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 21:04:15 -08:00
Kevin Turcios
1689a7bbb5 perf: cache module scan in _clear_lru_caches and expand test coverage
Cache inspect.getmembers() results per module so repeated loop
iterations skip the expensive rescan. Add tests for get_runtime_from_stdout,
should_stop, _set_nodeid, _get_total_time, _timed_out, logreport, and
setup/teardown hooks.
2026-02-22 01:17:05 -05:00
Kevin Turcios
5a73fe2101
Merge pull request #1617 from codeflash-ai/comparator-uniontype
fix: add types.UnionType support to comparator
2026-02-22 05:09:01 +00:00
Kevin Turcios
c6fbdfa535 chore: merge main into fixes-for-core-unstructured-experimental 2026-02-21 00:57:33 -05:00
Kevin Turcios
c1703a2d71 Revert "commit"
This reverts commit 2966e15775.
2026-02-21 00:50:31 -05:00
Kevin Turcios
2966e15775 commit
feat: extend testgen type context to include function body references

Extract types referenced in the function body (constructor calls, attribute
access, isinstance/issubclass args) in addition to parameter annotations.
Use full class extraction instead of init-stub-only, with instance resolution
fallback and project/site-packages filtering.
2026-02-21 00:50:04 -05:00
misrasaurabh1
7fa7eeabfe instrumentation bugs with multiple function calls 2026-02-20 21:16:07 -08:00
HeshamHM28
49f89fbe2c merge to main 2026-02-21 01:49:31 +02:00
Mohamed Ashraf
f06acba354 fix: add test method name to Java stdout markers for unique identification
Java stdout markers now include the test method name in the class field
(e.g., "TestClass.testMethod") matching the Python marker format. The
parser extracts the test method name from this combined field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 20:13:04 +00:00
Mohamed Ashraf
0001fb5921 fix: store actual test method name in SQLite for Java behavior tests
The instrumented Java test code was storing "{class_name}Test" as the
test_function_name in SQLite instead of the actual test method name
(e.g., "testAdd"). This fixes parity with Python instrumentation.

- Add _extract_test_method_name() with compiled regex patterns
- Inject _cf_test variable with actual method name in behavior code
- Fix setString(3, ...) to use _cf_test instead of hardcoded class name
- Optimize _byte_to_line_index() with bisect.bisect_right()
- Update all behavior mode test expectations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 20:12:58 +00:00
aseembits93
002189582b fix: add types.UnionType support to comparator
The comparator did not recognize `types.UnionType` (Python 3.10+ `X | Y`
syntax), causing it to fall through to "Unknown comparator input type".
Conditionally include it in the equality-checked types tuple.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 00:30:27 +05:30
Kevin Turcios
0dbb3a47a2 not used 2026-02-20 08:47:27 -05:00
misrasaurabh1
eb7c1f00d5 more lenient testing 2026-02-20 00:26:48 -08:00
misrasaurabh1
c9cb60a21d test: relax Java timing tolerances to account for JIT warmup
Increase tolerance for individual timing measurements from ±2% to ±5%
to accommodate JIT warmup effects where first iterations run slower
than subsequent optimized runs. Maintain ±2% tolerance for
total_passed_runtime since it uses minimums that filter out cold starts.

- CV threshold: 0.02 → 0.05 (5%)
- Mean runtime: ±2% → ±5%
- total_passed_runtime: ±2% (unchanged, using filtered minimums)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 21:17:48 -08:00
HeshamHM28
5b00ab370b Refactor Java line profiler integration tests
- Rename test class to TestLineProfilerInstrumentation for clarity.
- Add tests for instrumenting Java classes with and without package declarations.
- Enhance instrumentation tests to verify that source files remain unmodified.
- Implement checks for generated configuration files, ensuring correct content and structure.
- Introduce tests for deeply nested packages and verify line contents extraction.
- Add end-to-end tests for spin-timer profiling, validating timing accuracy and hit counts.
2026-02-20 06:59:26 +02:00
misrasaurabh1
2353fb2b86 test: add comprehensive Java run-and-parse integration tests
Add end-to-end tests for Java test instrumentation, execution, and result
parsing, covering both behavior and performance testing modes.

Key additions:
- PreciseWaiter: monotonic timing implementation with <2% variance
- 3 behavior tests: single/multiple test methods, return value validation
- 2 performance tests: timing accuracy, inner/outer loop counts
- Validation of total_passed_runtime() aggregation

Infrastructure improvements:
- Add inner_iterations parameter to benchmarking call chain
- Rename pytest_* parameters to language-agnostic names:
  - pytest_min_loops → min_outer_loops
  - pytest_max_loops → max_outer_loops
  - pytest_inner_iterations → inner_iterations
- Pass inner_iterations from tests through function_optimizer → test_runner → language_support

All tests validate timing accuracy (±2%), variance (<2% CV), and correct
result grouping by test case including iteration_id.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 20:57:29 -08:00
claude[bot]
f321f836f5 fix: add from __future__ import annotations for Python 3.9 compat
The `list[X] | None` union syntax (PEP 604) requires Python 3.10+ at
runtime. Adding the future annotations import defers evaluation and
fixes the import error on Python 3.9.

Co-authored-by: Saurabh Misra <misrasaurabh1@users.noreply.github.com>
2026-02-20 03:32:33 +00:00
claude[bot]
4730f342a1 style: auto-fix formatting and mypy type annotations 2026-02-20 03:24:31 +00:00
misrasaurabh1
df67ec305a better coverage numbers 2026-02-19 19:20:15 -08:00
Kevin Turcios
c74782757b Merge commit '6346c740' into sync-main-batch-4
# Conflicts:
#	.github/workflows/windows-unit-tests.yml
#	codeflash/code_utils/config_consts.py
#	codeflash/code_utils/instrument_existing_tests.py
#	codeflash/languages/python/context/unused_definition_remover.py
#	codeflash/languages/python/static_analysis/code_replacer.py
#	codeflash/optimization/function_optimizer.py
#	codeflash/optimization/optimizer.py
#	pyproject.toml
2026-02-19 21:26:23 -05:00
Kevin Turcios
7c7eeb5bc9 fix: update test import for moved code_context_extractor module 2026-02-19 20:39:42 -05:00
Kevin Turcios
85d1d4fbf6 Merge commit '6020c4fa' into sync-main-batch-3 2026-02-19 20:33:09 -05:00
Kevin Turcios
c66953d110 Merge commit 'd578d996' into sync-main-batch-2
# Conflicts:
#	codeflash/github/PrComment.py
#	codeflash/optimization/function_optimizer.py
#	codeflash/tracer.py
#	codeflash/verification/parse_test_output.py
#	codeflash/verification/verification_utils.py
2026-02-19 20:27:14 -05:00
Kevin Turcios
7d7a2a21c0 Merge commit '3dd19c62' into sync-main-batch-1
# Conflicts:
#	codeflash/optimization/function_optimizer.py
#	codeflash/verification/verification_utils.py
#	codeflash/version.py
2026-02-19 20:10:05 -05:00
misrasaurabh1
72afada84c fix: correct field ordering, helper placement, indentation, and blank lines in Java code replacer
Four bugs in _insert_class_members / replace_function:
1. Extra indentation on injected methods (textwrap.dedent now normalises source before re-indenting)
2. New fields were prepended before existing ones (now inserted after the last existing field)
3. Helper methods were always appended at end of class (now placed before/after target based on their position in the optimised code)
4. No blank lines between consecutively injected helpers (each helper is now followed by a blank line)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 16:35:50 -08:00
Kevin Turcios
09a0a33aa8 fix: update stale @patch paths in test_is_numerical_code.py 2026-02-19 05:00:38 -05:00
Kevin Turcios
ef99747697 refactor: move code_extractor, code_replacer to languages/python/static_analysis/ 2026-02-19 03:21:34 -05:00
Kevin Turcios
ae2ee47142 refactor: move line_profile_utils, edit_generated_tests to languages/python/static_analysis/ 2026-02-19 03:18:43 -05:00
Kevin Turcios
9431952ca2 refactor: move static_analysis, concolic_utils, coverage_utils to languages/python/static_analysis/ 2026-02-19 03:16:51 -05:00
Kevin Turcios
cb91158312 refactor: rename test file and imports to match reference graph rename 2026-02-19 02:30:27 -05:00
Kevin Turcios
2652e71617 Merge remote-tracking branch 'origin/main' into call-graphee
# Conflicts:
#	.codex/skills/.gitignore
#	.gemini/skills/.gitignore
#	codeflash/languages/python/context/code_context_extractor.py
2026-02-19 01:05:42 -05:00
claude[bot]
89952791be style: add return type annotations to test methods 2026-02-19 03:56:13 +00:00
Kevin Turcios
b19e1bda00 perf: replace backtracking regexes with character-class patterns in parse_test_output
Replace lazy `.*?` quantifiers in matches_re_start/matches_re_end with
negated character classes (`[^:]`, `[^#]`, `[^.:]`) to eliminate
quadratic backtracking. Replace per-line regex search for the pytest
FAILURES header with a simple `"= FAILURES =" in line` string check.
Add tests for the regex patterns and failure header detection.
2026-02-18 22:19:02 -05:00
Kevin Turcios
6a5c9c1b40 test: update expected context values for statement-type helpers 2026-02-18 21:07:03 -05:00
Mohamed Ashraf
0e753e199a fix: improve Java type context sent to AI for test generation
- Increase imported type skeleton token budget from 2000 to 4000
- Add constructor signature summary headers to skeleton output
- Expand wildcard imports (e.g., import com.foo.*) into individual types
  instead of silently skipping them
- Prioritize skeleton processing for types referenced in the target method
  so parameter types are guaranteed context before less-critical types
- Fix invalid [no-arg] annotation in constructor summaries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:55:37 +00:00
Kevin Turcios
b269212edd context extraction imporvements 2026-02-18 09:09:36 -05:00
claude[bot]
7f5e163e38 fix: resolve mypy attr-defined errors in new test functions 2026-02-18 13:46:54 +00:00
Kevin Turcios
bfcfa44d15 fix: correct pre-existing test failures in test_code_context_extractor
Fix 10 failing tests: remove wrong assertions expecting import statements
inside extracted class code, use substring matching for UserDict class
signature, and rewrite click-dependent tests as project-local equivalents.
Add tests for resolve_instance_class_name, enhanced extract_init_stub_from_class,
and enrich_testgen_context instance resolution.
2026-02-18 08:43:42 -05:00
Kevin Turcios
d480061064 temp 2026-02-18 08:26:37 -05:00
Kevin Turcios
2367b4c02c feat: extract parameter type constructor signatures into testgen context
Add enrichment step that parses FTO parameter type annotations, resolves
types via jedi (following re-exports), and extracts full __init__ source
to give the LLM constructor context for typed parameters.
2026-02-18 07:51:53 -05:00
Kevin Turcios
68c148c876 Merge branch 'main' into fixes-for-core-unstructured-experimental 2026-02-18 07:23:47 -05:00
Kevin Turcios
aa1082338c
Merge branch 'main' into call-graphee 2026-02-18 11:34:32 +00:00
Saurabh Misra
be7f6fc243
Merge pull request #1521 from codeflash-ai/java-fixes-instrumentation
Java fixes instrumentation
2026-02-18 02:21:38 -08:00
mashraf-222
0b8284f519
Merge pull request #1514 from codeflash-ai/fix/java-e2e-critical-bugs
fix: resolve 4 critical Java E2E pipeline bugs
2026-02-18 12:05:48 +02:00
Kevin Turcios
a2238168a3 refactor: remove 13 unused functions from code_context_extractor
Remove safe_relative_to, resolve_classes_from_modules,
extract_classes_from_type_hint, resolve_transitive_type_deps,
extract_init_stub, _is_project_module_cached, is_project_path,
_is_project_module, extract_imports_for_class,
collect_names_from_annotation, is_dunder_method, _qualified_name,
and _validate_classdef. Inline trivial helpers into prune_cst and
clean up enrich_testgen_context and get_function_sources_from_jedi.
Remove corresponding tests.
2026-02-18 05:03:54 -05:00
Mohamed Ashraf
ed767da78c test: add Bug 4 early exit tests and strengthen Bug 3 edge case coverage
Bug 4 (candidate_early_exit.py - 6 tests):
- All tests failed → 0 total passed (guard triggers)
- Some tests passed → nonzero (guard does not trigger)
- Empty results → 0 passed (guard triggers)
- Only non-loop1 results → ignored by report (guard triggers)
- Mixed test types all failing → 0 across all types
- Single passing among many failures → prevents early exit

Bug 3 edge cases (context.py - 8 tests):
- Wildcard imports are skipped (class_name=None)
- Import to nonexistent class returns None skeleton
- Skeleton output is well-formed Java (has braces)
- Protected and package-private methods excluded
- Overloaded public methods all extracted
- Generic method signatures extracted correctly
- Round-trip: _extract_type_skeleton → _format_skeleton_for_context
- Round-trip with real MathHelper fixture file

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:01:44 +00:00
Mohamed Ashraf
d236d5dd33 test: add tests for imported type skeleton extraction
Add 13 tests covering:
- get_java_imported_type_skeletons(): internal import resolution,
  method signature extraction, external import filtering, deduplication,
  empty input handling, and token budget enforcement
- _extract_public_method_signatures(): public method extraction,
  constructor exclusion, empty class handling, class name filtering
- _format_skeleton_for_context(): basic class formatting, enum
  constants, empty class edge case

Also resolve merge conflict from PR #1515 optimization (bytes-based
single-pass method signature extraction).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:54:56 +00:00
Kevin Turcios
6c092b5e7f fix: update expected coverage lines for optimized async e2e code
The optimized code removes `import time`, shifting all function lines
up by 1. Update expected_lines from [10-20] to [9-19] to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:45:51 +00:00
KRRT7
64a18c9870 refactor: use helper file for async decorator instrumentation
Replace inline code injection with a helper file approach that writes
decorator implementations to a separate codeflash_async_wrapper.py file.
This removes the codeflash package import dependency from instrumented
source files while keeping line numbers stable (only 1 import + 1
decorator line added, same as before).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 08:17:08 +00:00
misrasaurabh1
f52a0b704b multiple function calls in the same test case 2026-02-17 23:49:14 -08:00
KRRT7
325ec7d741 refactor: inline async decorators to remove codeflash import dependency
Instead of injecting `from codeflash.code_utils.codeflash_wrap_decorator import ...`
into instrumented source files, inject the decorator function definitions directly.
This removes the hard dependency on the codeflash package being importable at runtime
in the target environment, matching the pattern already used for sync instrumentation.
2026-02-18 06:30:16 +00:00
misrasaurabh1
c9813c0a84 Merge branch 'omni-java' into instrumentation-fixes 2026-02-17 20:32:50 -08:00
misrasaurabh1
8df8076d44 fix the line profiler implementation 2026-02-17 20:31:00 -08:00
misrasaurabh1
b20e1bf281 Line profile parsing for java 2026-02-17 19:37:32 -08:00
HeshamHM28
c4da93c41b fix windows tests 2026-02-18 05:13:46 +02:00
misrasaurabh1
68d8bf7d19 refine some behavioral instrumentation 2026-02-17 19:05:16 -08:00
misrasaurabh1
e534c6c927 Attempt at fixing performance instrumentation 2026-02-17 18:48:00 -08:00
HeshamHM28
dc1083b3f9 add java jdk 2026-02-18 03:47:42 +02:00
HeshamHM28
ce442e38ce Merge branch 'omni-java' into fix/java/e2e/test 2026-02-18 02:13:15 +02:00
HeshamHM28
6ee61fd383 fix tests 2026-02-18 02:08:43 +02:00
mashraf-222
7a2a48bb38
Merge pull request #1511 from codeflash-ai/fix/java-stale-line-numbers-all-mode
fix: use tree-sitter name-based lookup for Java function extraction
2026-02-18 01:49:04 +02:00
HeshamHM28
60a28c0843 prek 2026-02-17 23:27:05 +02:00
HeshamHM28
22541e085a Replace the substring with the entire codebase. 2026-02-17 23:23:51 +02:00
Mohamed Ashraf
09374c1a72 fix: use tree-sitter name-based lookup for Java function extraction
In --all mode, stale line numbers in FunctionToOptimize caused
InvalidJavaSyntaxError when a prior optimization modified the same file.
Now extract_function_source re-parses with tree-sitter to find methods
by name, matching how Python (jedi) and Java replacement already work.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 19:39:16 +00:00
HeshamHM28
fb6de47c1f fix nested targeted function got removed 2026-02-17 21:32:37 +02:00
Kevin Turcios
11543f0d0b fix: use (file_path, qualified_name) key in count_callees_per_function
Bare qualified_name keys could collide across files (e.g. `helper` in
both a.py and b.py), causing counts to be silently overwritten.
2026-02-16 23:03:05 -05:00
Kevin Turcios
0bcc483a95 Merge branch 'main' into call-graphee 2026-02-16 22:50:24 -05:00
Kevin Turcios
bba3e0aa4d Merge branch 'main' into call-graphee 2026-02-16 22:43:03 -05:00
HeshamHM28
83f335ed04 fix asserts 2026-02-17 03:29:58 +02:00
Kevin Turcios
547c02e8bc refactor: move context extraction modules to languages/python/context/
Move code_context_extractor.py and unused_definition_remover.py from
codeflash/context/ to codeflash/languages/python/context/ and update
all import sites.
2026-02-16 14:49:04 -05:00
Kevin Turcios
fa00422fea refactor: simplify and deduplicate code_context_extractor
Consolidate three enricher functions (get_imported_class_definitions,
get_external_base_class_inits, get_external_class_inits) into a single
enrich_testgen_context that parses code context once. Extract shared
helpers, unify prune_cst variants, deduplicate loop bodies, and remove
dead UsedNameCollector class.
2026-02-16 13:34:07 -05:00
HeshamHM28
99f77da4eb Fix falling tests 2026-02-16 20:11:28 +02:00
HeshamHM28
cbc48a1811 add bytes Kryo-serialized for unit test 2026-02-16 09:11:50 +02:00
HeshamHM28
4303ffc24f Fix Bytes test 2026-02-16 09:05:17 +02:00
HeshamHM28
2df9f024c3 Refactor FunctionInfo parameters in Java tests for clarity 2026-02-16 08:49:42 +02:00
HeshamHM28
ca4f01f7c5 Add Java end to end tests 2026-02-16 08:43:51 +02:00
HeshamHM28
4c976415ef Replace Regex with tree-sitter 2026-02-16 08:32:55 +02:00
Kevin Turcios
83c6d5cdd2 fix: import jest patterns from source module instead of re-export
The formatter correctly removed the unused re-exports from
parse_test_output.py. Update the test to import directly from
codeflash.languages.javascript.parse.
2026-02-13 09:55:14 -05:00
Kevin Turcios
e837ad9d17 feat: resolve transitive type dependencies in get_external_class_inits
Add BFS-based transitive resolution so that classes referenced in __init__
type annotations of imported external classes are also extracted. This gives
the LLM the constructor signatures it needs to instantiate parameter types.
2026-02-13 09:35:30 -05:00
claude[bot]
8eb1c86245 fix: resolve mypy union-attr error in test_get_external_class_inits
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 14:09:28 +00:00
Kevin Turcios
f4c0208f49 test: add unit tests for get_external_class_inits
Tests cover: extracting __init__ from site-packages classes (click.Option),
skipping project classes, non-classes, already-defined classes, builtins,
classes with trivial object.__init__, and empty import scenarios.
2026-02-13 09:03:09 -05:00
Kevin Turcios
e962f73125 Merge branch 'main' into call-graphee 2026-02-13 07:19:08 -05:00
Sarthak Agarwal
5b43dc903e
Merge branch 'main' into fix/js-jest30-loop-runner 2026-02-13 17:16:03 +05:30
claude[bot]
7337d030fa style: auto-fix linting issues 2026-02-12 22:03:57 +00:00
Kevin Turcios
fe6363556b fix: filter test_*.py files and pytest fixtures from optimization
When tests_root overlaps with module_root (e.g., both set to "."),
the pattern matching in is_test_file() missed Python's standard
test_*.py naming convention and conftest.py files. Also adds pytest
fixture filtering in the libcst FunctionVisitor to prevent fixtures
from being discovered as optimizable functions.
2026-02-12 16:59:48 -05:00
ali
9937fe0967
fixes for unit tests 2026-02-12 19:30:46 +02:00
ali
6b77be56ef
ignore calls inside string litrals for instrumentation and fix e2e test 2026-02-12 18:14:33 +02:00
ali
e07fd1d439
fix tests 2026-02-12 17:20:32 +02:00
ali
e9b7154361
Merge branch 'main' of github.com:codeflash-ai/codeflash into fix/js-jest30-loop-runner 2026-02-12 16:39:33 +02:00
Kevin Turcios
c3fdf31a96 refactor: use batch count_callees_per_function for dependency ranking and summary 2026-02-12 01:11:39 -05:00
Kevin Turcios
fc42548f9f test: update token limit tests for 64K default 2026-02-12 01:03:34 -05:00
Kevin Turcios
9e904483d8 fix: use explicit token limits in tests to decouple from global constant 2026-02-12 01:02:21 -05:00
Kevin Turcios
13f8490c96 Merge branch 'main' into call-graphee 2026-02-12 00:15:41 -05:00
claude[bot]
773e5a55ca style: fix mypy type annotation in test coverage utils 2026-02-12 04:26:57 +00:00
Kevin Turcios
1181f6a2ac fix: use qualified_name for coverage function identification
The coverage system was using bare function_name (e.g., "__init__")
instead of qualified_name (e.g., "HttpInterface.__init__"), causing
it to match the wrong class's method when multiple classes define
the same method name (like __init__).

Changes:
- function_optimizer.py: pass qualified_name to parse_test_results
- build_fully_qualified_name: skip re-qualifying already-qualified names
- extract_dependent_function: compare using bare name from qualified input
- grab_dependent_function_from_coverage_data: replace substring match with
  exact or dot-bounded suffix match
2026-02-11 23:24:18 -05:00
Kevin Turcios
f3f0b0e020 refactor: move CallGraph into Python language support layer
Add DependencyResolver protocol and IndexResult to base.py, move
call_graph.py to languages/python/, and use factory method in optimizer
instead of is_python() gating.
2026-02-11 20:48:34 -05:00
Mohamed Ashraf
7f66a176d5 fix: update large number comparison test for float precision limits
- test_large_number_different now expects equivalent=True for 99999999999999999 vs 99999999999999998
- Both numbers convert to 1e+17 as floats, making them indistinguishable
- Added test_large_number_significantly_different to verify detection of actual differences
- This is a known limitation of floating-point comparison for very large integers

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-11 13:37:45 +00:00
Mohamed Ashraf
fc26b4b1e3 fix: update failing unit tests to match current behavior
Fixed two test failures in omni-java:

1. test_formatter_cmds_non_existent:
   - Default formatter-cmds changed from ["black $file"] to [] (commit c587c475)
   - Updated test expectation to match new default
   - Formatter detection now handled by project detector
   - Empty list prevents "Could not find formatter: black" errors for Java projects

2. test_float_values_slightly_different:
   - Python comparator now uses math.isclose(rel_tol=1e-9) for numeric comparison (commit 98a5a438)
   - Updated test to expect equivalent=True for values within epsilon tolerance
   - Added test_float_values_significantly_different to verify detection of actual differences
   - Test added before epsilon-based comparison was implemented, causing mismatch

Both tests now pass and accurately reflect current codebase behavior.

Test results: 2 fixed tests passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-11 13:33:52 +00:00
Mohamed Ashraf
abf2c98994 chore: merge omni-java into fix/java-exception-assignment-instrumentation
Resolved conflicts by merging the best of both branches:
- Kept exception_class field from PR for better exception type detection
- Adopted more general variable assignment detection from omni-java
- Combined exception replacement logic to use exception_class with fallback
- Added double catch (specific exception + generic Exception) for robustness
- Merged test cases from both branches with updated expectations

Changes:
- Updated AssertionMatch to include all fields: assigned_var_type, assigned_var_name, exception_class
- Lambda extraction now works for all exception assertions
- Exception class extraction specifically for assertThrows
- Variable assignment detection handles final modifier and fully qualified types
- Exception replacement uses exception_class or falls back to assigned_var_type
- All 80 tests passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-11 12:52:15 +00:00
Kevin Turcios
fada8397c7 Merge branch 'main' into call-graphee 2026-02-10 21:19:20 -05:00
Kevin Turcios
fb5ee232a5
Merge branch 'main' into pyarrow-comparator 2026-02-10 20:52:42 -05:00
Kevin Turcios
88d6e8b619
Merge branch 'main' into comparator-nn-module 2026-02-10 20:51:58 -05:00
HeshamHM28
4740725af7 fix asserts 2026-02-11 01:59:04 +02:00
Mohamed Ashraf
e207b83a87 fix: handle assertThrows variable assignment in Java instrumentation
When assertThrows was assigned to a variable to validate exception
properties, the transformation generated invalid Java syntax by
replacing the assertThrows call with try-catch while leaving the
variable assignment intact.

Example of invalid output:
  IllegalArgumentException e = try { code(); } catch (Exception) {}

This fix detects variable assignments, extracts the exception type
from assertThrows arguments, and generates proper exception capture:
  IllegalArgumentException e = null;
  try { code(); } catch (IllegalArgumentException _cf_caught1) { e = _cf_caught1; } catch (Exception _cf_ignored1) {}

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 21:22:21 +00:00
Sarthak Agarwal
fa56eb7abe refactor 2026-02-11 02:05:54 +05:30
Sarthak Agarwal
b8597b2e85 wrapped functions default export support 2026-02-11 02:04:42 +05:30
Sarthak Agarwal
0b611c722a Install cli post cloning in npm 2026-02-11 00:10:08 +05:30
Mohamed Ashraf
b2f258362f chore: merge omni-java base into fix/behavioral-equivalence-improvements
Merge latest changes from base branch including:
- Java compilation error detection (PR #1394)
- Java formatter detection via google-java-format (PR #1400)
- Enhanced test coverage for comparator logic

Conflict resolution:
- tests/test_languages/test_java/test_comparison_decision.py: Used PR version
  that enforces strict correctness (no pass_fail_only fallback tests)
  to align with PR 1401's goal of removing pass_fail_only mode entirely.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 17:43:32 +00:00
Mohamed Ashraf
58d4b94422 chore: merge omni-java base into fix/behavioral-equivalence-improvements
Resolved conflicts in test_runner.py by keeping both _extract_source_dirs_from_pom
from the PR branch and run_line_profile_tests from the base branch.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 17:22:54 +00:00
Mohamed Ashraf
c6703d65c6 chore: merge omni-java base into fix/wire-java-formatter
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 17:20:14 +00:00
Mohamed Ashraf
3d3abcff90 chore: merge omni-java base into feat/java-line-profiling
Resolved conflicts in test_runner.py by keeping the run_line_profile_tests
function from the feature branch and maintaining the get_test_run_command
signature from omni-java.

The line profiling feature is now up-to-date with the latest omni-java changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 16:36:08 +00:00
mashraf-222
2d64b35b2e
Merge pull request #1283 from codeflash-ai/feat/java-async-concurrent-support
Add concurrency pattern detection for Java optimization
2026-02-10 16:51:14 +02:00
Mohamed Ashraf
9be69106f6 fix: resolve merge conflicts with omni-java base
Merged omni-java base into PR #1279 to resolve conflicts.

Resolution approach:
1. test_discovery.py: Used refactored method call resolution from base
   - New approach uses sophisticated type tracking (jedi-like "goto")
   - Already includes duplicate checking (line 141)
   - Removed old Strategy 3 (class-based fallback) as it's not needed
     and caused single-function optimization issues

2. test_instrumentation.py: Combined both changes
   - Added API key setup from PR #1279
   - Kept FunctionToOptimize imports from base

The refactored code is more accurate and fixes the single-function
optimization issue that existed in the original PR.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 14:31:25 +00:00
Kevin Turcios
0c1397e814 feat: enrich call graph display with cross-file tracking and dependency summary
Add cross-file edge detection to IndexResult, replace tree sub-entries
with flat per-file dependency labels using plain language, and add a
post-indexing summary panel showing per-function dependency stats.
2026-02-10 06:12:32 -05:00
Kevin Turcios
5c4a65c183 feat: add Rich Live visualization for call graph indexing
Replace the simple progress bar with a Live + Tree + Panel display
that shows files being analyzed, call edges discovered, cache hits,
and summary stats during call graph indexing.
2026-02-10 05:27:59 -05:00
Kevin Turcios
b604bf0a0e test: add unit and caching tests for CallGraph
Covers same-file calls, cross-file calls, class instantiation,
nested function exclusion, module-level exclusion, site-packages
exclusion, empty/syntax-error files, and cache persistence.
2026-02-10 04:57:45 -05:00
Kevin Turcios
a96918766f refactor: replace jedi_definition with definition_type on FunctionSource
Store only the type string instead of the full Jedi Name object,
removing the need for arbitrary_types_allowed and the runtime
dependency on jedi in the model layer.
2026-02-10 04:57:10 -05:00