Commit graph

6782 commits

Author SHA1 Message Date
Kevin Turcios
08dad4e6a4 chore: sync docs, rules, and workflows from main 2026-03-02 22:48:28 -05:00
Kevin Turcios
e8e1e2bb1a feat: bring safe incremental changes from main
- test_framework: add Java framework literals, use protocol dispatch
- code_utils: cache temp dir Path, remove lru_cache from humanize_runtime
- tabulate: remove unused param from _column_type call
- checkpoint: add subagent/yes flag support
- import_resolver: refactor helper extraction
- treesitter: add named export clause discovery
2026-03-02 22:45:29 -05:00
Kevin Turcios
d73daba5fd feat: add foundation for language optimizer extraction from main
Bring additive changes from main into omni-java:
- LanguageSupport protocol: new methods/properties for language-specific optimizers
- PythonFunctionOptimizer and JavaScriptFunctionOptimizer subclasses
- Python/JS normalizers (moved from code_utils/normalizers/)
- Python/JS optimizer helpers (AST resolution, module prep)
- Mocha test runner for JavaScript
- Config constants for token limit errors
2026-03-02 22:42:07 -05:00
Saurabh Misra
96b99ea85d
Merge pull request #1732 from codeflash-ai/fix/java/exclude-apidocs-from-discovery
fix(java): exclude apidocs and javadoc from file discovery
2026-03-02 18:41:42 -08:00
Saurabh Misra
8b0f1ea26a
Merge pull request #1731 from codeflash-ai/fix/java/sqlite-process-group-windows
fix(java): Windows guard for process-group kill + remove slow timeout tests
2026-03-02 18:38:04 -08:00
misrasaurabh1
c111635497 fix(java): fix lint error and broken mock in test_filter_validation
- Replace try/except/pass with contextlib.suppress() (ruff SIM105)
- Fix test_run_maven_tests_succeeds_with_valid_filter to mock
  _run_cmd_kill_pg_on_timeout instead of subprocess.run; on Linux
  the function uses Popen not run, so the old mock was never called

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:34:30 +00:00
misrasaurabh1
814ce5388c fix(java): add Windows guard for process-group kill and remove slow timeout tests
- Add sys.platform == "win32" check in _run_cmd_kill_pg_on_timeout so
  Windows machines fall back to plain subprocess.run() (Windows has no
  POSIX process groups / killpg)
- Remove TestRunCmdKillPgOnTimeout test class (5 tests using sleep 60
  commands were adding significant time to the test suite)

Follow-up to the SQLite-locked-error fix merged in #1728.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:34:30 +00:00
misrasaurabh1
1573506db7 fix(java): fix lint error and broken mock in test_filter_validation
- Replace try/except/pass with contextlib.suppress() (ruff SIM105)
- Fix test_run_maven_tests_succeeds_with_valid_filter to mock
  _run_cmd_kill_pg_on_timeout instead of subprocess.run; on Linux
  the function uses Popen not run, so the old mock was never called

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:31:54 +00:00
misrasaurabh1
7d45efe401 fix(java): exclude apidocs and javadoc directories from file discovery
When running with --all on a Java project, codeflash was discovering .js
files inside apidocs/ and javadoc/ directories (generated Javadoc HTML)
and attempting to optimize them as JavaScript. This caused:
- "Invalid test framework for JavaScript/TypeScript" errors
- Wasted API calls for ~30+ functions from jquery-3.7.1.min.js
- Spurious "NO TESTS GENERATED" warnings for minified jQuery functions

Fix: add "apidocs" and "javadoc" to Java's dir_excludes. Because the
--all mode unions dir_excludes from all languages, these directories are
now skipped in both Java-specific and --all discovery modes.

Adds 5 tests verifying the exclusion works for Java mode and --all mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:24:27 +00:00
misrasaurabh1
f34dd7b372 fix(java): add Windows guard for process-group kill and remove slow timeout tests
- Add sys.platform == "win32" check in _run_cmd_kill_pg_on_timeout so
  Windows machines fall back to plain subprocess.run() (Windows has no
  POSIX process groups / killpg)
- Remove TestRunCmdKillPgOnTimeout test class (5 tests using sleep 60
  commands were adding significant time to the test suite)

Follow-up to the SQLite-locked-error fix merged in #1728.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 02:10:22 +00:00
Saurabh Misra
cc2e5fbe4e
Merge pull request #1728 from codeflash-ai/fix/java/sqlite-locked-error
fix(java): resolve SQLite 'database is locked' errors across the pipeline
2026-03-02 17:59:57 -08:00
misrasaurabh1
16d314bf90 revert: remove WAL/busy_timeout defense-in-depth SQLite changes
The root cause of 'database is locked' is orphaned Surefire JVM processes
after Maven timeout.  The actual fix is killing the entire process group
(_run_cmd_kill_pg_on_timeout in test_runner.py).

The WAL mode / busy_timeout / sqlite3.connect(timeout=30) changes were
treating the symptom rather than the root cause.  Revert them:

- codeflash/languages/java/instrumentation.py: remove PRAGMA journal_mode=WAL
  and PRAGMA busy_timeout=30000 from inline SQLite write code
- codeflash/verification/parse_test_output.py: revert timeout=30 to default
- codeflash/languages/java/resources/CodeflashHelper.java: revert WAL/busy_timeout PRAGMAs
- codeflash-java-runtime/src/main/java/com/codeflash/Comparator.java: revert busy_timeout PRAGMA
- codeflash-java-runtime/src/main/java/com/codeflash/ResultWriter.java: revert WAL/busy_timeout PRAGMAs
- codeflash/languages/java/resources/codeflash-runtime-1.0.0.jar: restored to pre-change JAR
- tests/test_languages/test_java/test_instrumentation.py: remove TestSQLiteLockedFix
  class and revert snapshot strings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:43:20 +00:00
misrasaurabh1
de0fb91cd3 fix(java): kill entire process group on Maven timeout to prevent orphaned JVMs
Root cause of 'database is locked' errors:
- When Maven times out, subprocess.run() only kills the Maven parent process
- On Linux, Maven's forked Surefire JVM children become orphaned (not killed)
- Orphaned JVMs keep the SQLite result file open, causing SQLITE_BUSY when
  Python reads the file immediately after Maven is killed

Fix: Replace subprocess.run() with _run_cmd_kill_pg_on_timeout() which uses
start_new_session=True + os.killpg() to kill the entire process group on
timeout, ensuring no orphaned JVMs are left behind.

Applied to: _compile_tests, _get_test_classpath, _run_tests_direct,
and _run_maven_tests (the main one).

Also adds 5 unit tests verifying:
- Successful commands return correct output
- Failing commands propagate returncode
- Child processes are killed (not orphaned) on timeout
- returncode is -2 on timeout
- Timeout is described in stderr

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:24:56 +00:00
misrasaurabh1
1fdbd2c04b fix(java): resolve SQLite 'database is locked' errors across the pipeline
Root cause: The instrumented test JVM holds a SQLite connection open while
writing results.  The Python reader and the Java Comparator were trying to
read the same file without a busy_timeout, causing immediate SQLITE_BUSY
failures (~126 occurrences in codeflash_all_3.log).

Fixes applied:

1. instrumentation.py (_generate_sqlite_write_code):
   Emit PRAGMA journal_mode=WAL and PRAGMA busy_timeout=30000 right after
   each inline connection open.  WAL mode lets readers see the last committed
   state while a writer is active; busy_timeout makes lock collisions retry
   instead of immediately failing.

2. parse_test_output.py (parse_sqlite_test_results):
   Add timeout=30 to sqlite3.connect() so Python waits up to 30 s for a
   transient lock to clear (default was 5 s, which was too short for a busy
   Maven/JVM process).

3. Comparator.java (readTestResults):
   Execute PRAGMA busy_timeout=30000 on the same connection before running
   the SELECT, so the Java Comparator also retries instead of failing with
   [SQLITE_BUSY].

4. CodeflashHelper.java (initializeDatabase) and ResultWriter.java (constructor):
   Same WAL + busy_timeout PRAGMAs added after the initial getConnection() call
   for the long-lived database connections used by these helper classes.

5. Updated codeflash-runtime-1.0.0.jar (rebuilt after Comparator/ResultWriter fix).

tests: add TestSQLiteLockedFix with two assertions —
  • _generate_sqlite_write_code emits PRAGMA journal_mode=WAL and
    PRAGMA busy_timeout=30000 before CREATE TABLE
  • parse_sqlite_test_results uses timeout= in sqlite3.connect()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 01:04:38 +00:00
Saurabh Misra
400cf065e6
Merge pull request #1726 from codeflash-ai/fix/java/inner-class-helpers-filter
feat(java): skip inner-class methods in discovery; revert replacement-level workarounds
2026-03-02 16:50:57 -08:00
misrasaurabh1
aae13a8e69 feat: skip inner-class methods in Java discovery; revert replacement-level inner-class workarounds
- parser.py: add `is_class_nested` flag to `JavaMethodNode`; track
  `class_depth` in `_walk_tree_for_methods` (incremented each time a
  type declaration is entered) and set `is_class_nested = True` when
  depth ≥ 2 (method lives inside a nested/inner class)

- discovery.py: add early-exit in `_should_include_method` when
  `method.is_class_nested` is True — inner-class methods cannot be
  reliably instrumented or tested in isolation, so we skip them up-front
  rather than wasting LLM tokens on candidates that will always be
  rejected later

- replacement.py: revert Bug-4 replacement-level workarounds that are
  now obsolete:
  * remove `target_class_name` parameter from `_parse_optimization_source`
  * restore simple first-match `break` in target-method selection
  * remove class_name filter that blocked helpers from "other" classes

- tests: update `TestNestedClasses`, `TestExtractCodeContextWithInnerClasses`
  to reflect the new no-inner-class-discovery contract; remove
  `TestInnerClassHelperFilter` (superseded by discovery filter);
  add `TestInnerClassMethodFilter` in test_discovery.py with four
  scenarios covering static nested, non-static inner, outer-only, and
  deeply-nested classes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:47:01 +00:00
Saurabh Misra
d455afbdc9
Merge pull request #1727 from codeflash-ai/fix/java/constructor-replacement
fix(java): replace modified constructors when LLM adds new final fields
2026-03-02 16:15:48 -08:00
misrasaurabh1
05a9b61478 fix: replace modified constructors when LLM adds new final fields (Java)
When the LLM optimises a method by introducing a new final field (e.g.
caching Arrays.hashCode in Expression.hashCode, or caching map.values() in
LuaMap.valuesIterator), it also modifies the class constructors to initialise
the field.  Previously codeflash:

  1. Added the new field to the class ✓
  2. Replaced the target method ✓
  3. Did NOT update the constructors ✗

This caused "variable X might not have been initialized" compilation errors.

Changes:
- `JavaAnalyzer.find_constructors` (+ `_walk_tree_for_constructors`,
  `_extract_constructor_info`): new parser methods to locate
  `constructor_declaration` nodes via tree-sitter.
- `JavaMethodNode.formal_parameters_text`: captures the raw parameter list
  text so constructors can be matched by signature.
- `ParsedOptimization.modified_constructors`: new field to carry constructor
  source texts that need to be replaced.
- `_parse_optimization_source`: extract constructors from the same class as
  the target method and store in `modified_constructors`.
- `_replace_constructors`: new helper that replaces constructors in the
  original source by matching on formal parameter signature.
- `replace_function`: call `_replace_constructors` after the main method
  replacement when `modified_constructors` is non-empty.

Fixes regressions observed in codeflash_all_3.log:
  LuaMap.valuesIterator, Expression.hashCode, Bin.hashCode,
  NettyTlsContext.createHandler, Pool.capacity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 00:03:38 +00:00
misrasaurabh1
a880ffc48a fix: skip outer-class methods when target is in a static inner class (Java)
When the optimisation target lives in a static inner class (e.g.
ObjectUnpacker inside Unpacker<T>), the LLM-generated class often wraps the
inner class inside the full outer class.  Previously, methods belonging to the
outer class were extracted as "helpers" and injected into the inner class,
causing compilation errors:

  - "non-static type variable T cannot be referenced from a static context"
  - "non-static variable offset cannot be referenced from a static context"

Two related fixes:

1. When _parse_optimization_source extracts helpers, it now skips any method
   whose class_name differs from the target method's class_name.

2. The function now accepts an optional target_class_name parameter.  When
   there are multiple methods with the same name in the generated code (e.g.
   an abstract outer-class method and the concrete inner-class override), the
   method in the target class is preferred over outer-class methods.

Fixes the Unpacker.ObjectUnpacker.getString regression from codeflash_all_3.log.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:59:31 +00:00
Saurabh Misra
1c0fe48741
Merge pull request #1704 from codeflash-ai/fix/java/wrong-method-name-replacement
fix: guard Java replacement against wrong-method-name candidates and anonymous-class method hoisting
2026-03-02 15:45:28 -08:00
misrasaurabh1
cd02dec79f test: use full string equality in anonymous iterator test
Replace substring-based assertions with a single exact string
comparison in test_anonymous_iterator_methods_not_hoisted_to_class,
matching the convention used elsewhere in the test file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 23:38:42 +00:00
claude[bot]
7dd25a08dd style: auto-fix ruff formatting issues 2026-03-02 03:37:52 +00:00
misrasaurabh1
2371540386 fix: guard Java replacement against wrong-method-name candidates and anonymous-class method hoisting
Two bugs in _parse_optimization_source (replacement.py) caused Maven compilation
failures when codeflash optimised aerospike-client-java:

Bug 1 – standalone method with wrong name replaces target
When the LLM generated a standalone method whose name did not match the
optimisation target (e.g. generated `unpackMap` for target `unpackObjectMap`,
or generated `sizeTxn` for target `estimateKeySize`), the function fell back to
using the entire generated snippet as `target_method_source`.  This silently
replaced the target with the wrong method, producing:
  • a duplicate definition of the wrong method
  • removal of the target method (breaking all callers)

Fix: after parsing standalone (class-free) code, verify that at least one
discovered method matches the target name.  If no match is found, set
`target_method_source` to the empty string and log a warning.  A corresponding
guard in `replace_function` returns the original source unchanged when
`target_method_source` is empty.

The same guard is applied to the full-class path: if the generated class does
not contain the target method, the candidate is also rejected.

Bug 2 – anonymous inner-class methods hoisted as top-level helpers
When an optimised method returned an anonymous class (e.g. `keySetIterator`
returning `new Iterator<LuaValue>() { … }`), tree-sitter's recursive walk
found the anonymous class's `hasNext`, `next`, and `remove` method_declaration
nodes and classified them as helpers to be inserted at the outer-class level.
The inserted methods carried `@Override` annotations that matched nothing in the
outer class and referenced local variables (`it`) that were only in scope inside
the optimised method body, producing compilation errors.

Fix: when extracting helpers from the optimised class, skip any method whose
line range is entirely contained within the target method's line range.  Such
methods belong to anonymous/nested classes inside the method body and must not
be hoisted out as standalone class members.

Tests added for both bugs in TestWrongMethodNameGeneration and
TestAnonymousInnerClassMethods.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 03:35:42 +00:00
Saurabh Misra
24e26c3944
Merge pull request #1698 from codeflash-ai/fix-runtime-calculation
fix runtime calculation for java
2026-03-01 02:32:15 -08:00
misrasaurabh1
d9de2cf459 add the jar back again 2026-03-01 02:24:15 -08:00
misrasaurabh1
1bcca07df3 revert 2026-03-01 02:11:16 -08:00
misrasaurabh1
4ecb6260fc fix: Java comparator key collision and add A/A baseline test
The Java Comparator used only iteration_id as the map key when comparing
test results. Since every test method had iteration_id="1", all rows
collapsed to a single entry and only the last row survived. Because JUnit
test execution order is non-deterministic, the surviving row differed
between baseline and candidate runs, causing false correctness failures.

Fix the key to include test_module_path:test_class_name:test_function_name
for unique identification. Also add an A/A test that runs the original code
through the candidate pipeline to detect verification bugs like this.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 02:05:07 -08:00
misrasaurabh1
00a5dc7712 remove logging statements 2026-02-28 23:47:40 -08:00
misrasaurabh1
e1fb4b81e8 fix runtime calculation for java 2026-02-28 21:35:45 -08:00
HeshamHM28
04b0dbdaf5
Merge pull request #1674 from codeflash-ai/fix/java/pr-test-path-resolution
fix: resolve Java test paths with .java extension in PR results
2026-02-25 20:56:19 -08:00
HeshamHM28
23db0d75b8 fix: add Java support for test module path resolution in existing_tests_source_for 2026-02-26 06:39:06 +02:00
mashraf-222
380e55b848
Merge pull request #1654 from codeflash-ai/feat/direct-jvm-behavioral-tests
feat: direct JVM execution for behavioral and line-profile tests
2026-02-25 17:21:27 +02:00
HeshamHM28
a3dd3be347
Merge pull request #1543 from codeflash-ai/fix/java/line-profiler
feat: implement line-level profiling agent with ASM instrumentation
2026-02-24 22:33:05 -08:00
HeshamHM28
5447c83013
Merge pull request #1659 from codeflash-ai/codeflash/optimize-pr1543-2026-02-25T06.07.49
️ Speed up method `JavaLineProfiler.instrument_function` by 26% in PR #1543 (`fix/java/line-profiler`)
2026-02-24 22:17:34 -08:00
HeshamHM28
bde5185334 fix: optimize hit method to reduce profiler overhead and improve timing accuracy 2026-02-25 08:08:00 +02:00
codeflash-ai[bot]
b69f4d533b
Optimize JavaLineProfiler.instrument_function
Runtime improvement: the optimized version runs ~25% faster (2.06 ms -> 1.64 ms), with larger wins on functions with many instrumentable lines.

What changed
- Precompute file_path.as_posix() once per function: file_posix = file_path.as_posix() and reuse it for content_key and the profiled f-string instead of calling file_path.as_posix() repeatedly inside the loop.
- Combined multiple startswith checks into a single startswith(("//", "/*", "*")) call to replace three separate str.startswith() calls.

Why this speeds things up
- Attribute lookups and small method calls are relatively expensive in Python. In the original code file_path.as_posix() was called twice for every instrumented line (once for content_key and once inside the profiled f-string). Moving that result into a local variable removes those repeated method calls and attribute lookups and replaces them with a fast local variable load.
- Multiple str.startswith() calls were replaced by one startswith(tuple) call, cutting the number of Python-level function calls and condition checks for candidate comment lines.
- These savings multiply with the number of lines inside a function. The loop is the hot path: each instrumented statement previously did several extra method calls and string operations; removing them reduces per-line overhead and thus the total runtime.

Observed evidence
- Line-profiling shows heavy time spent on the two expressions that used file_path.as_posix() repeatedly; those costs drop in the optimized profile.
- The annotated tests show the biggest improvements on the large-scale test (many lines), which matches the expectation that per-line micro-optimizations are most beneficial when the loop has many iterations.

Behavioral impact and trade-offs
- No functional change: the instrumented output and semantics are unchanged.
- A tiny upfront cost (computing file_posix once) is paid even if no lines are instrumented, but that's negligible and worth the per-line savings.
- The exception path (logger.warning) shows a larger percent of the shorter total time in the optimized profile — this is not a regression in practice, just a profiling artifact because the overall runtime decreased; exception handling remains unchanged.

When this matters most
- Hot paths that instrument long functions or many functions (the large-scale test) gain the most.
- Small functions still benefit (the overall runtime measured improved), but the relative improvement is smaller because fixed costs (parse, etc.) dominate.

Summary
Precomputing file_posix and reducing redundant startswith/attribute calls cut down repeated Python-level work inside the main loop. That directly lowers per-line overhead and yields the measured ~25% runtime improvement, especially on workloads with many instrumentable lines.
2026-02-25 06:07:52 +00:00
HeshamHM28
96e231edaa Merge branch 'fix/java/line-profiler' of github.com:codeflash-ai/codeflash into fix/java/line-profiler 2026-02-25 07:56:27 +02:00
HeshamHM28
9b6d645ab6 fix: update profiling logic and improve loop index handling in tests 2026-02-25 07:43:14 +02:00
HeshamHM28
24d38b6fae Merge branch 'omni-java' into fix/java/line-profiler 2026-02-25 07:09:00 +02:00
Mohamed Ashraf
2e078f775e revert: restore Maven-based execution for line-profile tests
Line-profile direct JVM execution is being handled in a separate PR.
Reverting to avoid conflicts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 00:23:56 +00:00
Mohamed Ashraf
aaee7aa951 feat: direct JVM execution for behavioral and line-profile tests
Behavioral tests (non-coverage) and line-profile tests now compile once
with Maven then run directly via `java -cp ... ConsoleLauncher`, matching
the pattern already used by benchmarking tests. Falls back to Maven on
failure. Also caches classpath resolution across all test modes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 23:03:40 +00:00
aseembits93
efb5cd9987 reverting workflows to original state 2026-02-24 21:54:58 +05:30
aseembits93
bdd22afe20 chore: sync workflow files from main
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 21:49:11 +05:30
aseembits93
51028a6edf updating workflows to mains 2026-02-24 21:41:39 +05:30
Saurabh Misra
82a8c41c48
Merge pull request #1638 from codeflash-ai/loop-id-calculation-fix-bug
fix: Java loop ID calculation and assertion transformer bug
2026-02-23 00:02:52 -08:00
claude[bot]
00d02c1f13 style: fix mypy type errors in Java formatter and comparator 2026-02-23 07:49:46 +00:00
misrasaurabh1
3ba1ac4a9b Merge remote-tracking branch 'origin/loop-id-calculation-fix-bug' into loop-id-calculation-fix-bug 2026-02-22 23:45:49 -08:00
misrasaurabh1
62db2360b5 fix: use correct iteration_id format for Java performance mode
Changed iteration_id in performance mode markers to properly encode
inner loop iterations for test case grouping:

- Single call: iteration_id = innerIteration (0, 1, 2...)
- Multiple calls: iteration_id = callId_innerIteration (1_0, 1_1, 2_0, 2_1...)

This allows test results to be properly grouped by InvocationId, where
each unique (call, inner_iteration) pair gets its own group for
calculating minimum runtimes across outer loops.

Fixed test expectations to match the new format.

All 43 Java performance tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-22 23:42:26 -08:00
claude[bot]
fcdb406238 style: auto-fix linting issues 2026-02-23 06:59:18 +00:00
misrasaurabh1
83fa09743a Merge remote-tracking branch 'origin/loop-id-calculation-fix-bug' into loop-id-calculation-fix-bug 2026-02-22 22:55:37 -08:00