Commit graph

7918 commits

Author SHA1 Message Date
Kevin Turcios
d7a4c762cf Remove test fixture lockfile: code_to_optimize_vitest 2026-04-23 04:14:16 -05:00
Kevin Turcios
d4f4563f0d Remove test fixture lockfile: code_to_optimize_ts 2026-04-23 04:14:08 -05:00
Kevin Turcios
3080c1df80 Remove test fixture lockfile: code_to_optimize_mocha 2026-04-23 04:14:01 -05:00
Kevin Turcios
a1d822801a Remove test fixture lockfile: code_to_optimize_js_esm 2026-04-23 04:13:58 -05:00
Kevin Turcios
6c58ac462b Remove test fixture lockfile: code_to_optimize_js_cjs 2026-04-23 04:13:56 -05:00
Kevin Turcios
14be2aa1f8 Remove test fixture lockfile: code_to_optimize_js 2026-04-23 04:13:54 -05:00
Kevin Turcios
d6d40ed431 Gitignore code_to_optimize lockfiles, re-enable Dependabot updates
- Add code_to_optimize/**/package-lock.json to .gitignore
- Re-enable Dependabot version updates with limit of 5 PRs per ecosystem
- Keep code_to_optimize/ ignore comment in dependabot.yml
2026-04-23 04:13:23 -05:00
Kevin Turcios
e1a7569c94
Merge pull request #2061 from codeflash-ai/dependabot/uv/uv-0.11.6
chore(deps-dev): bump uv from 0.11.2 to 0.11.6
2026-04-23 03:26:11 -05:00
Kevin Turcios
d76c516e84
Merge pull request #2078 from codeflash-ai/dependabot/uv/lxml-6.1.0
chore(deps): bump lxml from 6.0.2 to 6.1.0
2026-04-23 03:25:08 -05:00
Kevin Turcios
8956fdac22
Merge pull request #2094 from codeflash-ai/test/coverage-infrastructure
test: set up coverage infrastructure in CI
2026-04-23 03:18:57 -05:00
Kevin Turcios
970aeb4430
Merge branch 'main' into dependabot/uv/uv-0.11.6 2026-04-23 03:12:55 -05:00
Kevin Turcios
a3bb01243e
Merge branch 'main' into dependabot/uv/lxml-6.1.0 2026-04-23 03:12:54 -05:00
Kevin Turcios
14ca2c897d
Merge pull request #2093 from codeflash-ai/chore/require-linked-issue-on-prs
chore: require PRs to link an issue or discussion
2026-04-23 03:08:29 -05:00
Kevin Turcios
0232d84a7d fix: exclude test_tracer.py from coverage run and lower floor to 58%
pytest-cov's trace function conflicts with the Tracer class under test,
causing it to self-disable in CI. Linux also reports ~1% lower coverage
than macOS due to platform-specific branches.
2026-04-23 03:04:49 -05:00
Kevin Turcios
a4b74fa500
Merge pull request #2095 from codeflash-ai/chore/add-codeowners
chore: add CODEOWNERS based on git history
2026-04-23 02:56:39 -05:00
Kevin Turcios
9d9e7cd0ee chore: add CODEOWNERS based on git history
Assigns per-directory code ownership to current org members based on
full commit history analysis, so PRs automatically request reviews from
the right people.
2026-04-23 02:55:05 -05:00
Kevin Turcios
2c79e50d68 test: set up coverage infrastructure in CI
- Add pytest-cov to dev dependencies
- Add .coveragerc with branch coverage, 60% floor (current baseline),
  and source/omit configuration
- Add coverage CI job (ubuntu/py3.13) that runs pytest with --cov,
  enforces the floor, and uploads coverage.xml as an artifact
- Wire coverage into the required-checks-passed gate

Closes #2080
2026-04-23 02:30:46 -05:00
Kevin Turcios
972d88c108 chore: require PRs to link an issue or discussion
- Add PR template with required linked issue/discussion section
- Add check-linked-issue CI job that validates PR body contains a
  reference (#123, Closes/Fixes/Relates, GitHub URL, or CF-# ticket)
- Wire into required-checks-passed gate so it blocks merge
- Update CONTRIBUTING.md with the policy and motivation
2026-04-23 02:27:49 -05:00
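
The linked-reference check described in the commit above could be sketched roughly as follows. This is only an illustration: the function name `has_linked_reference` and the exact regular expressions are assumptions, not the code in the actual CI job. Keyword forms such as "Closes #123" are covered here by the plain `#123` pattern.

```python
import re

# Hypothetical patterns; the real check-linked-issue job may differ.
LINK_PATTERNS = [
    # Bare issue/PR reference such as "#123" (also matches "Closes #123").
    re.compile(r"(?<!\w)#\d+\b"),
    # Full GitHub URL to an issue, discussion, or pull request.
    re.compile(r"https://github\.com/[\w.-]+/[\w.-]+/(?:issues|discussions|pull)/\d+"),
    # Ticket reference such as "CF-522".
    re.compile(r"\bCF-\d+\b"),
]


def has_linked_reference(pr_body: str) -> bool:
    """Return True if the PR body contains at least one recognized reference."""
    return any(pattern.search(pr_body) for pattern in LINK_PATTERNS)
```

A CI step built on this would fail the job when `has_linked_reference` returns False for the PR body.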
dependabot[bot]
83c87a75a1
chore(deps): bump lxml from 6.0.2 to 6.1.0
Bumps [lxml](https://github.com/lxml/lxml) from 6.0.2 to 6.1.0.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-6.0.2...lxml-6.1.0)

---
updated-dependencies:
- dependency-name: lxml
  dependency-version: 6.1.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-21 23:06:07 +00:00
mashraf-222
67cf123929
Merge pull request #2064 from codeflash-ai/fix/tracer-subprocess-exit-codes
fix: check subprocess exit codes in Java tracer
2026-04-21 15:35:46 +02:00
Kevin Turcios
1d26014d61
Merge pull request #2077 from mvanhorn/cf-522-contributing-guide
docs: add CONTRIBUTING.md
2026-04-21 03:30:52 -05:00
Matt Van Horn
1112646e4e
docs: add CONTRIBUTING.md
Closes #522

Covers the two audiences the issue calls out:

1. People contributing changes back to Codeflash - development
   environment setup with uv, the single-command verification via
   uv run prek, test runner invocation, code-style pointers to
   .claude/rules/code-style.md, and the branch / commit / PR
   conventions from .claude/rules/git.md and CLAUDE.md.

2. People using Codeflash in editable mode from a source checkout
   to optimize their own projects, including the install commands
   for uv and pip, when editable mode is appropriate vs installing
   the PyPI package, and a pointer to the README Quick Start for
   the codeflash init flow.
2026-04-21 01:29:03 -07:00
mashraf-222
ef535b8834
Merge pull request #2065 from codeflash-ai/fix/gradle-configure-on-demand
fix: add --configure-on-demand to all Gradle commands
2026-04-21 03:44:10 +02:00
Mohamed Ashraf
a4473c3684 merge: resolve conflict with main — adapt exit-code handling to combined invocation
Keep the combined JFR + tracing agent single JVM invocation from main while
preserving the fix's intent: raise when trace-db was not created, warn when
exit code is non-zero but trace-db exists. Integration tests rewritten to
match the combined-invocation semantics.
2026-04-21 01:40:26 +00:00
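
The combined-invocation semantics from the merge commit above (raise when the trace database is missing, warn when the exit code is non-zero but the database exists) might look like this minimal sketch. The function name `check_trace_outcome` and its parameters are hypothetical, not taken from the repository.

```python
from __future__ import annotations

import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def check_trace_outcome(exit_code: int, trace_db: Path) -> None:
    """Hypothetical post-run check for a single combined JVM invocation."""
    if not trace_db.exists():
        # Without trace data there is nothing to replay, so fail hard.
        raise RuntimeError(
            f"trace database {trace_db} was not created (exit code {exit_code})"
        )
    if exit_code != 0:
        # Data exists, so the run is still usable; surface the failure as a warning.
        logger.warning(
            "JVM exited with code %d but %s exists; continuing", exit_code, trace_db
        )
```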
Sarthak Agarwal
d8b62367ce
Merge pull request #2067 from codeflash-ai/update_docs
update Docs for Plugin
2026-04-15 00:38:31 +05:30
Sarthak Agarwal
3b8a2e5c82 update Docs for Plugin 2026-04-15 00:37:17 +05:30
dependabot[bot]
ced8a746cd
chore(deps-dev): bump uv from 0.11.2 to 0.11.6
Bumps [uv](https://github.com/astral-sh/uv) from 0.11.2 to 0.11.6.
- [Release notes](https://github.com/astral-sh/uv/releases)
- [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/uv/compare/0.11.2...0.11.6)

---
updated-dependencies:
- dependency-name: uv
  dependency-version: 0.11.6
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-13 10:07:46 +00:00
Kevin Turcios
4d4cb5f517
Merge pull request #2059 from codeflash-ai/refactor/benchmarks-to-dotcodeflash
Move benchmarks to .codeflash/benchmarks/
2026-04-13 05:06:00 -05:00
Kevin Turcios
819a56c33e
Merge pull request #2058 from codeflash-ai/perf/reduce-java-tracer-e2e
perf: optimize Java tracing agent (E2E reduction + serialization + writes)
2026-04-10 18:43:58 -05:00
Mohamed Ashraf
a7371b55ca fix: add --configure-on-demand to all Gradle commands
Gradle evaluates all project configurations during the configuration
phase, even when only one module is targeted. Multi-module projects with
diverse toolchain requirements (e.g., OpenRewrite's rewrite-gradle needs
JDK 8) fail when an unrelated module's toolchain isn't available.

Adds --configure-on-demand to all 8 Gradle command construction sites
so Gradle only configures projects needed for the requested task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 21:46:42 +00:00
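
One way to keep the flag consistent across several command construction sites is a small shared builder. The sketch below is an assumption about how that could be done; `build_gradle_command` is a hypothetical helper, and the commit itself only says the flag was added at each of the 8 sites.

```python
from __future__ import annotations

from typing import Sequence


def build_gradle_command(
    gradle_executable: str,
    tasks: Sequence[str],
    extra_args: Sequence[str] = (),
) -> list[str]:
    """Build a Gradle argv list that always includes --configure-on-demand."""
    # --configure-on-demand asks Gradle to configure only the projects
    # needed for the requested tasks.
    return [gradle_executable, "--configure-on-demand", *tasks, *extra_args]
```

For example, `build_gradle_command("./gradlew", [":module:test"])` yields an argv list suitable for `subprocess.run`.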
Mohamed Ashraf
470482e824 fix: check subprocess exit codes in Java tracer
_run_java_with_graceful_timeout() discarded the subprocess exit code in
both the no-timeout and timeout paths. If Maven/Gradle failed (compilation
error, OOM, etc.), the tracer silently continued with missing/stale data.

Now returns the exit code. Stage 1 (JFR profiling) warns on failure but
continues. Stage 2 (argument capture) raises RuntimeError since trace
data is essential for replay test generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 21:46:11 +00:00
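
The behavior described in the commit above (return the exit code from both the no-timeout and timeout paths, warn for a non-essential stage, raise for an essential one) can be illustrated with the following sketch. It is not the repository's `_run_java_with_graceful_timeout()`; the names `run_and_return_exit_code` and `run_stage` and the `essential` flag are assumptions made for illustration.

```python
from __future__ import annotations

import logging
import subprocess
import sys

logger = logging.getLogger(__name__)


def run_and_return_exit_code(cmd: list[str], timeout: float | None = None) -> int:
    """Run cmd and return its exit code instead of discarding it."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Ask the process to stop, then still collect and return its exit code.
        proc.terminate()
        return proc.wait()


def run_stage(cmd: list[str], *, essential: bool, timeout: float | None = None) -> int:
    """Warn on failure for optional stages; raise for essential ones."""
    code = run_and_return_exit_code(cmd, timeout)
    if code != 0:
        if essential:
            raise RuntimeError(f"command failed with exit code {code}: {cmd}")
        logger.warning("command exited with code %d, continuing: %s", code, cmd)
    return code


if __name__ == "__main__":
    # A trivially succeeding command, used only to exercise the helper.
    print(run_stage([sys.executable, "-c", "pass"], essential=True))
```

In this sketch a profiling stage would call `run_stage(..., essential=False)`, while an argument-capture stage would pass `essential=True`.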
Kevin Turcios
b737f71e46 fix: update test assertions to match simplified Workload fixture
The Workload.java fixture was trimmed to only repeatString but test
files still asserted computeSum, filterEvens, and instanceMethod.
2026-04-10 16:05:27 -05:00
Kevin Turcios
0cb67c1a17 fix: add --no-pr to codeflash optimize workflow to prevent CI-opened PRs 2026-04-10 15:12:48 -05:00
Kevin Turcios
5c778dfad4 perf: trim tracer E2E workload to single function (repeatString)
Keep only repeatString which reliably produces 284% improvement.
Drop computeSum (marginal 16%), filterEvens and instanceMethod (no
optimization found). Reduces tracer E2E from ~1h27m to ~21m.
2026-04-10 15:08:03 -05:00
Kevin Turcios
40f16b565a ci: add standalone Java E2E workflow for isolated testing 2026-04-10 13:09:36 -05:00
Kevin Turcios
cb87763a2d fix: skip environment approval gate for trusted users on workflow_dispatch 2026-04-10 12:58:54 -05:00
Kevin Turcios
013c83f5e4 fix: drop jdk.ExecutionSample#period from combined JFR opts (unsupported on Java 11) 2026-04-10 09:11:02 -05:00
Kevin Turcios
0d928f2b49 perf: merge Java tracer into single-pass JVM invocation
Combine JFR profiling and argument capture agent into one
JAVA_TOOL_OPTIONS string, running the target program once instead of
twice. JFR and javaagent are orthogonal JVM features that coexist
without conflict. Keeps build_jfr_env/build_agent_env for standalone
use.
2026-04-10 09:05:30 -05:00
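
As a rough illustration of the single-pass approach above, both features can be placed in one `JAVA_TOOL_OPTIONS` value. The helper name, its parameters, and the agent argument format are assumptions; only the `-XX:StartFlightRecording` and `-javaagent` JVM options themselves are standard. Paths containing spaces would need extra quoting that this sketch does not handle.

```python
def build_combined_java_tool_options(
    jfr_file: str,
    agent_jar: str,
    agent_args: str = "",
    existing: str = "",
) -> str:
    """Combine JFR recording and a javaagent into one JAVA_TOOL_OPTIONS string."""
    # Start a Flight Recorder recording using the built-in "profile" settings.
    jfr_opt = f"-XX:StartFlightRecording=filename={jfr_file},settings=profile"
    # Attach the agent; its arguments (if any) follow an "=" sign.
    agent_opt = f"-javaagent:{agent_jar}"
    if agent_args:
        agent_opt += f"={agent_args}"
    # Preserve any options the caller already had in the environment.
    parts = [existing.strip(), jfr_opt, agent_opt]
    return " ".join(part for part in parts if part)
```

The resulting string would be set as `JAVA_TOOL_OPTIONS` in the environment of the one build-tool invocation.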
Kevin Turcios
ecf4e63eca perf: reduce Java E2E looping time to 5s and cache runtime JAR build
Make TOTAL_LOOPING_TIME configurable via CODEFLASH_LOOPING_TIME env var
(defaults to 10s). Set to 5s in Java E2E CI jobs to cut verification
time per candidate. Also cache the codeflash-runtime JAR keyed on
source hash to skip mvn install when unchanged.
2026-04-10 09:02:45 -05:00
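
Reading the looping time from `CODEFLASH_LOOPING_TIME` with a 10s default could be sketched like this. The function name and the fallback on invalid or non-positive values are assumptions; the commit only states the variable name and the default.

```python
from __future__ import annotations

import os
from typing import Mapping

DEFAULT_LOOPING_TIME_SECONDS = 10.0


def get_total_looping_time(env: Mapping[str, str] | None = None) -> float:
    """Return the looping time in seconds, honoring CODEFLASH_LOOPING_TIME."""
    env = os.environ if env is None else env
    raw = env.get("CODEFLASH_LOOPING_TIME")
    if raw is None:
        return DEFAULT_LOOPING_TIME_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Assumption: fall back to the default on unparsable input.
        return DEFAULT_LOOPING_TIME_SECONDS
    return value if value > 0 else DEFAULT_LOOPING_TIME_SECONDS
```

A CI job would then export `CODEFLASH_LOOPING_TIME=5` to shorten verification per candidate.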
Kevin Turcios
8959ead2f9 fix: resolve Windows 8.3 short paths in get_run_tmp_file and fix ruff lint errors
Add .resolve() to TemporaryDirectory path to expand Windows 8.3 short
paths (e.g. RUNNER~1) to canonical long form, fixing test_pickle_patcher
failures on Windows CI. Also add missing return type annotations and
noqa suppressions for benchmark test file.
2026-04-10 08:51:10 -05:00
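
The path fix above amounts to resolving the temporary directory path before using it. A minimal sketch, assuming a hypothetical helper `make_run_tmp_dir` and prefix (the real `get_run_tmp_file` is not shown in this log):

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def make_run_tmp_dir() -> tuple[tempfile.TemporaryDirectory, Path]:
    """Create a temp directory and return it with its canonical path."""
    tmp = tempfile.TemporaryDirectory(prefix="codeflash_")
    # .resolve() canonicalizes the path; on Windows this is what expands
    # 8.3 short components (e.g. RUNNER~1) to their long form.
    return tmp, Path(tmp.name).resolve()
```

Keeping the `TemporaryDirectory` object alive alongside the resolved path prevents the directory from being cleaned up while it is still in use.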
Kevin Turcios
ec14860d29 Move benchmarks to .codeflash/benchmarks/ and auto-discover
Move codeflash's own benchmarks to .codeflash/benchmarks/. Add
auto-discovery of .codeflash/benchmarks/ in codeflash compare and
benchmark mode -- when benchmarks-root is not explicitly configured,
the CLI checks for .codeflash/benchmarks/ before erroring.

Backwards compatible: users with existing benchmarks-root config
are unaffected. Docs continue to show tests/benchmarks as the
example path.
2026-04-10 08:39:15 -05:00
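
The auto-discovery order described above (explicit configuration first, then `.codeflash/benchmarks/`, then an error) might be sketched as follows. The function name, signature, and error type are assumptions, not the CLI's actual code.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_benchmarks_root(project_root: Path, configured: Path | None = None) -> Path:
    """Pick the benchmarks directory: explicit config wins, then auto-discovery."""
    if configured is not None:
        # Existing benchmarks-root configuration is left untouched.
        return configured
    candidate = project_root / ".codeflash" / "benchmarks"
    if candidate.is_dir():
        return candidate
    raise FileNotFoundError(
        f"no benchmarks-root configured and {candidate} does not exist"
    )


# Usage: discover the default location inside a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".codeflash" / "benchmarks").mkdir(parents=True)
    print(resolve_benchmarks_root(root))
```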
Kevin Turcios
151df774a4 perf: use --effort low for java-tracer E2E to reduce CI time 2026-04-10 08:29:46 -05:00
Kevin Turcios
b05561ef9e chore: replace console.print with logger.info for Java project detection 2026-04-10 07:51:08 -05:00
Kevin Turcios
70260f22b3 fix: ensure language_version is detected before optimization API calls
JavaSupport.ensure_runtime_environment() was never called during the
optimization flow, so _language_version stayed None and the backend
received language_version=null. The LLM had no Java version constraint,
causing it to generate Java 16+ APIs (e.g. Stream.toList()) for Java 11
projects.
2026-04-10 07:39:49 -05:00
Kevin Turcios
82ec301fad chore: remove diagnostic logging from compare_test_results 2026-04-10 06:49:43 -05:00
Kevin Turcios
986654b7e6 fix: pin PYTHONHASHSEED=0 in test env and enhance diff diagnostics
Set PYTHONHASHSEED=0 in test subprocess environments so original and
candidate runs use identical hash behavior, eliminating a source of
non-deterministic return-value comparisons.

Also upgrade diff logging from debug to info level with actual types
and repr values for DID_PASS, RETURN_VALUE, and STDOUT diffs.
2026-04-10 06:38:08 -05:00
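
Pinning the hash seed for test subprocesses, as described above, can be illustrated with this sketch. The helper names are hypothetical; `PYTHONHASHSEED` itself is a standard CPython environment variable, and setting it to `0` disables hash randomization so separate runs hash strings identically.

```python
from __future__ import annotations

import os
import subprocess
import sys
from typing import Mapping


def build_test_env(base: Mapping[str, str] | None = None) -> dict[str, str]:
    """Copy the environment and pin PYTHONHASHSEED so runs hash identically."""
    env = dict(os.environ if base is None else base)
    env["PYTHONHASHSEED"] = "0"
    return env


def hash_in_subprocess(text: str) -> int:
    """Return hash(text) as computed by a fresh interpreter with the pinned seed."""
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=build_test_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)
```

With the seed pinned, two separate subprocesses (for example, the original and candidate runs) produce the same string hash, so set and dict iteration order no longer varies between them for that reason.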
Kevin Turcios
e191f74aa6 chore: add diagnostic logging to compare_test_results
Temporary instrumentation to debug flaky futurehouse E2E test.
Logs matched/skipped/timed-out counts and did_all_timeout state.
2026-04-10 06:16:39 -05:00
Kevin Turcios
fefccd5935 fix: drop JFR inline event config that breaks JDK 11
The jdk.ExecutionSample#period=1ms syntax in -XX:StartFlightRecording
is only supported on JDK 13+. On JDK 11 (CI), it causes
"Failure when starting JFR on_create_vm_2" and no JFR file is created.
The settings=profile preset still provides 10ms CPU sampling.
2026-04-10 05:28:34 -05:00
Kevin Turcios
bfe6f3a828 Remove debug timing instrumentation from tracer
Strip AtomicLong accumulators, System.nanoTime() timing, and
getTimingSummary() that were added for profiling. No functional change.
2026-04-10 05:16:49 -05:00
Kevin Turcios
01e22152c7 flexing 2026-04-10 05:07:53 -05:00