ci.yaml was in all three check_paths calls, so creating/modifying
the workflow itself triggered all test jobs. Workflow-only PRs
should skip tests — the gate job still validates the pattern.
Replace 7 individual required-check workflows (unit-tests, mypy,
5 E2E tests) with a single ci.yaml following the astral-sh/ruff
gate pattern:
- determine-changes job uses native git diff (no third-party deps)
- Each test job skipped at job level when paths don't match
- Single required-checks-passed gate job accepts success + skipped
- E2E security preserved: environment gating, author allowlists
This fixes the long-standing issue where workflow-level path filters
leave required checks "Pending" on PRs that don't touch code paths,
blocking merge without admin override.
Estimated savings: ~$1.05/skipped PR ($0.64 unit-tests + $0.01
type-check + $0.40 E2E), ~$50-100/yr in compute, plus eliminating
all admin-merge workarounds.
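
The gate job's acceptance rule can be sketched in Python as follows. This only illustrates the logic and is not the workflow's actual expression. The function name and the sample job names are assumptions; the result strings mirror the values GitHub Actions reports for `needs.<job>.result`.

```python
# Results that the gate treats as acceptable for a required job.
ACCEPTED = {"success", "skipped"}


def required_checks_passed(job_results):
    """Return True when every upstream job either succeeded or was skipped.

    Any other result (for example "failure" or "cancelled") fails the gate.
    """
    return all(result in ACCEPTED for result in job_results.values())
```

A docs-only PR would report every test job as `skipped`, so the single required check still resolves instead of staying "Pending".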
Editing a workflow YAML file should not trigger that same workflow
to run. Removes .github/workflows/<file> from its own paths filter
in mypy.yml, prek.yaml, and unit-tests.yaml.
- codeflash-optimize.yaml: replace paths: ['**'] wildcard with targeted filters
- mypy.yml: add path filters (was firing on every PR/push including docs)
- prek.yaml: add path filters (was firing on every PR)
- unit-tests.yaml: add path filters (was firing on every PR/push)
Docs-only, README, experiment, and LICENSE changes no longer trigger
these workflows. Saves ~20 workflow runs per docs-only PR.
Move cli, console, env_utils, checkpoint, config_parser, and
version_check imports from module level into main(). These imports
trigger the full dependency chain (cfapi, models, PrComment, libcst,
requests, Rich), costing ~500ms on every CLI invocation, even for
simple commands that don't need most of these modules.
Also moves paneled_text import into print_codeflash_banner() and
passes process_pyproject_config as parameter to _handle_config_loading
to avoid a module-level reference.
When the user runs `codeflash --version`, read the version string
and exit immediately without importing cli, telemetry, models, or
any other heavy modules. This mirrors the pattern used in pip where
`pip --version` was optimized from 138ms to 20ms (7x).
Before: 524ms (imports cli.py -> cfapi -> models -> libcst -> ...)
After: ~16ms (imports only codeflash.version)
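
A minimal sketch of the fast path, assuming a hypothetical `VERSION` constant and using `argparse` as a stand-in for the heavy modules. The real entry point and module layout may differ.

```python
VERSION = "0.0.0"  # hypothetical placeholder; the real value lives in the version module


def main(argv):
    # Fast path: answer --version before any heavy import is executed.
    if "--version" in argv:
        print(VERSION)
        return 0

    # Deferred import: only commands that do real work pay for it.
    # argparse stands in here for the cli/telemetry/models chain.
    import argparse

    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("command")
    args = parser.parse_args(argv)
    print(f"running {args.command}")
    return 0
```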
Sets open-pull-requests-limit: 0 on all ecosystems. Existing open
Dependabot PRs are unaffected — this only prevents new ones.
Re-enable by removing the open-pull-requests-limit lines.
Adds `false &&` guard to the pr-review job condition. The job will
be skipped on all triggers until this is reverted. The @claude mention
job is unaffected.
Move heavy third-party imports (posthog, sentry_sdk, and their
integrations) from module level into the functions that use them.
These imports cost ~350ms combined but are only needed when
telemetry is actually initialized, not on every CLI invocation.
- posthog_cf.py: defer `from posthog import Posthog` into
initialize_posthog(), defer cfapi/console/lsp imports into ph()
- sentry.py: defer `import sentry_sdk` and integrations into
init_sentry()
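
The deferral pattern can be sketched as below. The names `initialize_telemetry` and `track` are hypothetical, and `logging` stands in for a heavy third-party client such as posthog. The point is only that the import runs inside the initializer, so disabled telemetry never pays for it.

```python
_client = None  # created lazily; stays None when telemetry is off


def initialize_telemetry(enabled):
    """Create the client on first use; do nothing when telemetry is disabled."""
    global _client
    if not enabled:
        return None
    if _client is None:
        # Deferred import: stand-in for `from posthog import Posthog`.
        import logging

        _client = logging.getLogger("telemetry")
    return _client


def track(event):
    """Send an event if a client exists; report whether it was sent."""
    if _client is None:
        return False
    _client.info(event)
    return True
```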
Dependabot was auto-discovering all package.json and pyproject.toml
files including 12 in code_to_optimize/ (test fixtures). These PRs
always fail because E2E tests need secrets unavailable on Dependabot
PRs — 70% of Dependabot runs were failing on vite updates to fixtures.
Explicit config monitors only the real dependency files:
- / (root pyproject.toml)
- /packages/codeflash (npm package)
- GitHub Actions versions
v1.0.90 broke Bedrock OIDC auth — all Claude Code runs have been
failing with 403 since Apr 8.
Root cause: anthropics/claude-code-action#1196
Pinning to v1.0.89 (last working version) until upstream fix lands.
All 12 E2E workflows used `paths: ['**']` which triggered on every file
change — docs, configs, experiments, etc. This caused ~140-200 min of
compute per push event (18+ parallel workflows).
Now E2E tests only trigger when relevant source code changes:
- Python E2E: codeflash/**, tests/**, pyproject.toml, uv.lock, workflow files
- JS E2E: same + packages/**
- Java E2E: already had proper path filters (no change needed)
Estimated savings: ~$150-200/mo in CI compute.
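
The effect of the new filters can be approximated in Python as below. This is a simplified model, not GitHub's glob implementation: it only handles exact names and `dir/**` prefixes, and it omits the workflow-file entries. The function names are assumptions.

```python
PYTHON_E2E_PATHS = ["codeflash/**", "tests/**", "pyproject.toml", "uv.lock"]
JS_E2E_PATHS = PYTHON_E2E_PATHS + ["packages/**"]


def matches(path, pattern):
    """Match an exact file name, or anything under `dir/` for a `dir/**` pattern."""
    if pattern.endswith("/**"):
        return path.startswith(pattern[:-2])  # keeps the trailing slash
    return path == pattern


def should_run(changed_files, patterns):
    """Return True when at least one changed file hits one of the patterns."""
    return any(matches(f, p) for f in changed_files for p in patterns)
```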
Move subagent tracking from a single event property to the ph() function
so every PostHog event is automatically tagged with subagent: true/false.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
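
A sketch of central tagging, assuming a hypothetical signature for `ph()` with a `subagent` flag and an in-memory `sink` in place of the real PostHog client. How the flag is detected is not shown here.

```python
def ph(event, properties=None, *, subagent=False, sink=None):
    """Attach the subagent flag to every event before it is sent."""
    payload = dict(properties or {})  # copy so the caller's dict is not mutated
    payload["subagent"] = bool(subagent)
    if sink is not None:
        sink.append((event, payload))  # stand-in for the real capture call
    return payload
```

Because the tag is applied in one place, individual call sites no longer need to remember to set it.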
Adds `subagent: bool` property to the existing run-start event so PostHog
can segment and compare agent-driven vs human CLI optimization runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Java to supported languages in how-codeflash-works, add auth and
GitHub App steps to java-installation, add Java tab to codeflash-all
tip, reorder trace-and-optimize Java examples, and clarify Java class
method syntax in one-function.
Closes CF-1090
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Normalize paths to forward slashes in JS/TS code generation and coverage
parsing — backslashes are escape chars in JavaScript strings and cause
silent corruption on Windows. Also relax timing test thresholds for CI.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
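
The normalization can be illustrated as follows. The helper names are hypothetical. The example path shows the failure mode: written into a JavaScript string literal unchanged, `\n` and `\t` in `C:\proj\new\test.js` would be read as a newline and a tab.

```python
def to_js_path(path):
    """Convert Windows separators to forward slashes for use in JS source."""
    return path.replace("\\", "/")


def js_require(path):
    """Emit a require() call whose string literal contains no backslashes."""
    return f'require("{to_js_path(path)}")'
```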
Gradle --no-daemon on multi-module projects forces a full JVM cold start
for every invocation. On eureka, configuration + dependency resolution
alone takes ~10 min before tests even start. With a 900s timeout, runs
were still being killed right at the limit.
1200s (20 min) provides headroom for: cold Gradle startup (~10 min) +
test execution (~5 min) + JaCoCo overhead + safety margin.
PR #2013 iterated through 300→600→900s and found 900s sufficient only
when build caches were warm from prior invocations in the same session.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two changes to prevent cold-build timeouts on large multi-module Gradle
projects (e.g., eureka ~16 min cold build):
1. install_multi_module_deps now compiles testClasses instead of just
classes, so the test execution timeout only covers running tests,
not compilation.
2. Pre-install compilation timeout increased from 300s to 900s to
accommodate cold Gradle --no-daemon builds on large projects.
Combined with the coverage min_timeout of 900s (previous commit),
compilation and test execution each get their own 900s budget instead
of sharing one.
Based on experience from PR #2013, where 300s and 600s both proved
insufficient for eureka cold builds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gradle --no-daemon on multi-module projects (e.g., eureka) needs cold
JVM startup + dependency resolution + compilation + test execution +
JaCoCo agent overhead, which exceeds 300s. At 300s the process is
killed mid-execution, producing partial results that the pipeline
can't use for behavioral baseline.
Ported from PR #2013 which validated 300s→600s→900s progression on
eureka (build takes ~10 min, 900s provides safe headroom).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The regex for extracting modules from settings.gradle only matched
single-line include statements. Multi-line includes like eureka's
(include 'a',\n 'b',\n 'c') only captured the first module, causing
test_module to be None and breaking multi-module path resolution
(e.g., classfiles lookup for JaCoCo coverage conversion).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
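
One way to handle multi-line includes is sketched below. The regexes and the function name are assumptions for illustration, not the project's actual patterns. The key point is that `\s` also matches newlines, so a comma-separated list can continue across lines.

```python
import re

# `include` followed by one or more quoted names separated by commas;
# whitespace between entries may include line breaks.
_INCLUDE = re.compile(r"\binclude\b\s*\(?\s*((?:['\"][^'\"]+['\"]\s*,?\s*)+)")
_QUOTED = re.compile(r"['\"]([^'\"]+)['\"]")


def extract_modules(settings_text):
    """Return module names from include statements, in first-seen order."""
    modules = []
    for match in _INCLUDE.finditer(settings_text):
        for name in _QUOTED.findall(match.group(1)):
            name = name.lstrip(":")
            if name not in modules:
                modules.append(name)
    return modules
```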
The Gradle JaCoCo plugin approach (jacocoTestReport task) fails on
multi-module projects and adds 5-10 min overhead. Replace with:
1. Inject -javaagent:{runtime_jar}=destfile={exec} via JAVA_TOOL_OPTIONS
(AgentDispatcher routes destfile= args to JaCoCo PreMain)
2. Run tests without jacocoTestReport task
3. Convert .exec to .xml via shaded JaCoCo CLI in the runtime JAR
This eliminates the "jacocoTestReport not found" error on eureka and
similar multi-module Gradle projects, and removes build file mutation
for coverage setup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
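
Step 1 can be sketched as below. The function name `with_coverage_agent` is hypothetical, and existing `JAVA_TOOL_OPTIONS` content is preserved on the assumption that callers may already have set it. `JAVA_TOOL_OPTIONS` itself is a standard JVM environment variable.

```python
import os


def with_coverage_agent(runtime_jar, exec_file, base_env=None):
    """Return a copy of the environment with the JaCoCo agent argument appended."""
    env = dict(os.environ if base_env is None else base_env)
    agent = f"-javaagent:{runtime_jar}=destfile={exec_file}"
    existing = env.get("JAVA_TOOL_OPTIONS", "")
    # Append rather than overwrite so options set by the caller survive.
    env["JAVA_TOOL_OPTIONS"] = f"{existing} {agent}".strip()
    return env
```

Because the agent is attached through the environment, the build files are left untouched.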
Gradle resolves the codeflash-runtime JAR to ~/.gradle/caches/, not
~/.m2/. Add an optional classpath parameter to find_agent_jar() that
searches the resolved classpath for the JAR before falling back to
the existing ~/.m2 / resources / dev-build chain.
Thread the parameter through build_javaagent_arg, build_agent_env,
instrument_source_for_line_profiler, and line_profiler_step so the
optimization pipeline passes the resolved classpath automatically.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
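
The classpath lookup might look like the sketch below. The helper name and the `codeflash-runtime` file-name prefix are assumptions. The sketch only inspects names and does not check that the file exists; returning `None` signals that the caller should use the existing fallback chain.

```python
import os
from pathlib import Path


def find_jar_on_classpath(classpath, prefix="codeflash-runtime"):
    """Return the first classpath entry that looks like the runtime JAR, else None."""
    if not classpath:
        return None
    for entry in classpath.split(os.pathsep):
        candidate = Path(entry)
        if candidate.name.startswith(prefix) and candidate.suffix == ".jar":
            return candidate
    return None
```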
- Generalize _find_top_level_dependencies_block() into _find_top_level_block(name)
so it can find any top-level block (dependencies, repositories, etc.)
- Rewrite _ensure_maven_central_repo() to use tree-sitter instead of regex,
preventing false matches inside buildscript/subprojects/allprojects blocks
- Add _update_existing_codeflash_dependency() to replace stale versions or
old files() format with the current Maven Central coordinate
- Wire version update into add_codeflash_dependency() and
add_codeflash_dependency_multimodule() so old entries get updated instead
of silently skipped
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The optimization pre-compiles three regex patterns at module load time
(`_INCLUDE_PATTERN`, `_LISTOF_PATTERN`, `_QUOTED_PATTERN`) instead of
recompiling them on every function call, eliminating the ~1 ms
pattern-compilation overhead that line profiler shows dominated the
original version (44.3% of total time in the first `re.findall` alone).
The second major change replaces the O(n) `if stripped not in modules`
list scan with a set-based `if stripped not in seen` check, which cuts
the deduplication cost from ~288 ns to ~72 ns per check when the
fallback listOf branch executes.
Runtime improves from 2.35 ms to 1.91 ms (23% faster) with no
behavioral regressions.
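
Both changes can be sketched together as below. The pattern names follow the message, but the regex bodies and the function name are assumptions, not the project's actual code.

```python
import re

# Compiled once at import time instead of on every call.
_LISTOF_PATTERN = re.compile(r"listOf\s*\(([^)]*)\)")
_QUOTED_PATTERN = re.compile(r"['\"]([^'\"]+)['\"]")


def modules_from_listof(text):
    """Collect unique module names from listOf(...) calls, keeping first-seen order."""
    modules = []
    seen = set()  # O(1) membership test; the list only preserves order
    for body in _LISTOF_PATTERN.findall(text):
        for name in _QUOTED_PATTERN.findall(body):
            stripped = name.strip().lstrip(":")
            if stripped and stripped not in seen:
                seen.add(stripped)
                modules.append(stripped)
    return modules
```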
- Parse listOf(...) patterns in settings.gradle.kts for projects that
build include lists dynamically (e.g. OpenRewrite)
- Use word boundary in include regex to avoid matching variable names
like 'includedProjects'
- Break module voting ties using codeflash.toml module-root config,
so the function's own module is preferred over cross-module tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
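
The tie-breaking rule can be sketched as below, assuming votes are held in a `Counter` and the configured module root is passed in as `preferred`. Both the function name and this data shape are hypothetical.

```python
from collections import Counter
from typing import Optional


def pick_module(votes: Counter, preferred: Optional[str]) -> Optional[str]:
    """Pick the module with the most votes; on a tie, favor the configured one."""
    if not votes:
        return preferred
    top = max(votes.values())
    tied = [module for module, count in votes.items() if count == top]
    if len(tied) > 1 and preferred in tied:
        return preferred
    return tied[0]  # insertion order decides when the config gives no hint
```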
Instrumentation writes verification_type="void_state" for void methods,
but the enum lacked this value, causing ValueError on every SQLite row parse.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
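
The failure and the fix can be reproduced with a small enum. The class names and the `RETURN_VALUE` member are hypothetical; only the `"void_state"` value comes from the message above. Looking up an `Enum` by a value it does not define raises `ValueError`.

```python
from enum import Enum


class VerificationTypeBefore(Enum):
    RETURN_VALUE = "return_value"  # hypothetical existing member


class VerificationTypeAfter(Enum):
    RETURN_VALUE = "return_value"  # hypothetical existing member
    VOID_STATE = "void_state"  # value written by instrumentation for void methods


def parse(enum_cls, raw):
    """Return the matching member, or None when the value is unknown."""
    try:
        return enum_cls(raw)
    except ValueError:
        return None
```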