Add all non-required-check E2E workflows and prek lint to the
consolidated ci.yaml:
- 4 standard Python E2Es (async, benchmark, coverage, init)
- 3 JS E2Es (cjs-function, esm-async, ts-class)
- 2 Java E2Es (fibonacci-nogit, tracer)
- prek lint
New change detection outputs:
- e2e_js: triggers JS E2Es when packages/ changes
- e2e_java: triggers Java E2Es when java runtime/fixtures change
Total: 17 jobs + determine-changes + gate = 19 jobs in one file.
Down from 22 workflow files to 7 (remaining are non-test: claude,
codeflash self-optimize, label-workflow-changes, publish, java-e2e).
Additional savings per irrelevant PR: ~$0.80 (10 jobs x ~$0.08).
Total per skipped PR: ~$1.85.
ci.yaml was in all three check_paths calls, so creating/modifying
the workflow itself triggered all test jobs. Workflow-only PRs
should skip tests — the gate job still validates the pattern.
Replace 7 individual required-check workflows (unit-tests, mypy,
5 E2E tests) with a single ci.yaml following the astral-sh/ruff
gate pattern:
- determine-changes job uses native git diff (no third-party deps)
- Each test job skipped at job level when paths don't match
- Single required-checks-passed gate job accepts success + skipped
- E2E security preserved: environment gating, author allowlists
This fixes the long-standing issue where workflow-level path filters
leave required checks "Pending" on PRs that don't touch code paths,
blocking merge without admin override.
Estimated savings: ~$1.05/skipped PR ($0.64 unit-tests + $0.01
type-check + $0.40 E2E), ~$50-100/yr in compute, plus eliminating
all admin-merge workarounds.
Editing a workflow YAML file should not trigger that same workflow
to run. Removes .github/workflows/<file> from its own paths filter
in mypy.yml, prek.yaml, and unit-tests.yaml.
- codeflash-optimize.yaml: replace paths: ['**'] wildcard with targeted filters
- mypy.yml: add path filters (was firing on every PR/push including docs)
- prek.yaml: add path filters (was firing on every PR)
- unit-tests.yaml: add path filters (was firing on every PR/push)
Docs-only, README, experiment, and LICENSE changes no longer trigger
these workflows. Saves ~20 workflow runs per docs-only PR.
Adds `false &&` guard to the pr-review job condition. The job will
be skipped on all triggers until this is reverted. The @claude mention
job is unaffected.
v1.0.90 broke Bedrock OIDC auth — all Claude Code runs have been
failing with 403 since Apr 8.
Root cause: anthropics/claude-code-action#1196
Pinning to v1.0.89 (last working version) until upstream fix lands.
All 12 E2E workflows used `paths: ['**']` which triggered on every file
change — docs, configs, experiments, etc. This caused ~140-200 min of
compute per push event (18+ parallel workflows).
Now E2E tests only trigger when relevant source code changes:
- Python E2E: codeflash/**, tests/**, pyproject.toml, uv.lock, workflow files
- JS E2E: same + packages/**
- Java E2E: already had proper path filters (no change needed)
Estimated savings: ~$150-200/mo in CI compute.
- Trigger on any codeflash/** or tests/** changes (not just java subset)
- Validate replay test files are discovered per-function
- Already validates: replay test generation, global discovery count,
optimization success, and minimum speedup percentage
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Triage now classifies PRs as TRIVIAL/SMALL/LARGE based on lines changed
- SMALL PRs: focused correctness check, quick duplicate scan, skip coverage
- LARGE PRs: full review with design checks, deep duplicate detection, coverage
- Optimization PRs: concise correctness verdict instead of long essays
- Added explicit scope rules: only read files in the diff, don't explore broadly
Instead of immediately closing optimization PRs when CI fails, Claude
now checks out the branch, inspects failures, and attempts to fix them.
Only closes if unfixable, with a specific explanation of the failures.
Merges the omni-main-java branch which synced main into omni-java,
including JavaFunctionOptimizer, removal of is_java()/is_python() guards,
protocol dispatch for parse_test_xml, and deletion of concolic_testing.py.
Drop the /simplify step that caused unprompted refactors and scope
creep in PR reviews. Also add prek pre-commit rule to project config
so the PR bot and all contributors see it.
Drop the /simplify step that caused unprompted refactors and scope
creep in PR reviews. Also add prek pre-commit rule to project config
so the PR bot and all contributors see it.
The unit-tests workflow relied on a pre-committed JAR binary in
resources/ which could become stale when Comparator.java changes.
Now the workflow builds the JAR from source and installs it to the
local Maven repo, matching what java-e2e and fibonacci-nogit already do.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add paths-ignore to skip reviews for docs, config, CI, and non-production files
- Use github.event.sender.login instead of github.actor for reliable bot detection
- Add triage step to early-exit on trivial PRs
Consolidates duplicate-code-detector.yml into claude.yml as a step in
the pr-review job. Adds concurrency groups with cancel-in-progress to
prevent comment spam from racing workflow runs.
- Use XML structure instead of markdown for clearer step boundaries
- Resolve stale review threads via GraphQL instead of leaving them
- Positive framing instead of negation for instructions
- Replace aggressive language with calm direct instructions
- Add /simplify skill invocation for code quality pass
- Add verification checkpoint at the end
- Auto-close stale codeflash optimization PRs (age, conflicts, CI failures, deleted functions)
- Remove inline comment MCP tool, add Skill tool
Remove redundant windows-unit-tests.yml and add Windows Python 3.13 job
to the main unit-tests.yaml workflow. Add PYTHONIOENCODING env var for
Windows compatibility.
Run unit tests on Windows with Python 3.13 in addition to all Python
versions (3.9-3.14) on Ubuntu. This ensures cross-platform compatibility
is tested while keeping Windows test duration reasonable.
Extend the publish workflow to handle both codeflash and codeflash-benchmark
releases from a single workflow file, triggered by their respective version
files. Also syncs benchmark __init__.py version to match pyproject.toml.
The cross-region inference profile for Claude Opus 4.6 on Bedrock is
`us.anthropic.claude-opus-4-6-v1`, not `us.anthropic.claude-opus-4-6-v1:0`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace Azure Foundry authentication with AWS Bedrock OIDC in all
Claude Code GitHub Actions workflows.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>