codeflash

Author	SHA1	Message	Date
Kevin Turcios	986654b7e6	fix: pin PYTHONHASHSEED=0 in test env and enhance diff diagnostics Set PYTHONHASHSEED=0 in test subprocess environments so original and candidate runs use identical hash behavior, eliminating a source of non-deterministic return-value comparisons. Also upgrade diff logging from debug to info level with actual types and repr values for DID_PASS, RETURN_VALUE, and STDOUT diffs.	2026-04-10 06:38:08 -05:00
Kevin Turcios	e191f74aa6	chore: add diagnostic logging to compare_test_results Temporary instrumentation to debug flaky futurehouse E2E test. Logs matched/skipped/timed-out counts and did_all_timeout state.	2026-04-10 06:16:39 -05:00
Kevin Turcios	fefccd5935	fix: drop JFR inline event config that breaks JDK 11 The jdk.ExecutionSample#period=1ms syntax in -XX:StartFlightRecording is only supported on JDK 13+. On JDK 11 (CI), it causes "Failure when starting JFR on_create_vm_2" and no JFR file is created. The settings=profile preset still provides 10ms CPU sampling.	2026-04-10 05:28:34 -05:00
Kevin Turcios	bfe6f3a828	Remove debug timing instrumentation from tracer Strip AtomicLong accumulators, System.nanoTime() timing, and getTimingSummary() that were added for profiling. No functional change.	2026-04-10 05:16:49 -05:00
Kevin Turcios	01e22152c7	flexing	2026-04-10 05:07:53 -05:00
Kevin Turcios	e81f25f825	fix: remove stale repeatString assertions from integration tests repeatString was removed from Workload.java in the E2E reduction.	2026-04-10 05:05:17 -05:00
Kevin Turcios	0772398c59	perf: optimize Java tracing agent serialization and writes - Reuse ThreadLocal Kryo Output buffers (eliminates #1 allocation hotspot) - Fast-path inline serialization for safe arg types (bypasses executor) - Skip verification roundtrip for known-safe containers (ArrayList, HashMap, etc.) - Batch SQLite inserts (256/txn) with permanent autocommit-off - Switch to ArrayBlockingQueue (no per-element Node allocation) - Add opt-in in-memory SQLite mode (VACUUM INTO at shutdown), enabled in CI - Add timing instrumentation (onEntry, serialization, writes, dump) - Add ProfilingWorkload fixture for benchmarking Benchmark (50k captures): onEntry 5200ms→1200ms (4.3x), avg/capture 0.43ms→0.02ms (21x), writes 3200ms→900ms (3.5x) with in-memory mode.	2026-04-10 04:55:36 -05:00
Kevin Turcios	08aa94c54a	perf: reduce java-tracer E2E to single function for ~11 min target Drop repeatString from the Workload fixture (2→1 function). computeSum alone exercises the full tracer→optimizer pipeline (trace → replay tests → optimize → evaluate → rank → explain → review). The second function added no additional pipeline coverage.	2026-04-10 03:44:54 -05:00
Kevin Turcios	46957e190f	fix: update java tracer unit tests for reduced Workload fixture Remove assertions for filterEvens and instanceMethod which were removed from the Workload fixture. Adjust expected invocation counts accordingly.	2026-04-10 03:17:46 -05:00
Kevin Turcios	21f61ec93d	ci: add java_tracer_e2e fixture path to e2e_java change detection The fixture directory wasn't in the path filter, so changes to Workload.java didn't trigger the java E2E tests.	2026-04-10 03:08:03 -05:00
Kevin Turcios	2b0f633c0f	perf: reduce java-tracer E2E from ~75 min to ~15 min Remove filterEvens and instanceMethod from the Workload fixture (4→2 functions) and reduce main() loop from 1000→100 rounds. The E2E test only needs to verify the tracer→optimizer pipeline works end-to-end; it doesn't need 4 functions or 1604 replay tests to prove that. Expected impact: ~2 functions × ~8 candidates × fewer replay tests should bring the job from ~75 min down to ~10-15 min.	2026-04-10 03:04:29 -05:00
Kevin Turcios	5ee642e35e	Merge pull request #2057 from codeflash-ai/fix/api-read-timeout fix: increase API read timeout to prevent flaky E2E failures	2026-04-10 02:45:31 -05:00
Kevin Turcios	4ac573f10f	fix: increase API read timeout from 90s to 300s to prevent flaky E2E failures The flat 90s timeout was too aggressive for LLM-powered endpoints (/testgen, /optimize, /refinement) under load, causing ReadTimeoutError and failing the async-optimization E2E test. Split into (10s connect, 300s read) tuple so connections fail fast but LLM inference gets adequate time.	2026-04-10 02:33:16 -05:00
Kevin Turcios	72a41a5665	Merge pull request #2055 from codeflash-ai/perf/defer-cli-imports perf: defer cli.py imports for 7.7x faster --help	2026-04-10 01:59:57 -05:00
Kevin Turcios	93810f8be6	Merge pull request #2056 from codeflash-ai/chore/delete-disabled-workflows chore: delete disabled codeflash.yaml workflow	2026-04-10 01:52:47 -05:00
Kevin Turcios	79d47e0fae	chore: delete disabled codeflash.yaml workflow JS ESM integration test — disabled and superseded by ci.yaml's e2e-js matrix.	2026-04-10 01:51:52 -05:00
Kevin Turcios	381d1319ea	fix: specify utf-8 encoding in benchmark read_text for Windows CI Windows defaults to cp1252 which can't decode some source file bytes.	2026-04-10 01:48:31 -05:00
Kevin Turcios	fe39d40e1b	perf: add type identity fast-paths for str/list/tuple/dict in comparator Move the 4 most common return-value types (str, list/tuple, dict) to `orig_type is T` identity checks at the top of the dispatch chain, before the frozenset lookup. A single pointer comparison is cheaper than a frozenset hash, and these types need special handling anyway (temp-path normalization, recursive comparison, superset support). Before: dict traversed ~8 isinstance checks before being handled. After: dict is handled at check #3 via `orig_type is dict`. The isinstance fallbacks remain as slow-paths for subclasses (deque, ChainMap, defaultdict, scipy dok_matrix, etc.). Backported from codeflash-python dispatch ordering.	2026-04-10 01:25:05 -05:00
Kevin Turcios	5a5b6e46ac	bench: add dedicated comparator microbenchmark for frozenset fast-path 5 scenarios: primitives, nested dicts, DB rows, deep nesting, and identity types (frozenset/range/complex/Decimal/OrderedDict).	2026-04-10 01:05:02 -05:00
Kevin Turcios	4c3c6ea167	perf: add frozenset fast-path for comparator type dispatch Use O(1) frozenset membership test with type identity before falling through to isinstance MRO traversal. Backported from codeflash-python.	2026-04-10 00:53:55 -05:00
Kevin Turcios	accbab4a16	fix: update test_cmd_auth patches for deferred imports Imports in cmd_auth.py were moved into function bodies, so mock patches must target the source modules instead of cmd_auth's namespace.	2026-04-10 00:36:02 -05:00
Kevin Turcios	2e2e19f7ae	bench: add libcst visitor benchmarks for multi-file and full pipeline - test_benchmark_libcst_multi_file: discover_functions + get_code_optimization_context across 10 real source files - test_benchmark_libcst_pipeline: full discover → extract → replace → merge pipeline on one file	2026-04-10 00:21:45 -05:00
Kevin Turcios	1a25f05e14	fix: remove unnecessary Optimizer from benchmark test The test only needs project_root, not a full Optimizer (which requires an API key). Also adds missing __init__.py to tests/benchmarks/.	2026-04-10 00:10:36 -05:00
Kevin Turcios	2208e8ca77	bench: add CLI startup benchmark for codeflash compare --script Measures median wall-clock time for --version, --help, auth status, and compare --help across 30 runs with 3 warmups. Usage: codeflash compare main codeflash/optimize \ --script "python benchmarks/bench_cli_startup.py" \ --script-output benchmarks/results.json	2026-04-09 23:59:26 -05:00
Kevin Turcios	b533f50bdc	perf: backport libcst visitor dispatch cache from codeflash-python Cache the visitor dispatch tables that libcst rebuilds on every MatcherDecoratableTransformer/Visitor instantiation. The tables depend only on the class, not the instance, so caching by type is safe. Saves ~27ms per visitor instantiation (24x faster). Also fix pre-existing ruff F821 in cli.py (missing exit_with_message import in process_pyproject_config).	2026-04-09 23:46:45 -05:00
github-actions[bot]	61053be9ce	style: auto-format with ruff	2026-04-10 04:39:45 +00:00
Kevin Turcios	436d642847	perf: defer libcst, Rich, comparator imports in models.py Move libcst, rich.tree.Tree, console, comparator, code_utils, registry, lsp.helpers, and LspMarkdownMessage from module-level to the methods that use them. Only pydantic and TestType remain at module level (needed for class definitions). models.py import: 633ms → 125ms on Azure Standard_D4s_v5.	2026-04-09 23:38:40 -05:00
github-actions[bot]	88babfef25	style: auto-format with ruff	2026-04-10 04:30:36 +00:00
Kevin Turcios	2fc528ebda	perf: defer heavy imports in env_utils and shell_utils Defer console, formatter, code_utils, registry, and lsp.helpers imports from module level into the functions that use them. Inline is_LSP_enabled (a one-liner env var check) to avoid importing lsp.helpers on the happy path of get_codeflash_api_key. auth status: 237ms → 160ms on Azure Standard_D4s_v5.	2026-04-09 23:29:31 -05:00
Kevin Turcios	992e91abc7	fix: prevent ruff auto-format from rewriting version.py placeholders uv-dynamic-versioning rewrites version.py on every `uv run`, so the ruff auto-format job was inadvertently committing dev version strings. Restore version.py files after formatting and revert the ones already changed on this branch.	2026-04-09 23:21:25 -05:00
github-actions[bot]	1e8e5d2cc2	style: auto-format with ruff	2026-04-10 04:14:58 +00:00
Kevin Turcios	a8c004164e	perf: skip telemetry/banner for auth and compare commands Restructure main() command dispatch so auth and compare exit early without loading telemetry (sentry, posthog), version_check, or the banner. Defer cmd_auth.py imports into functions. auth status: ~1000ms → 237ms (4.2x) compare --help: ~297ms → 38ms (7.9x)	2026-04-09 23:14:03 -05:00
github-actions[bot]	05a7641405	style: auto-format with ruff	2026-04-10 04:09:00 +00:00
Kevin Turcios	70e3ce1a67	perf: defer cli.py imports for 7.7x faster --help Move heavy module-level imports in cli.py (console, env_utils, code_utils, config_parser, lsp.helpers, version) into the functions that actually use them. Split main.py imports so parse_args() is called before loading the full stack — --help exits via argparse before any heavy modules load. Benchmark (Azure Standard_D4s_v5, Python 3.13, hyperfine --min-runs 30): --help: 297ms → 39ms (7.7x faster) --version: 17ms (unchanged)	2026-04-09 23:08:22 -05:00
Kevin Turcios	7351d0f0ba	Merge pull request #2051 from codeflash-ai/fix/ts-e2e-test-data-size Increase TS E2E test data size to fix flaky js-ts-class	2026-04-09 22:26:38 -05:00
Kevin Turcios	8ca0f8d2cc	Fix JS line profiler empty output file causing JSONDecodeError The profiler's save() was called every 100 hit() calls. With O(n²) algorithms this produced hundreds of thousands of writeFileSync calls, each truncating the file to 0 bytes before writing. If the subprocess timed out (SIGKILL), the file was left at 0 bytes → JSONDecodeError. Fixes: - Move require('fs')/require('path') to module scope (not inside save()) - Reduce save-every-N from 100 → 10,000 hits (100x fewer syscalls) - Pre-create output file with {} before running Jest (safety net) - Handle empty files gracefully in parse_results - Fix misleading "file not found" warning → "file empty or no timing data"	2026-04-09 22:26:23 -05:00
github-actions[bot]	23d9e73bfa	style: auto-format with ruff	2026-04-09 22:26:23 -05:00
Kevin Turcios	b7bcd0fe2e	ci: add code_to_optimize/js/ to e2e_js path filter The change detection for JS E2E tests was missing the test fixture directory, so PRs that only modify JS test data (like this one) were skipped. Java already had its equivalent path included.	2026-04-09 22:26:19 -05:00
Kevin Turcios	a73ccca426	Increase test data size for TS findDuplicates benchmark The js-ts-class E2E test was flaky because n=100 is too small for the O(n²)→O(n) optimization to overcome Map/Set per-operation overhead. At n=100, the LLM correctly generates a Map-based O(n) solution but it benchmarks as slower (-10.6%) due to constant factor dominance. Bump to n=10,000 so the algorithmic improvement produces measurable speedup, making the 30% E2E threshold reliably achievable.	2026-04-09 22:26:19 -05:00
Kevin Turcios	477dfa246e	Merge pull request #2049 from codeflash-ai/ci/cleanup-test-markers Clean up Java test skip markers	2026-04-09 22:26:10 -05:00
github-actions[bot]	41841325e2	style: auto-format with ruff	2026-04-10 03:23:38 +00:00
Kevin Turcios	da536db8a2	Clean up Java test skip markers - Remove dead `import shutil` from test_comparator.py - Rename `requires_java` → `requires_java_runtime` for consistency with test_run_and_parse.py - Remove redundant `@requires_java_runtime` on test_behavior_return_value_correctness (class already has it)	2026-04-09 22:22:39 -05:00
Kevin Turcios	e73492f414	Merge pull request #2053 from codeflash-ai/fix/ci-windows-shell fix(ci): add shell: bash to conditional install step for Windows	2026-04-09 22:22:29 -05:00
Kevin Turcios	5b6318fcbb	fix(ci): add shell: bash to conditional install step for Windows The bash [[ ]] syntax fails on Windows runners which default to PowerShell. Explicitly setting shell: bash fixes the ParserError.	2026-04-09 22:22:11 -05:00
Kevin Turcios	145043fdb3	Merge pull request #2052 from codeflash-ai/ci/workflow-upgrades-and-fixes ci: upgrade action versions, add uv cache, fix broken paths, DRY publish	2026-04-09 22:10:58 -05:00
Kevin Turcios	7c4d98c6e7	ci: restore uv venv --seed in claude.yml uv venv --seed makes pip available in the venv, which the Claude Code action may need.	2026-04-09 22:08:59 -05:00
Kevin Turcios	be4c459d01	ci: upgrade action versions, add uv cache, fix broken paths, DRY publish - Bump actions/checkout v4/v5 → v6, setup-node v4 → v6, setup-java v4 → v5, prek-action v1 → v2, github-script v6 → v7, aws-credentials v4 → v6, claude-code-action v1.0.89 → v1 - Add enable-cache: true to all astral-sh/setup-uv steps - Remove redundant uv venv --seed (uv sync creates venvs automatically) - Merge double uv sync steps in unit-tests into single conditional - Fix codeflash.yaml: broken path filter and working-directory - Consolidate duplicate publish jobs into a single matrix job - Remove generate_release_notes overridden by manual body	2026-04-09 22:06:41 -05:00
Kevin Turcios	153097b9a3	Merge pull request #2015 from codeflash-ai/fix/gradle-maven-central-dependency fix: improve multi-module Gradle detection for dynamic settings.gradle.kts	2026-04-09 19:17:01 -05:00
github-actions[bot]	a6ea56bf50	style: auto-format with ruff	2026-04-09 23:44:22 +00:00
Kevin Turcios	2dba3e3849	Merge branch 'main' into fix/gradle-maven-central-dependency	2026-04-09 18:43:25 -05:00
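
Commits 4c3c6ea167 and fe39d40e1b describe a three-tier dispatch order in the comparator: `orig_type is T` identity checks first, then a frozenset membership test, then isinstance fallbacks for subclasses. The sketch below only illustrates that ordering and is not the project's comparator. The names `compare` and `_EQUALITY_TYPES`, the contents of the set, and the strict same-type rule are all assumptions, and the real comparator's temp-path normalization and superset support are omitted.

```python
from collections import defaultdict, deque
from decimal import Decimal

# Hypothetical set of types for which plain == is assumed to be sufficient.
_EQUALITY_TYPES = frozenset(
    {int, float, bool, complex, bytes, type(None), range, frozenset, Decimal}
)


def compare(orig, new):
    """Illustrative comparator showing the dispatch order only."""
    orig_type = type(orig)
    # Assumption for this sketch: values of different concrete types never match.
    if orig_type is not type(new):
        return False

    # Tier 1: identity checks for the most common return-value types.
    if orig_type is str:
        return orig == new
    if orig_type is list or orig_type is tuple:
        return len(orig) == len(new) and all(
            compare(a, b) for a, b in zip(orig, new)
        )
    if orig_type is dict:
        return orig.keys() == new.keys() and all(
            compare(value, new[key]) for key, value in orig.items()
        )

    # Tier 2: O(1) frozenset membership on the exact type.
    if orig_type in _EQUALITY_TYPES:
        return orig == new

    # Tier 3: isinstance slow paths for subclasses (deque, defaultdict, ...).
    if isinstance(orig, (list, tuple, deque)):
        return len(orig) == len(new) and all(
            compare(a, b) for a, b in zip(orig, new)
        )
    if isinstance(orig, dict):
        return orig.keys() == new.keys() and all(
            compare(value, new[key]) for key, value in orig.items()
        )

    # Anything else falls back to ordinary equality.
    return orig == new


same_nested = compare({"a": [1, (2, "x")]}, {"a": [1, (2, "x")]})
same_subclass = compare(defaultdict(list, {"k": [1]}), defaultdict(list, {"k": [1]}))
```

In this sketch a plain `dict` is handled by the third check, while a `defaultdict` or `deque` still reaches the isinstance tier, which mirrors the slow-path behaviour the commit message describes.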


7976 commits
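
Commit 986654b7e6 pins PYTHONHASHSEED=0 in the test subprocess environment so that the original and candidate runs hash identically. A minimal sketch of that idea follows, using only the standard library. The helper name `run_with_fixed_hash_seed` and the sample snippet are hypothetical, and how codeflash actually builds its test environment is not shown in the log.

```python
import os
import subprocess
import sys


def run_with_fixed_hash_seed(code: str) -> str:
    """Run a Python snippet in a child process with hash randomization disabled."""
    env = os.environ.copy()
    # PYTHONHASHSEED=0 disables hash randomization for str and bytes objects.
    env["PYTHONHASHSEED"] = "0"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# The snippet prints a string hash and a set's iteration order, both of which
# can differ between interpreter runs when the seed is not pinned.
snippet = "print(hash('codeflash')); print(list({'a', 'b', 'c'}))"
first = run_with_fixed_hash_seed(snippet)
second = run_with_fixed_hash_seed(snippet)
```

With the seed pinned, `first` and `second` should be identical, so a difference between two runs points at the code under test and not at hash ordering.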