The optimization adds an early-exit check in `calculate_llm_cost` that returns zero immediately when all rate fields (`input_cost`, `cached_input_cost`, `output_cost`) are zero, before extracting token counts via `getattr` calls. Line profiling confirms the hot path: the original spent 70.7% of function time (580 ms) in the final return statement's arithmetic, yet 99.3% of calls (949/956) hit zero-cost models where token extraction was wasted work. The optimized version short-circuits these cases in 1.9 ms total, cutting `calculate_llm_cost` from 821 ms to 29 ms (96.5% reduction). This cascades to `LLMClient.call`, where cost calculation dropped from 50.5% to 4.3% of method time, yielding an 80% throughput gain (6,165 → 11,097 ops/sec) despite a 37% concurrency-ratio regression caused by spending proportionally more time in non-yielding sync code once the async bottleneck was eliminated.
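A minimal sketch of the short-circuit, with hypothetical field and attribute names standing in for the real rate and usage objects:

```python
from dataclasses import dataclass


@dataclass
class ModelRates:
    # Hypothetical rate container; the real model carries per-token prices.
    input_cost: float = 0.0
    cached_input_cost: float = 0.0
    output_cost: float = 0.0


def calculate_llm_cost(rates: ModelRates, usage: object) -> float:
    # Early exit: if every rate is zero the cost is zero regardless of usage,
    # so the getattr-based token extraction is skipped entirely.
    if not (rates.input_cost or rates.cached_input_cost or rates.output_cost):
        return 0.0

    input_tokens = getattr(usage, "input_tokens", 0) or 0
    cached_tokens = getattr(usage, "cached_input_tokens", 0) or 0
    output_tokens = getattr(usage, "output_tokens", 0) or 0
    return (
        input_tokens * rates.input_cost
        + cached_tokens * rates.cached_input_cost
        + output_tokens * rates.output_cost
    )
```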
Replace access to the private `asyncio.Lock._loop` attribute with a check
against `self.client_loop`, which is semantically equivalent and avoids the
unresolved-attribute diagnostic.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
**Issue #11: LLM Client TOCTOU Race Condition**
**Root Cause:**
PR #2575 fixed the client *recreation* race but introduced a Time-Of-Check
to Time-Of-Use (TOCTOU) bug. The lock protects client creation (lines 103-120),
but the actual API call happens *after* the lock is released. This creates a
race window where another thread can close the client between the lock release
and the API call.
**Race Timeline:**
1. Thread A: Acquires lock → checks/creates clients → releases lock
2. Thread A: Calls `await self.call_openai()` (line 155)
3. **Thread B:** Acquires lock → detects loop change → closes clients → recreates
4. Thread A: Uses `self.openai_client.chat.completions.create()` (line 231)
→ **RuntimeError: Cannot send a request, as the client has been closed.**
**Impact:**
- Affects 2/12 optimization runs (16.7% under concurrent load)
- Causes 300-second timeouts when all LLM retries exhaust
- Trace IDs: 146c8968-8264-4755-a852-0bebd4988517, 1dfac2ef-8e74-41da-870c-03b3697badf4
- Error pattern: All LLM calls fail → retry exhaustion → timeout → "NO OPTIMIZATIONS GENERATED"
**Fix:**
1. **Remove client close() calls** (lines 108-121) - Don't close clients when recreating
2. **Capture client references** (lines 123-127) - Snapshot clients after lock for consistency
3. **Pass clients as parameters** - call_openai(client=...) and call_anthropic(client=...)
This prevents the TOCTOU race by:
- Not closing clients that concurrent requests might be using
- Allowing Python's GC to safely clean up clients when no references remain
- Ensuring each request uses a consistent client instance
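A minimal sketch of the snapshot-and-pass pattern, with simplified names (`FakeAsyncClient` stands in for the real SDK clients and most of `LLMClient`'s state is omitted):

```python
import asyncio


class FakeAsyncClient:
    """Stand-in for AsyncAzureOpenAI / AsyncAnthropicBedrock."""

    async def complete(self, prompt: str) -> str:
        return f"response to {prompt!r}"


class LLMClient:
    def __init__(self) -> None:
        self.client_lock = asyncio.Lock()
        self.openai_client: FakeAsyncClient | None = None

    async def call(self, prompt: str) -> str:
        async with self.client_lock:
            if self.openai_client is None:  # or the event loop changed
                # Do NOT close the old client: a concurrent request may still
                # be using it; GC reclaims it once no references remain.
                self.openai_client = FakeAsyncClient()
            # Snapshot the reference while still holding the lock.
            client = self.openai_client

        # The API call runs outside the lock, but against the captured
        # instance, so a later recreation cannot invalidate this request.
        return await self.call_openai(prompt, client=client)

    async def call_openai(self, prompt: str, *, client: FakeAsyncClient) -> str:
        return await client.complete(prompt)
```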
**Tests Added:**
- `test_old_clients_not_closed_on_event_loop_change` - Verifies close() is never called
- Updated existing tests to expect new behavior (no close() calls)
**Verification:**
- All 564 tests pass
- Reproduces the production error before fix
- Fix eliminates the race condition
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
**Issue**: The `/ai/optimization_review` endpoint was returning 500 errors
when trying to close LLM clients during event loop changes.
**Root Cause**: In `aiservice/llm.py` lines 96-99, the `close()` calls on
OpenAI and Anthropic clients were not wrapped in exception handlers. When
the httpx transport was already closed or in a bad state (e.g., event loop
closure, connection already closed), the exception would propagate and cause
the entire request to fail with a 500 error.
**Fix**: Wrapped both `openai_client.close()` and `anthropic_client.close()`
in try-except blocks that catch and log exceptions at DEBUG level. This
prevents transport errors from crashing requests while still attempting to
clean up resources properly.
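A minimal sketch of the defensive close; the helper name and logger setup are illustrative (the real code wraps the two `close()` calls inline):

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


async def close_client_safely(client: Any, name: str) -> None:
    # Best-effort cleanup: a client whose httpx transport is already closed
    # (or whose event loop is gone) must not turn the request into a 500.
    try:
        await client.close()
    except Exception as exc:  # noqa: BLE001 - deliberately broad
        logger.debug("Ignoring error while closing %s client: %s", name, exc)
```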
**Impact**: Fixes 500 errors on `/ai/optimization_review` and other endpoints
that use the LLM client when event loops change or clients are in bad states.
**Testing**: Added `test_llm_client_close.py` with 2 test cases that verify:
1. Transport errors during close() are handled gracefully
2. Event loop closed errors are handled gracefully
**Traces**: 312d7392, 5bbdf214, a1325051
# Pull Request Checklist
## Description
- [ ] **Description of PR**: Clear and concise description of what this
PR accomplishes
- [ ] **Breaking Changes**: Document any breaking changes (if
applicable)
- [ ] **Related Issues**: Link to any related issues or tickets
## Testing
- [ ] **Test cases Attached**: All relevant test cases have been
added/updated
- [ ] **Manual Testing**: Manual testing completed for the changes
## Monitoring & Debugging
- [ ] **Logging in place**: Appropriate logging has been added for
debugging user issues
- [ ] **Sentry will be able to catch errors**: Error handling ensures
Sentry can capture and report errors
- [ ] **Avoid Dev based/Prisma logging**: No development-only or
Prisma-specific logging in production code
## Configuration
- [ ] **Env variables newly added**: Any new environment variables are
documented in .env.example file or mentioned in description
---
## Additional Notes
Co-authored-by: ali <mohammed18200118@gmail.com>
- ruff-format: reformat test file
- fix ty type error: cast mock clients to MagicMock for assert_called_once
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This fixes a critical bug where old AsyncAzureOpenAI and AsyncAnthropicBedrock
clients were not being closed when the event loop changed, causing:
1. Connection pool exhaustion → "couldn't get a connection after 30.00 sec"
2. RuntimeError: Event loop is closed during httpx client cleanup
Root cause:
In LLMClient.call(), when the event loop changed, new clients were created
but old clients were not properly closed, leading to connection leaks.
Fix:
- Added await client.close() for both openai_client and anthropic_client
before creating new instances
- Added comprehensive unit tests to verify proper cleanup
Impact:
- Resolves ~150+ test generation failures (500 errors)
- Fixes event loop closure errors in aiservice logs
Trace IDs affected: 04500fbd-88e0-44e4-8d20-32f6a0dc06cc (and many others)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
## Summary
- `/api/healthcheck` was returning 401 because `proxy.ts` requires auth
for all `/api/*` routes
- Application Gateway health probe got 401 → marked backend unhealthy →
**502 for all users**
- Adds `/api/healthcheck` to `ignorePaths` so it bypasses auth
- Also removes the erroneously added `middleware.ts` (Next.js 16 uses
`proxy.ts`)
## Test plan
- [ ] `/api/healthcheck` returns 200 without auth
- [ ] Authenticated routes still require login
- [ ] Application Gateway backend health shows Healthy
## Summary
- Auth0 v4 auto-generates middleware that protects all routes when no
`middleware.ts` exists
- This caused `/api/healthcheck` to return 401, making the Application
Gateway mark the backend as unhealthy → **502 for all users**
- Restores explicit middleware with Auth0 v4 API and excludes
`/api/healthcheck` from the matcher
## Test plan
- [ ] `/api/healthcheck` returns 200 without auth
- [ ] Authenticated routes still require login
- [ ] Application Gateway backend health shows Healthy
## Summary
- Convert
`debug_log_sensitive_data(f"...{response.model_dump_json(indent=2)}")`
to `debug_log_sensitive_data_from_callable(lambda: ...)` across 8
endpoint files
- In production, `debug_log_sensitive_data` is a no-op but the f-string
interpolation (including `model_dump_json(indent=2)`) was always
evaluated — serializing the full LLM response to JSON on every call
- The `_from_callable` variant only invokes the lambda when debug
logging is active (non-production)
- **Fix pre-existing bug**: `log_response()` closures in 4 endpoint
files returned `None` instead of a string, causing
`debug_log_sensitive_data_from_callable` to log `None`. Now they return
the concatenated log string as expected by the callable-based API.
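A minimal sketch of the deferred-evaluation pattern; the flag and function bodies are simplified stand-ins for the real helpers:

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)
DEBUG_LOGGING_ENABLED = False  # True outside production in the real service


def debug_log_sensitive_data_from_callable(build_message: Callable[[], str]) -> None:
    # The lambda is only invoked (and the LLM response only serialized)
    # when debug logging is actually active.
    if DEBUG_LOGGING_ENABLED:
        logger.debug(build_message())


# Before: the f-string (and model_dump_json) was evaluated on every call.
# debug_log_sensitive_data(f"LLM response: {response.model_dump_json(indent=2)}")
#
# After: serialization is deferred behind a callable, and the closure must
# return the full log string rather than None.
# debug_log_sensitive_data_from_callable(
#     lambda: f"LLM response: {response.model_dump_json(indent=2)}"
# )
```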
Affected endpoints: Python optimizer, line profiler, jit_rewrite, Java
optimizer, Java line profiler, JS/TS optimizer, JS/TS line profiler,
testgen.
## Test plan
- [x] All 558 unit tests pass
- [x] mypy clean
- [x] ruff clean
- [ ] Verify debug logging still works in non-production environments
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
## Summary
- Replace Pydantic frozen dataclass with stdlib
`@dataclass(frozen=True)` for `CodeExplanationAndID` and
`CodeAndExplanation`, removing `field_validator` that ran `.code` +
`compile()` ~280 times per pipeline run
- Pre-compute `original_module.code` once and pass to pipeline steps
(`clean_extraneous_comments`, `equality_check`) that previously called
it independently
- Replace `ast.dump(annotate_fields=False)` with `ast.unparse` in
`deduplicate_optimizations` (70% faster)
- Skip re-parse in `dedup_and_sort_imports` when isort returns unchanged
code
- Cache comment-stripped original code across candidates in
`clean_extraneous_comments`
**Pipeline median per-run: ~1.5s → 184ms** (4 candidates, controlled
measurement). Saves ~4-5s of CPU per optimization request in production.
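A minimal sketch of the dataclass swap (field names are illustrative):

```python
from dataclasses import dataclass


# Before: Pydantic frozen models whose field_validator re-ran `.code` +
# `compile()` on every construction (~280 times per pipeline run).
# After: plain stdlib frozen dataclasses with no per-instance validation;
# the code is assumed to have been validated once upstream.
@dataclass(frozen=True)
class CodeAndExplanation:
    code: str
    explanation: str


@dataclass(frozen=True)
class CodeExplanationAndID:
    code: str
    explanation: str
    optimization_id: str
```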
## Test plan
- [x] All 558 unit tests pass
- [x] mypy clean
- [x] ruff clean (no new warnings)
- [ ] Verify optimizer endpoints return correct results in staging
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
## Summary
- Move `safe_log_features()` and `update_optimization_cost()` out of
blocking `TaskGroup`s into fire-and-forget background tasks across 4
optimization endpoints (optimizer, optimizer_line_profiler, jit_rewrite,
adaptive_optimizer)
- These DB writes are analytics-only and don't affect response bodies —
waiting for them adds 100-300ms per request unnecessarily
- Add `aiservice/background.py` with `fire_and_forget()` helper using
the same `set` + `add_done_callback` pattern already used in `LLMClient`
- `get_or_create_optimization_event()` remains awaited where the
response needs `event.id`
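A minimal sketch of the helper, following the `set` + `add_done_callback` pattern the summary describes:

```python
import asyncio
from collections.abc import Coroutine
from typing import Any

# Strong references keep the tasks alive until completion; asyncio itself only
# holds weak references to running tasks.
_background_tasks: set[asyncio.Task[Any]] = set()


def fire_and_forget(coro: Coroutine[Any, Any, Any]) -> asyncio.Task[Any]:
    """Schedule an analytics-only coroutine without awaiting its result."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task
```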
## Test plan
- [x] All 550 tests pass locally
- [ ] Verify response latency improvement in production metrics after
deploy
- [ ] Confirm `safe_log_features` and `update_optimization_cost` still
complete successfully in background (check DB records)
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
Vitest tests were failing with "Cannot find module" errors because
`vi.mock()` calls retained `.js` extensions while imports had them
stripped, causing mock/import path mismatch in ESM mode.
## Root Cause
The `strip_js_extensions()` function in `testgen.py` only handled
`jest.mock()` but not `vi.mock()`, which is used by Vitest. The pattern
`_JEST_MOCK_EXTENSION_PATTERN` matched Jest mocking functions but not
Vitest's `vi.*` equivalents.
## Fix
Added `_VITEST_MOCK_EXTENSION_PATTERN` regex to match and strip
extensions from:
- `vi.mock()`
- `vi.doMock()`
- `vi.unmock()`
- `vi.requireActual()`
- `vi.requireMock()`
- `vi.importActual()`
- `vi.importMock()`
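A minimal sketch of the kind of pattern involved; this regex is a simplified stand-in, not the exact `_VITEST_MOCK_EXTENSION_PATTERN` in `testgen.py`:

```python
import re

# Matches vi.mock('../x.js'), vi.doMock("./y.jsx"), etc. and drops the extension.
_VITEST_MOCK_EXTENSION_PATTERN = re.compile(
    r"(vi\.(?:mock|doMock|unmock|requireActual|requireMock|importActual|importMock)"
    r"\(\s*['\"])([^'\"]+?)\.(?:js|jsx|mjs|cjs)(['\"])"
)


def strip_vitest_mock_extensions(test_source: str) -> str:
    # Example: vi.mock('../config/paths.js') -> vi.mock('../config/paths')
    return _VITEST_MOCK_EXTENSION_PATTERN.sub(r"\1\2\3", test_source)
```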
## Affected Trace IDs
- `0fe99c9f-b348-4f0a-b051-0ea9455231ba`
- `127cdaec-a343-4918-a86a-b646dd4d79cf`
- `2b6c896e-20d7-4505-8bf4-e4a2f20b37fc`
These trace IDs exhibited the bug where generated tests had
`vi.mock('../config/paths.js')` but imports had `from
'../config/paths'`, causing module resolution failures.
## Test Coverage
- Added 8 new tests in `TestStripJsExtensions` class
- All 31 tests in `test_testgen_javascript.py` pass
- Specific regression test for vi.mock() extension stripping
- Tests cover all vi.mock variants and edge cases
## Files Changed
- `django/aiservice/core/languages/js_ts/testgen.py` (fix)
- `django/aiservice/tests/testgen/test_testgen_javascript.py` (tests)
---------
Co-authored-by: Codeflash Bot <codeflash-bot@codeflash.ai>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Sarthak Agarwal <sarthak.saga@gmail.com>
## Summary
Fixes 500 Internal Server Error when replaying test generation with
`--rerun` flag and database arrays contain `None`/`NULL` values.
## Root Cause
The `rerun_testgen()` function in `core/shared/replay.py` accessed array
elements without checking if they were `None`. When PostgreSQL arrays
contained `NULL` values (e.g., `generated_test = [NULL, 'test2']`), the
function returned a `TestGenResponseSchema` with `None` values, causing
Pydantic validation to fail:
```
pydantic_core._pydantic_core.ValidationError: 2 validation errors for TestGenResponseSchema
generated_tests
Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
instrumented_behavior_tests
Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
```
## Changes
Added explicit `None` checks before creating `TestGenResponseSchema`:
- If `generated_test[index]` or `instrumented_generated_test[index]` is
`None`, return `None` (skip this test)
- If `instrumented_perf_test[index]` is `None`, default to empty string
(non-critical field)
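A minimal sketch of the guard, using the field names from the validation error above and a plain dict in place of `TestGenResponseSchema` (the perf-test key name is a guess):

```python
from typing import Optional


def build_testgen_response(
    generated_test: list[Optional[str]],
    instrumented_generated_test: list[Optional[str]],
    instrumented_perf_test: list[Optional[str]],
    index: int,
) -> Optional[dict]:
    # PostgreSQL arrays can contain NULLs; skip entries whose required fields
    # are missing instead of letting Pydantic validation fail with a 500.
    if generated_test[index] is None or instrumented_generated_test[index] is None:
        return None
    return {
        "generated_tests": generated_test[index],
        "instrumented_behavior_tests": instrumented_generated_test[index],
        # Non-critical field: default to an empty string when NULL.
        "instrumented_perf_tests": instrumented_perf_test[index] or "",
    }
```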
## Impact
Resolves **10+ replay failures** where test generation produced partial
results stored as `NULL` in database arrays.
## Test Coverage
Added comprehensive test suite for `replay.py`:
- `test_rerun_with_valid_test_data()` - Happy path
- `test_rerun_with_none_values_in_arrays()` - **Primary bug fix test**
- `test_rerun_with_index_out_of_bounds()` - Boundary conditions
- `test_rerun_with_empty_arrays()` - Empty data handling
- `test_rerun_with_none_arrays()` - NULL arrays
- `test_rerun_with_mismatched_array_lengths()` - Length mismatches
- `test_rerun_missing_perf_test()` - Missing perf data
All 7 tests pass.
## Trace IDs
This fix addresses errors seen in traces:
- Primary: `056561cc-94af-4d7b-ac79-85dfd4b7282d`
- And 9 additional trace IDs with the same "500 - Error generating
JavaScript tests" error
## Verification
Tested with original failing trace:
```bash
cd /workspace/target && codeflash --file src/daemon/constants.ts --function formatGatewayServiceDescription --rerun 056561cc-94af-4d7b-ac79-85dfd4b7282d
```
**Before fix:** `ERROR: 500 - Traceback... ValidationError: Input should
be a valid string [type=string_type, input_value=None]`
**After fix:** Gracefully skips None entries, no 500 error ✅
---------
Co-authored-by: Codeflash Bot <codeflash-bot@codeflash.ai>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
- **Memory leak fix**: Added explicit `LOGGING` config in `settings.py`
to prevent unbounded `LogRecord` buffering. Django's `django.request`
logger creates WARNING records for 4xx responses with the full
`ASGIRequest` (headers, body, payload) pinned in `args`. Without
explicit config, Django's default handlers and Sentry's
`enable_logs=True` buffer these indefinitely. Setting `django.request`
to ERROR level + removing `enable_logs=True` eliminated the leak — load
testing showed **84% reduction** in per-request memory growth (7.4 → 1.2
KiB/req).
- **Async event loop fix**: Wrapped
`parse_and_generate_candidate_schema()` in `asyncio.to_thread()` across
all 4 async callers (optimizer, optimizer_line_profiler, jit_rewrite,
adaptive_optimizer). This offloads the synchronous libcst parsing +
8-stage postprocessing pipeline to the thread pool, preventing it from
blocking the event loop during peak traffic.
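A minimal sketch of both changes; the handler names and `LOGGING` handler layout are illustrative, not the exact production config:

```python
# settings.py: explicit logging config so django.request 4xx WARNING records
# (which pin the full ASGIRequest in record.args) are dropped at ERROR level
# instead of being buffered by default or Sentry handlers.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {
        "django.request": {
            "handlers": ["console"],
            "level": "ERROR",
            "propagate": False,
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}
```

And the event-loop fix, with a stand-in for the synchronous pipeline:

```python
import asyncio


def parse_and_generate_candidate_schema(raw_response: str) -> list[str]:
    # Stand-in for the real synchronous libcst parsing + 8-stage postprocessing.
    return raw_response.splitlines()


async def handle_candidate(raw_response: str) -> list[str]:
    # Offload the sync pipeline to the default thread pool so it no longer
    # blocks the event loop while other requests are in flight.
    return await asyncio.to_thread(parse_and_generate_candidate_schema, raw_response)
```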
## Test plan
- [x] All 550 tests pass (`uv run pytest tests/ --ignore=tests/profiling
-x -q`)
- [ ] Monitor Azure memory alerts after deploy — expect significant
reduction in memory growth rate
- [ ] Monitor 5xx error rate during peak traffic — expect reduction from
event loop no longer blocked by sync postprocessing
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
- **`thread_sensitive=False`** on `sync_to_async` so concurrent
`log_features` calls get their own threads instead of serializing
through one (was `True`, causing a bottleneck)
- **Raised DB pool `max_size` from 10 to 100** — prod Postgres allows
859 connections, giving plenty of headroom
- **Added `safe_log_features` wrapper** that catches errors via Sentry
instead of propagating — used at all 9 TaskGroup and bare-await call
sites so a logging failure can't crash an otherwise successful
optimization endpoint
- **Kept `transaction.atomic` + `select_for_update`** for correctness
(Django doesn't support async transactions yet, and removing these
causes lost-update races on dict-merge fields)
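A minimal sketch of the wrapper; `sentry_sdk.capture_exception` is the actual Sentry API, while the `log_features` body and signature are simplified here:

```python
import sentry_sdk
from asgiref.sync import sync_to_async
from django.db import transaction


@sync_to_async(thread_sensitive=False)  # concurrent calls get their own worker threads
@transaction.atomic
def log_features(**fields) -> None:
    """Placeholder for the real select_for_update + dict-merge write."""


async def safe_log_features(**fields) -> None:
    # Analytics logging must never crash an otherwise successful endpoint:
    # report the failure to Sentry and carry on.
    try:
        await log_features(**fields)
    except Exception as exc:  # noqa: BLE001 - deliberately broad
        sentry_sdk.capture_exception(exc)
```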
## Root cause
`log_features` uses `@sync_to_async` + `@transaction.atomic` because
Django lacks async transaction support. The previous fix for pool
exhaustion changed `thread_sensitive=False` to `True`, which serialized
all calls through a single thread — fixing pool exhaustion but creating
a throughput bottleneck that caused 500s under load. Additionally, 6
call sites used `asyncio.TaskGroup` where any `log_features` exception
would propagate and crash the entire endpoint.
## Test plan
- [x] `tests/log_features/test_log_features_concurrency.py` — verifies
`thread_sensitive=False` and `safe_log_features` is async
- [x] `ruff check` passes on all changed files
- [ ] Deploy to staging and verify no 500s under concurrent optimization
requests
The optimization hoisted the 70-element `reserved_words` set out of `_is_valid_js_identifier` into a module-level `frozenset`, eliminating 1677 repeated set constructions that consumed 1.79 ms per profiler (42% of that function's time). More significantly, `_detect_export_style` previously compiled six regex patterns on every invocation via f-string interpolation with `escaped_id`; the optimized version pre-compiles generic patterns once at module load and uses `finditer` plus manual identifier comparison, cutting the function's runtime from 3.17 s to 14.7 ms across 1146 calls—a 99.5% reduction that accounts for nearly all of the 10× speedup. Test annotations confirm the largest gains occur in the `test_large_scale_many_class_methods_with_alternating_export_styles` case (107 ms → 4.66 ms), where repeated export detection dominated.
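A minimal sketch of the hoisting pattern (the identifier list is abbreviated and the patterns are illustrative):

```python
import re

# Module-level frozenset: built once at import instead of rebuilding a
# 70-element set on every _is_valid_js_identifier call.
_JS_RESERVED_WORDS = frozenset({
    "break", "case", "class", "const", "default", "delete", "export",
    "function", "import", "new", "return", "typeof", "var", "while",
    # ... remaining reserved words elided
})

_IDENTIFIER_RE = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")


def _is_valid_js_identifier(name: str) -> bool:
    return bool(_IDENTIFIER_RE.match(name)) and name not in _JS_RESERVED_WORDS

# Likewise, _detect_export_style can pre-compile generic export patterns once
# (e.g. re.compile(r"export\s+(default\s+)?class\s+(\w+)")) and compare the
# captured identifier in Python, rather than interpolating escaped_id into six
# f-string patterns and recompiling them on every call.
```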
When generating tests for class methods (e.g., ModulesContainer.getById),
the test generator was incorrectly assuming default import style, generating:
import ModulesContainer from '...'
This caused "Cannot find module" errors when:
1. The class was not exported at all
2. The class used named export (export class X) instead of default export
This fix:
- Adds _detect_export_style() to parse source code and detect actual export style
- Modifies _resolve_import() to use detected export style:
- 'export default class X' → default import
- 'export class X' → named import
- No export → named import (test will fail, surfacing the issue)
- Adds comprehensive unit tests for all scenarios
Affected traces: 12332328-80e8-4bde-bdd6-c76ac373675a, 73ccd4c6-a4f7-467a-8356-5199e9d9b877, 989dcbda-bc27-40b7-aed0-0ab51fd00e6d, and others with ERR_MODULE_NOT_FOUND
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
## Summary
Parallelize independent DB writes at the end of 4 endpoints using
`asyncio.TaskGroup`. With psycopg3 connection pooling (#2489), each task
gets its own connection from the pool.
### Endpoints optimized
| Endpoint | Before | After |
|----------|--------|-------|
| **Refinement** | `log_features` then `update_optimization_cost` | `TaskGroup` (concurrent) |
| **Explanations** | `update_optimization_cost` inside inner fn | Moved to handler, `TaskGroup` with `log_features` |
| **Optimization review** | `update_optimization_cost` inside inner fn | Moved to handler, `TaskGroup` with `update_optimization_features_review` |
| **Ranker** | `update_optimization_cost` inside inner fn | Moved to handler, `TaskGroup` with `log_features` |
Each endpoint saves ~87ms (one DB round-trip) by overlapping two
independent writes.
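A minimal sketch of the epilogue pattern, with placeholder coroutines standing in for the real DB writers:

```python
import asyncio


async def log_features(**fields) -> None:
    ...  # placeholder for the real analytics write


async def update_optimization_cost(trace_id: str, cost: float) -> None:
    ...  # placeholder for the real cost write


async def endpoint_epilogue(trace_id: str, cost: float, **features) -> None:
    # Both writes are independent; with psycopg3 pooling each task checks out
    # its own connection, so the two round-trips overlap instead of serializing.
    async with asyncio.TaskGroup() as tg:
        tg.create_task(log_features(**features))
        tg.create_task(update_optimization_cost(trace_id, cost))
```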
### Comprehensive audit
All 13 endpoints were audited — no remaining async antipatterns found:
- No blocking calls in async paths
- No `await`-in-loop patterns
- LLM clients already use connection reuse
- All other endpoints have at most 1 DB write in the epilogue
## Test plan
- [x] All 538 tests passing
- [ ] Verify under load in staging
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
## Summary
- Normalize quote style to double quotes for YAML consistency
- Remove redundant `jest-junit` runtime install step (already in
devDependencies)
- Simplify codeflash CLI flags: `--all --verbose --yes` → `--yes`
## Test plan
- [ ] Verify workflow runs successfully on a test PR touching
`js/cf-api/` or `js/cf-webapp/`
- [ ] Confirm `npm ci` installs jest-junit from package-lock without the
extra install step
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Kevin Turcios <106575910+KRRT7@users.noreply.github.com>
## Summary
The TypeScript validator was rejecting valid JSX/TSX syntax, causing
optimization runs to fail on React components with JSX.
## Problem
The validator was using `tree_sitter_typescript.language_typescript()`
which doesn't parse JSX syntax. This caused validation failures for
`.tsx` files containing JSX elements like:
- `<div className={...} />`
- `{...rest}` (spread props)
- Any JSX tags
## Solution
Changed to use `tree_sitter_typescript.language_tsx()` instead. Since
TSX is a superset of TypeScript, this supports both:
- Plain TypeScript code
- TypeScript with JSX (TSX)
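A minimal sketch of the validator change, assuming py-tree-sitter 0.22+ style bindings (exact parser construction varies by version):

```python
import tree_sitter_typescript
from tree_sitter import Language, Parser

# TSX is a superset of TypeScript, so a single grammar validates both plain
# TypeScript and React components containing JSX.
TSX_LANGUAGE = Language(tree_sitter_typescript.language_tsx())

_parser = Parser()
_parser.language = TSX_LANGUAGE


def is_parseable_tsx(source: str) -> bool:
    tree = _parser.parse(source.encode("utf8"))
    return not tree.root_node.has_error
```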
## Testing
Added three new test cases:
- `test_tsx_simple_jsx` - Tests basic JSX elements
- `test_tsx_nested_jsx` - Tests nested JSX
- `test_tsx_with_props_spread` - Tests spread props in JSX
All existing tests continue to pass.
## Impact
This fixes validation errors for all React/JSX components. Affected
trace IDs from logs:
- 5bedfbb7-ccc0-4fdd-b208-60b8b860750c
- 39892d42-774f-4921-80fc-2ee42ff8ae1c
- 80b818b6-e784-4ff8-abda-c3ce6b25422f
- 9b76943e-1a93-45fa-84b9-aae7d6305f79
- d1bac014-d622-4772-90ea-0f9ff88e32dd
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Codeflash Bot <codeflash-bot@codeflash.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Bug: PostgreSQL connection pool timeout (30 seconds)
Root cause: log_features uses @sync_to_async(thread_sensitive=False), causing
each call to grab a separate database connection from the pool. When multiple
optimization requests run concurrently, the pool (max_size=10) exhausts.
Error seen: psycopg_pool.PoolTimeout: couldn't get a connection after 30.00 sec
Fix: Change thread_sensitive=False to thread_sensitive=True. This ensures Django
properly reuses connections across async/sync boundaries instead of allocating
a new connection for each call.
Affected trace IDs from logs:
- a0d8dab6-6524-47dc-9c82-5fa92e6390fb
- 62f5c35b-7161-4ab0-958a-4865231f5188
- ddc0e882-f914-49e4-a2ac-2d5f19a17507
- eaeb0cbe-6474-4808-9092-42f837dd52cf
Testing:
- Added test_log_features_concurrency.py to verify thread_sensitive=True
- Verified reproduction script now passes without pool exhaustion
- All existing tests pass
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
## Summary
- Adds `rerun_trace_id` field to all request schemas (`OptimizeSchema`,
`OptimizeSchemaLP`, `TestGenSchema`, `RefinementRequestSchema`,
`CodeRepairRequestSchema`)
- Creates `core/shared/replay.py` with shared rerun logic that queries
`optimization_features` and returns stored results
- Adds early-return short-circuit to `/optimize`,
`/optimize-line-profiler`, `/testgen`, `/refinement`, `/code_repair` —
bypasses LLM calls when `rerun_trace_id` is provided
- Filters results by `optimizations_origin.source` (OPTIMIZE,
OPTIMIZE_LP, REFINE, REPAIR) and matches by parent optimization ID for
refinement/repair
## Test plan
- [ ] Run optimization normally to populate `optimization_features` with
a trace_id
- [ ] Rerun with `codeflash --rerun <trace_id>` against local server
- [ ] Verify each endpoint returns stored results without LLM calls
- [ ] Verify backward compatibility — requests without `rerun_trace_id`
behave unchanged
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Sarthak Agarwal <sarthak.saga@gmail.com>
## Summary
- The `/testgen_repair` endpoint was missed when `language_version`
support was added in #2488
- Clients that stopped sending `python_version`
(codeflash-ai/codeflash#1914) hit `400 - Python version is required`
- Adds `language_version` field and `resolve_python_version` validator
to `TestRepairSchema`, matching the pattern in
`OptimizeSchema`/`TestGenSchema`
- Replaces `python_version=data.python_version` with
`language_version=data.language_version` when constructing
`TestGenSchema` in the repair handler
## Test plan
- [ ] Deploy and verify testgen repair calls no longer return 400
- [ ] Verify old clients sending `python_version` still work (backward
compat via validator)
## Summary
- Fix regexes in `buildBenchmarkInfo` that strip empty improved/degraded
benchmark sections from PR comments
- The regexes didn't match the actual template text: wrong heading level
(`####` vs `###`), wrong emoji (`📉` vs `⚡️`), and wrong wording —
causing `{benchmark_info_degraded}` to render literally when there are
no degraded benchmarks
- Add unit tests for improved-only, degraded-only, and empty benchmark
scenarios
## Test plan
- [x] All 33 tests in `pr-changes-utils.test.ts` pass
- [x] New tests verify template placeholders are fully removed when a
section is empty