Commit graph

6410 commits

Author SHA1 Message Date
claude[bot]
97132bf8b5 fix: resolve mypy type errors 2026-04-04 18:28:19 +00:00
claude[bot]
eb430456c3 style: auto-fix linting issues 2026-04-04 18:26:17 +00:00
codeflash-ai[bot]
198d7cabe8
Optimize _detect_export_style
The optimization replaces three sequential regex scans (`_DEFAULT_EXPORT_PATTERN.finditer`, `_NAMED_EXPORT_PATTERN.finditer`, `_NAMED_BRACES_PATTERN.finditer`) with a single combined pattern (`_EXPORT_COMBINED_PATTERN`) using named capture groups, reducing regex engine overhead from ~3.1 ms to ~2.0 ms. Using `match.lastgroup` to determine the export type avoids repeatedly checking which group matched, cutting per-match inspection cost. Test results confirm the largest gains on cases with many exports (e.g., `test_braces_export_with_many_items_large` improved 1800%, `test_large_braces_export` improved 644%), where eliminating redundant scans compounds. Small regressions (4-15%) on trivial inputs like single default exports are negligible compared to the 72% overall runtime improvement.
2026-04-04 18:24:43 +00:00
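A minimal sketch of the single-pass idea described in this commit: one alternation with named groups, with `match.lastgroup` telling which branch matched. The pattern and function below are illustrative assumptions; the real `_EXPORT_COMBINED_PATTERN` is not shown in the commit and may differ.

```python
import re
from typing import Optional

# Hypothetical combined pattern: each alternative captures into its own named group.
EXPORT_COMBINED = re.compile(
    r"export\s+default\s+(?:class|function)\s+(?P<default>\w+)"
    r"|export\s+(?:class|function|const)\s+(?P<named>\w+)"
    r"|export\s*\{(?P<braces>[^}]*)\}"
)


def detect_export_style(source: str, identifier: str) -> Optional[str]:
    """Return "default", "named", or None after a single scan of the source."""
    for match in EXPORT_COMBINED.finditer(source):
        kind = match.lastgroup  # name of the group that matched
        value = match.group(kind)
        if kind == "braces":
            # "a, b as c" -> ["a", "b"]: compare against the local names
            names = [item.strip().split(" as ")[0].strip() for item in value.split(",")]
            if identifier in names:
                return "named"
        elif value == identifier:
            return kind
    return None
```

Because the alternatives are tried left to right at each position, `export default class X` is claimed by the first branch before the named-export branch can see it.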
claude[bot]
eaa2c7a493 style: auto-fix linting issues 2026-04-04 18:15:54 +00:00
codeflash-ai[bot]
dfdb5cc586
Optimize _detect_export_style_cached
The optimization removed the expensive `re.sub(r"\s+", " ", source_code)` normalization call (which accounted for 42% of runtime) and merged six separate regex scans into three: one combined pattern for default exports (`(?:class|function)`), one for named exports (`(?:class|function|const)`), and one for brace exports. Line profiler confirms whitespace normalization consumed 8.46 ms of the original 19.94 ms total, and the reduction from six `finditer` loops to three cuts redundant full-text traversals. The tests reveal no behavioral regressions, indicating the patterns already tolerated raw whitespace without normalization.
2026-04-04 18:13:23 +00:00
claude[bot]
5044d02f7f fix: remove duplicate cached function definitions in testgen.py
The optimization commit accidentally duplicated _detect_export_style_cached,
_render_system_template, and _render_user_template definitions, causing
mypy no-redef errors and dead code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:58:21 +00:00
codeflash-ai[bot]
64e09d6e85
Optimize build_javascript_prompt
The optimization wraps `_detect_export_style` (which scans source code with six separate regex `finditer` loops) and both Jinja template renders in `@lru_cache` decorators with bounded sizes (512/256 entries). Line profiler shows `_detect_export_style` consumed 97.7% of `_resolve_import` runtime and the two `get_template().render()` calls consumed 98.8% of `build_javascript_prompt` runtime; caching eliminates this redundant work when the same function_name/source_code or template arguments recur across test generation calls. The 22× speedup comes from avoiding repeated regex compilation/scanning and template parsing on cache hits, which are common in batch test-generation workflows where the same modules or prompts are processed multiple times.
2026-04-04 17:55:25 +00:00
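The caching described above can be sketched with `functools.lru_cache`. The function body here is a trivial stand-in (an assumption), not the real `_detect_export_style`; only the decorator usage and the bounded size of 512 come from the commit message.

```python
from functools import lru_cache


@lru_cache(maxsize=512)
def detect_export_style_cached(function_name: str, source_code: str) -> str:
    # Stand-in for the expensive regex scan; arguments must be hashable.
    if f"export default class {function_name}" in source_code:
        return "default"
    return "named"
```

Repeated calls with the same `(function_name, source_code)` pair are served from the cache, and `detect_export_style_cached.cache_info()` reports hits and misses, which is one way to check that batch workloads actually benefit.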
claude[bot]
e018c40bb5 fix: resolve mypy and ty type errors in test file
- Fix invalid FunctionToOptimize constructor args (qualified_name, line_number don't exist)
- Fix import to use aiservice.models.functions_to_optimize instead of core.shared.testgen_models
- Cast messages[1]["content"] to str to satisfy type checker
- Simplify effective_module_system assignment in testgen.py to fix mypy error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:45:25 +00:00
mohammed ahmed
da778302b3 Fix TypeScript tests using require() in CommonJS packages
**Problem:**
When generating tests for TypeScript files in CommonJS packages, the AI
service generated `require()` statements instead of `import`. TypeScript
test runners (@swc/jest, ts-jest) expect ESM import syntax in TypeScript
test files, regardless of the project's module system.

This caused SyntaxError when Jest tried to run the generated tests:
```
SyntaxError: Unexpected token, expected ","
```

**Root Cause:**
The js_import macro in _macros.md.j2 generates require() when
module_system != "esm". But TypeScript test files (.test.ts) should
ALWAYS use ESM import syntax because test runners like @swc/jest and
ts-jest expect it.

**Fix:**
- Added language parameter to build_javascript_prompt()
- For TypeScript tests, override effective_module_system to "esm"
- This ensures TypeScript tests always get import syntax
- JavaScript tests in CommonJS packages are unaffected

**Testing:**
- Added 2 regression tests in test_typescript_commonjs_import_bug.py
- All 32 existing testgen tests pass (no regressions)
- Verified TypeScript tests use import, JavaScript tests unchanged

**Trace IDs:**
- 30ddf51e-95aa-4aa2-a73b-b36058d6c275 (prefixed function)

**Impact:**
- Affects all TypeScript files in CommonJS packages
- Severity: HIGH
- Systematic bug (reproducible every time)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-04 17:40:44 +00:00
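The override described in the fix can be pictured roughly as follows. The helper names and the rendered import strings are hypothetical; the actual logic lives in `build_javascript_prompt()` and the `js_import` Jinja macro, which are not shown here.

```python
def effective_module_system(language: str, module_system: str) -> str:
    """TypeScript test files always get ESM syntax; JavaScript keeps the project's setting."""
    if language == "typescript":
        return "esm"
    return module_system


def render_import(name: str, path: str, language: str, module_system: str) -> str:
    # Simplified stand-in for the js_import macro.
    if effective_module_system(language, module_system) == "esm":
        return f"import {{ {name} }} from '{path}';"
    return f"const {{ {name} }} = require('{path}');"
```

For a CommonJS package, `render_import("add", "./math", "typescript", "commonjs")` yields an `import` statement, while the same call with `"javascript"` still yields `require()`.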
Aseem Saxena
c54904daf9
Merge pull request #2548 from codeflash-ai/fix/llm-close-errors
Fix: Handle LLM client close() errors gracefully
2026-04-03 14:37:35 -07:00
claude[bot]
681e8187ca fix: resolve mypy type errors in test file
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 19:21:57 +00:00
claude[bot]
2135849f27 style: auto-fix linting issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 19:20:57 +00:00
mohammed ahmed
4c4b497d2a Fix: Handle LLM client close() errors gracefully
**Issue**: The `/ai/optimization_review` endpoint was returning 500 errors
when trying to close LLM clients during event loop changes.

**Root Cause**: In `aiservice/llm.py` lines 96-99, the `close()` calls on
OpenAI and Anthropic clients were not wrapped in exception handlers. When
the httpx transport was already closed or in a bad state (e.g., event loop
closure, connection already closed), the exception would propagate and cause
the entire request to fail with a 500 error.

**Fix**: Wrapped both `openai_client.close()` and `anthropic_client.close()`
in try-except blocks that catch and log exceptions at DEBUG level. This
prevents transport errors from crashing requests while still attempting to
clean up resources properly.

**Impact**: Fixes 500 errors on `/ai/optimization_review` and other endpoints
that use the LLM client when event loops change or clients are in bad states.

**Testing**: Added `test_llm_client_close.py` with 2 test cases that verify:
1. Transport errors during close() are handled gracefully
2. Event loop closed errors are handled gracefully

**Traces**: 312d7392, 5bbdf214, a1325051
2026-04-03 19:19:19 +00:00
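A minimal sketch of the guarded cleanup described in this fix, assuming a client object with an async `close()` method. `close_quietly` and `BrokenClient` are illustrative names, not code from `aiservice/llm.py`.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def close_quietly(client, label: str) -> None:
    """Attempt cleanup, but never let a transport error fail the request."""
    if client is None:
        return
    try:
        await client.close()
    except Exception as exc:  # e.g. transport already closed, event loop closed
        logger.debug("Ignoring error while closing %s client: %s", label, exc)


class BrokenClient:
    """Stand-in for a client whose transport is already in a bad state."""

    async def close(self) -> None:
        raise RuntimeError("Event loop is closed")
```

Logging at DEBUG keeps the failure visible during investigation without adding noise to production logs.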
mohammed ahmed
b814e1e7e6
Merge pull request #2535 from codeflash-ai/fix/llm-client-event-loop-closure
fix: close old LLM clients when event loop changes
2026-04-03 19:00:36 +02:00
mohammed ahmed
b2debb96b7
Merge branch 'main' into fix/llm-client-event-loop-closure 2026-04-03 15:31:36 +02:00
Sarthak Agarwal
9bf81e7418
aiservice logs add and misc fix to track the errors (#2530)
# Pull Request Checklist

## Description
- [ ] **Description of PR**: Clear and concise description of what this
PR accomplishes
- [ ] **Breaking Changes**: Document any breaking changes (if
applicable)
- [ ] **Related Issues**: Link to any related issues or tickets

## Testing
- [ ] **Test cases Attached**: All relevant test cases have been
added/updated
- [ ] **Manual Testing**: Manual testing completed for the changes

## Monitoring & Debugging
- [ ] **Logging in place**: Appropriate logging has been added for
debugging user issues
- [ ] **Sentry will be able to catch errors**: Error handling ensures
Sentry can capture and report errors
- [ ] **Avoid Dev based/Prisma logging**: No development-only or
Prisma-specific logging in production code

## Configuration
- [ ] **Env variables newly added**: Any new environment variables are
documented in .env.example file or mentioned in description
---

## Additional Notes
<!-- Add any additional context, screenshots, or notes for reviewers
here -->

Co-authored-by: ali <mohammed18200118@gmail.com>
2026-04-03 16:50:45 +05:30
claude[bot]
35519b6e84 fix: resolve mypy type errors in test_llm_client.py 2026-04-03 07:25:08 +00:00
claude[bot]
20b0b01994 style: auto-fix linting issues
- ruff-format: reformat test file
- fix ty type error: cast mock clients to MagicMock for assert_called_once

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 07:23:09 +00:00
Codeflash Bot
322d8736c9 fix: close old LLM clients when event loop changes
This fixes a critical bug where old AsyncAzureOpenAI and AsyncAnthropicBedrock
clients were not being closed when the event loop changed, causing:

1. Connection pool exhaustion → "couldn't get a connection after 30.00 sec"
2. RuntimeError: Event loop is closed during httpx client cleanup

Root cause:
In LLMClient.call(), when the event loop changed, new clients were created
but old clients were not properly closed, leading to connection leaks.

Fix:
- Added await client.close() for both openai_client and anthropic_client
  before creating new instances
- Added comprehensive unit tests to verify proper cleanup

Impact:
- Resolves ~150+ test generation failures (500 errors)
- Fixes event loop closure errors in aiservice logs

Trace IDs affected: 04500fbd-88e0-44e4-8d20-32f6a0dc06cc (and many others)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-03 07:20:37 +00:00
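The leak described above comes from replacing a client without closing the old one. One way to sketch the corrected flow, assuming a factory that builds a client with an async `close()`; the class names are hypothetical and this is not the actual `LLMClient.call()` code.

```python
import asyncio


class FakeClient:
    """Stand-in for an async LLM client; only tracks whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class LoopAwareClient:
    """Recreate the client when the running event loop changes, closing the old one first."""

    def __init__(self, factory) -> None:
        self._factory = factory
        self._client = None
        self._loop = None

    async def get(self):
        loop = asyncio.get_running_loop()
        if self._client is None or self._loop is not loop:
            if self._client is not None:
                await self._client.close()  # release pooled connections
            self._client = self._factory()
            self._loop = loop
        return self._client
```

With a real HTTP client, closing an instance bound to a dead loop can itself raise, which is the case the later "Handle LLM client close() errors gracefully" commit addresses.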
Kevin Turcios
d04d0dbbd2
fix: remove middleware.ts conflicting with proxy.ts (#2534)
Removes middleware.ts added by #2532. Next.js 16 uses proxy.ts — having
both causes build failure.
2026-04-03 00:46:19 -05:00
Kevin Turcios
ba64b92eb4
fix: add /api/healthcheck to proxy.ts ignorePaths (#2533)
## Summary
- `/api/healthcheck` was returning 401 because `proxy.ts` requires auth
for all `/api/*` routes
- Application Gateway health probe got 401 → marked backend unhealthy →
**502 for all users**
- Adds `/api/healthcheck` to `ignorePaths` so it bypasses auth
- Also removes the erroneously added `middleware.ts` (Next.js 16 uses
`proxy.ts`)

## Test plan
- [ ] `/api/healthcheck` returns 200 without auth
- [ ] Authenticated routes still require login
- [ ] Application Gateway backend health shows Healthy
2026-04-03 00:44:28 -05:00
Kevin Turcios
081d0c15dd
fix: restore middleware.ts for Auth0 v4 healthcheck (#2532)
## Summary
- Auth0 v4 auto-generates middleware that protects all routes when no
`middleware.ts` exists
- This caused `/api/healthcheck` to return 401, making the Application
Gateway mark the backend as unhealthy → **502 for all users**
- Restores explicit middleware with Auth0 v4 API and excludes
`/api/healthcheck` from the matcher

## Test plan
- [ ] `/api/healthcheck` returns 200 without auth
- [ ] Authenticated routes still require login
- [ ] Application Gateway backend health shows Healthy
2026-04-02 23:37:13 -05:00
Kevin Turcios
5dca735fc8
Upgrade Next.js 14 → 16, React 18 → 19, and dependencies (#2385)
## Summary
- Upgrade Next.js 14.2 → 16.1, React 18 → 19, React DOM 18 → 19
- Upgrade @sentry/nextjs 9 → 10, @auth0/nextjs-auth0 3 → 4, ESLint 8 → 9
- Migrate all async request APIs (cookies, params, searchParams are now
Promises)
- Migrate middleware.ts → proxy.ts (Next.js 16 convention)
- Rewrite ESLint config for flat config format
- New Auth0Client setup with backward-compatible AUTH0_DOMAIN derivation
- Turbopack browser-only resolveAlias for web-tree-sitter Node.js stubs

## Test plan
- [ ] `npm run build` passes
- [ ] `npm run lint` passes (0 errors, warnings only from React Compiler
rules)
- [ ] `npm run type-check` passes
- [ ] `npm run dev` starts successfully with Turbopack
- [ ] Auth login/logout flow works end-to-end
- [ ] Verify `AUTH0_DOMAIN` or `AUTH0_ISSUER_BASE_URL` env var is set in
deployment
2026-04-02 22:38:01 -05:00
Kevin Turcios
c2feaf91f0
fix: return 422 for operational failures instead of 500 across all endpoints (#2528)
## Summary
- Return **422 Unprocessable Entity** instead of 500 for known
operational failures (LLM output parsing failures, no valid candidates
produced, invalid rankings, etc.) across all aiservice endpoints
- Keeps 500 for genuine internal errors (bare `except Exception`
catch-alls that could include DB/network failures)
- Adds `422` to Django-Ninja response schemas so the framework
serializes responses correctly

## Endpoints changed
| Endpoint | Failure type | Old | New |
|---|---|---|---|
| `/ai/testgen` | `TestGenerationFailedError`, `ParserSyntaxError` | 500 | 422 |
| `/ai/optimize` | No valid candidates generated | 500 | 422 |
| `/ai/optimize-line-profiler` | No optimizations generated | 500 | 422 |
| `/ai/adaptive_optimize` | LLM parse error, no candidate | 500 | 422 |
| `/ai/code_repair` | LLM error, `ParserSyntaxError`, `ValidationError` | 500 | 422 |
| `/ai/rank` | Invalid ranking from LLM | 500 | 422 |
| `/ai/explain` | LLM failure, XML parse failure | 500 | 422 |
| `/ai/optimization_review` | JSON parse failure, no JSON block | 500 | 422 |

## Why
These endpoints were returning 500 for expected outcomes (e.g., LLM
returning unparseable output), which triggered Azure 5xx alerts and
inflated error metrics. 422 correctly signals that the request was
understood but the server couldn't produce a valid result.

## Test plan
- [x] `uv run pytest -x -q -k "optimizer or rank or explain or
code_repair or review"` — 199 passed
- [ ] Verify Azure 5xx alert rate drops after deploy

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 19:37:27 -05:00
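The split between expected failures and genuine internal errors can be sketched as a small mapping. The exception classes below are hypothetical stand-ins defined only for this example; the real endpoints use Django-Ninja response schemas, which are not reproduced here.

```python
class OperationalFailure(Exception):
    """Hypothetical base for expected outcomes, e.g. unparseable LLM output."""


class ParserSyntaxError(OperationalFailure):
    """Stand-in for a parse failure on generated code."""


class NoValidCandidatesError(OperationalFailure):
    """Stand-in for 'the LLM produced nothing usable'."""


def status_for(exc: Exception) -> int:
    """422 for known operational failures, 500 for everything else."""
    return 422 if isinstance(exc, OperationalFailure) else 500
```

Keeping unknown exceptions on 500 means DB or network failures still trigger 5xx alerts.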
Kevin Turcios
e7f4bb40b3
perf: lazy debug_log_sensitive_data to skip model_dump_json in production (#2527)
## Summary

- Convert
`debug_log_sensitive_data(f"...{response.model_dump_json(indent=2)}")`
to `debug_log_sensitive_data_from_callable(lambda: ...)` across 8
endpoint files
- In production, `debug_log_sensitive_data` is a no-op but the f-string
interpolation (including `model_dump_json(indent=2)`) was always
evaluated — serializing the full LLM response to JSON on every call
- The `_from_callable` variant only invokes the lambda when debug
logging is active (non-production)
- **Fix pre-existing bug**: `log_response()` closures in 4 endpoint
files returned `None` instead of a string, causing
`debug_log_sensitive_data_from_callable` to log `None`. Now they return
the concatenated log string as expected by the callable-based API.

Affected endpoints: Python optimizer, line profiler, jit_rewrite, Java
optimizer, Java line profiler, JS/TS optimizer, JS/TS line profiler,
testgen.

## Test plan

- [x] All 558 unit tests pass
- [x] mypy clean
- [x] ruff clean
- [ ] Verify debug logging still works in non-production environments

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-04-02 19:37:25 -05:00
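The difference between the eager and the callable-based logger can be sketched as below. The signature (an explicit `enabled` flag and a `sink` list) is an assumption made so the example is self-contained; the real `debug_log_sensitive_data_from_callable` presumably reads the environment itself.

```python
from typing import Callable, List


def debug_log_from_callable(
    build_message: Callable[[], str], *, enabled: bool, sink: List[str]
) -> None:
    """Only build the (possibly expensive) message when debug logging is active."""
    if not enabled:
        return  # the callable is never invoked, so no serialization cost
    sink.append(build_message())
```

With an f-string argument, the message is built before the logger is even called. With a lambda, building is deferred, but the callable must return the string; returning `None` is the pre-existing bug this PR also fixes.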
Kevin Turcios
0029a0e76e
perf: optimize postprocessing pipeline — eliminate redundant CST codegen (#2526)
## Summary

- Replace Pydantic frozen dataclass with stdlib
`@dataclass(frozen=True)` for `CodeExplanationAndID` and
`CodeAndExplanation`, removing `field_validator` that ran `.code` +
`compile()` ~280 times per pipeline run
- Pre-compute `original_module.code` once and pass to pipeline steps
(`clean_extraneous_comments`, `equality_check`) that previously called
it independently
- Replace `ast.dump(annotate_fields=False)` with `ast.unparse` in
`deduplicate_optimizations` (70% faster)
- Skip re-parse in `dedup_and_sort_imports` when isort returns unchanged
code
- Cache comment-stripped original code across candidates in
`clean_extraneous_comments`

**Pipeline median per-run: ~1.5s → 184ms** (4 candidates, controlled
measurement). Saves ~4-5s of CPU per optimization request in production.

## Test plan

- [x] All 558 unit tests pass
- [x] mypy clean
- [x] ruff clean (no new warnings)
- [ ] Verify optimizer endpoints return correct results in staging

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-04-02 19:37:15 -05:00
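As a rough illustration of the `ast.unparse` step, candidates that differ only in formatting or comments can be collapsed by comparing their unparsed form. This is a sketch under that assumption, not the actual `deduplicate_optimizations` implementation.

```python
import ast
from typing import Iterable, List


def deduplicate_sources(candidates: Iterable[str]) -> List[str]:
    """Keep the first candidate for each distinct normalized source."""
    seen = set()
    unique = []
    for source in candidates:
        # ast.unparse drops comments and normalizes whitespace
        key = ast.unparse(ast.parse(source))
        if key not in seen:
            seen.add(key)
            unique.append(source)
    return unique
```

`ast.unparse` requires Python 3.9 or later.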
Kevin Turcios
d0e97992d6
perf: fire-and-forget logging to reduce response latency 100-300ms (#2525)
## Summary

- Move `safe_log_features()` and `update_optimization_cost()` out of
blocking `TaskGroup`s into fire-and-forget background tasks across 4
optimization endpoints (optimizer, optimizer_line_profiler, jit_rewrite,
adaptive_optimizer)
- These DB writes are analytics-only and don't affect response bodies —
waiting for them adds 100-300ms per request unnecessarily
- Add `aiservice/background.py` with `fire_and_forget()` helper using
the same `set` + `add_done_callback` pattern already used in `LLMClient`
- `get_or_create_optimization_event()` remains awaited where the
response needs `event.id`

## Test plan

- [x] All 550 tests pass locally
- [ ] Verify response latency improvement in production metrics after
deploy
- [ ] Confirm `safe_log_features` and `update_optimization_cost` still
complete successfully in background (check DB records)

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 11:37:52 -05:00
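The `set` plus `add_done_callback` pattern mentioned above is commonly written as follows. This is a generic sketch, not the contents of `aiservice/background.py`; the `demo` coroutine exists only to exercise the helper.

```python
import asyncio

# Strong references keep tasks alive until they finish; the event loop
# itself only holds weak references to tasks.
_background_tasks: set = set()


def fire_and_forget(coro) -> asyncio.Task:
    """Schedule a coroutine without awaiting it in the request path."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def demo() -> tuple:
    written = []

    async def write_analytics() -> None:
        await asyncio.sleep(0)
        written.append("logged")

    task = fire_and_forget(write_analytics())
    tracked_while_running = task in _background_tasks
    await task  # only for the demo; a real handler would return immediately
    await asyncio.sleep(0)  # give done-callbacks a chance to run
    return written, tracked_while_running, len(_background_tasks)
```

A fire-and-forget task should handle its own exceptions, since nothing awaits its result.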
mohammed ahmed
de0f30ae15
Fix: Strip .js extensions from vi.mock() calls in Vitest tests (#2524)
## Summary

Vitest tests were failing with "Cannot find module" errors because
`vi.mock()` calls retained `.js` extensions while imports had them
stripped, causing mock/import path mismatch in ESM mode.

## Root Cause

The `strip_js_extensions()` function in `testgen.py` only handled
`jest.mock()` but not `vi.mock()`, which is used by Vitest. The pattern
`_JEST_MOCK_EXTENSION_PATTERN` matched Jest mocking functions but not
Vitest's `vi.*` equivalents.

## Fix

Added `_VITEST_MOCK_EXTENSION_PATTERN` regex to match and strip
extensions from:
- `vi.mock()`
- `vi.doMock()`
- `vi.unmock()`
- `vi.requireActual()`
- `vi.requireMock()`
- `vi.importActual()`
- `vi.importMock()`

## Affected Trace IDs

- `0fe99c9f-b348-4f0a-b051-0ea9455231ba`
- `127cdaec-a343-4918-a86a-b646dd4d79cf`
- `2b6c896e-20d7-4505-8bf4-e4a2f20b37fc`

These trace IDs exhibited the bug where generated tests had
`vi.mock('../config/paths.js')` but imports had `from
'../config/paths'`, causing module resolution failures.

## Test Coverage

- Added 8 new tests in `TestStripJsExtensions` class
- All 31 tests in `test_testgen_javascript.py` pass
- Specific regression test for vi.mock() extension stripping
- Tests cover all vi.mock variants and edge cases

## Files Changed

- `django/aiservice/core/languages/js_ts/testgen.py` (fix)
- `django/aiservice/tests/testgen/test_testgen_javascript.py` (tests)

---------

Co-authored-by: Codeflash Bot <codeflash-bot@codeflash.ai>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Sarthak Agarwal <sarthak.saga@gmail.com>
2026-04-02 21:50:45 +05:30
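A regex in the spirit of `_VITEST_MOCK_EXTENSION_PATTERN` might look like the sketch below. The exact pattern in `testgen.py` is not shown in the PR, so this one is an assumption built from the listed `vi.*` functions.

```python
import re

# Hypothetical pattern: vi.<fn>( '<path>.js' ) with either quote style.
VITEST_MOCK_EXTENSION = re.compile(
    r"""(\bvi\.(?:mock|doMock|unmock|requireActual|requireMock|importActual|importMock)\(\s*['"])([^'"]+?)\.js(['"])"""
)


def strip_vitest_mock_extensions(code: str) -> str:
    """Drop a trailing .js from module paths inside vi.* mock calls."""
    return VITEST_MOCK_EXTENSION.sub(r"\1\2\3", code)
```

The closing quote right after `\.js` keeps paths such as `./data.json` untouched.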
mohammed ahmed
179302d006
Fix test generation replay 500 error when arrays contain None values (#2521)
## Summary

Fixes 500 Internal Server Error when replaying test generation with
`--rerun` flag and database arrays contain `None`/`NULL` values.

## Root Cause

The `rerun_testgen()` function in `core/shared/replay.py` accessed array
elements without checking if they were `None`. When PostgreSQL arrays
contained `NULL` values (e.g., `generated_test = [NULL, 'test2']`), the
function returned a `TestGenResponseSchema` with `None` values, causing
Pydantic validation to fail:

```
pydantic_core._pydantic_core.ValidationError: 2 validation errors for TestGenResponseSchema
generated_tests
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
instrumented_behavior_tests
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
```

## Changes

Added explicit `None` checks before creating `TestGenResponseSchema`:
- If `generated_test[index]` or `instrumented_generated_test[index]` is
`None`, return `None` (skip this test)
- If `instrumented_perf_test[index]` is `None`, default to empty string
(non-critical field)

## Impact

Resolves **10+ replay failures** where test generation produced partial
results stored as `NULL` in database arrays.

## Test Coverage

Added comprehensive test suite for `replay.py`:
- `test_rerun_with_valid_test_data()` - Happy path
- `test_rerun_with_none_values_in_arrays()` - **Primary bug fix test**
- `test_rerun_with_index_out_of_bounds()` - Boundary conditions
- `test_rerun_with_empty_arrays()` - Empty data handling
- `test_rerun_with_none_arrays()` - NULL arrays
- `test_rerun_with_mismatched_array_lengths()` - Length mismatches
- `test_rerun_missing_perf_test()` - Missing perf data

All 7 tests pass.

## Trace IDs

This fix addresses errors seen in traces:
- Primary: `056561cc-94af-4d7b-ac79-85dfd4b7282d`
- And 9 additional trace IDs with the same "500 - Error generating
JavaScript tests" error

## Verification

Tested with original failing trace:
```bash
cd /workspace/target && codeflash --file src/daemon/constants.ts --function formatGatewayServiceDescription --rerun 056561cc-94af-4d7b-ac79-85dfd4b7282d
```

**Before fix:** `ERROR: 500 - Traceback... ValidationError: Input should
be a valid string [type=string_type, input_value=None]`
**After fix:** Gracefully skips None entries, no 500 error 

---------

Co-authored-by: Codeflash Bot <codeflash-bot@codeflash.ai>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 21:49:58 +05:30
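The `None` handling described under "Changes" could be sketched like this. `ReplayedTest` is a hypothetical stand-in for `TestGenResponseSchema`; the third field name and the out-of-bounds behaviour are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ReplayedTest:
    generated_tests: str
    instrumented_behavior_tests: str
    instrumented_perf_tests: str  # assumed field name


def replayed_test_at(
    generated: Optional[Sequence[Optional[str]]],
    instrumented: Optional[Sequence[Optional[str]]],
    perf: Optional[Sequence[Optional[str]]],
    index: int,
) -> Optional[ReplayedTest]:
    generated, instrumented, perf = generated or [], instrumented or [], perf or []
    if index >= len(generated) or index >= len(instrumented):
        return None  # nothing stored at this position
    if generated[index] is None or instrumented[index] is None:
        return None  # skip partial results stored as NULL
    has_perf = index < len(perf) and perf[index] is not None
    return ReplayedTest(generated[index], instrumented[index], perf[index] if has_perf else "")
```

Returning `None` for a missing required field avoids constructing a schema instance that would fail validation.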
Kevin Turcios
d504f111a7
fix: plug memory leak from LogRecord buffering and unblock async event loop (#2523)
## Summary

- **Memory leak fix**: Added explicit `LOGGING` config in `settings.py`
to prevent unbounded `LogRecord` buffering. Django's `django.request`
logger creates WARNING records for 4xx responses with the full
`ASGIRequest` (headers, body, payload) pinned in `args`. Without
explicit config, Django's default handlers and Sentry's
`enable_logs=True` buffer these indefinitely. Setting `django.request`
to ERROR level + removing `enable_logs=True` eliminated the leak — load
testing showed **84% reduction** in per-request memory growth (7.4 → 1.2
KiB/req).

- **Async event loop fix**: Wrapped
`parse_and_generate_candidate_schema()` in `asyncio.to_thread()` across
all 4 async callers (optimizer, optimizer_line_profiler, jit_rewrite,
adaptive_optimizer). This offloads the synchronous libcst parsing +
8-stage postprocessing pipeline to the thread pool, preventing it from
blocking the event loop during peak traffic.

## Test plan

- [x] All 550 tests pass (`uv run pytest tests/ --ignore=tests/profiling
-x -q`)
- [ ] Monitor Azure memory alerts after deploy — expect significant
reduction in memory growth rate
- [ ] Monitor 5xx error rate during peak traffic — expect reduction from
event loop no longer blocked by sync postprocessing

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 10:57:58 -05:00
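To illustrate the event-loop part of this fix: a synchronous, CPU- or I/O-heavy function awaited via `asyncio.to_thread()` lets other coroutines keep running. The function below is a stand-in for `parse_and_generate_candidate_schema()`, and the heartbeat exists only to show that the loop stays responsive.

```python
import asyncio
import time


def parse_and_postprocess(source: str) -> str:
    """Stand-in for synchronous parsing plus postprocessing."""
    time.sleep(0.05)  # would block the event loop if called directly
    return source.strip()


async def handle_request(source: str) -> tuple:
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    beat = asyncio.create_task(heartbeat())
    try:
        # Runs in the default thread pool; the loop keeps serving heartbeat()
        result = await asyncio.to_thread(parse_and_postprocess, source)
    finally:
        beat.cancel()
    return result, ticks
```

Calling `parse_and_postprocess(source)` directly inside the coroutine would leave `ticks` at zero for the whole duration of the call.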
Kevin Turcios
df90110fe8
fix: prevent log_features from 500ing optimization endpoints (#2518)
## Summary

- **`thread_sensitive=False`** on `sync_to_async` so concurrent
`log_features` calls get their own threads instead of serializing
through one (was `True`, causing a bottleneck)
- **Raised DB pool `max_size` from 10 to 100** — prod Postgres allows
859 connections, giving plenty of headroom
- **Added `safe_log_features` wrapper** that catches errors via Sentry
instead of propagating — used at all 9 TaskGroup and bare-await call
sites so a logging failure can't crash an otherwise successful
optimization endpoint
- **Kept `transaction.atomic` + `select_for_update`** for correctness
(Django doesn't support async transactions yet, and removing these
causes lost-update races on dict-merge fields)

## Root cause

`log_features` uses `@sync_to_async` + `@transaction.atomic` because
Django lacks async transaction support. The previous fix for pool
exhaustion changed `thread_sensitive=False` to `True`, which serialized
all calls through a single thread — fixing pool exhaustion but creating
a throughput bottleneck that caused 500s under load. Additionally, 6
call sites used `asyncio.TaskGroup` where any `log_features` exception
would propagate and crash the entire endpoint.

## Test plan

- [x] `tests/log_features/test_log_features_concurrency.py` — verifies
`thread_sensitive=False` and `safe_log_features` is async
- [x] `ruff check` passes on all changed files
- [ ] Deploy to staging and verify no 500s under concurrent optimization
requests
2026-04-02 06:51:20 -05:00
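The `safe_log_features` idea can be sketched as a wrapper that swallows and reports errors. Here standard `logging` stands in for Sentry, and the failing coroutine is a made-up example; neither is the project's actual code.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def safe_log_features(log_call, *args, **kwargs) -> None:
    """Run an analytics write; report failures instead of propagating them."""
    try:
        await log_call(*args, **kwargs)
    except Exception:
        # The PR reports to Sentry here; logging is a stand-in.
        logger.exception("log_features failed; continuing without analytics")


async def failing_log_features(trace_id: str) -> None:
    raise RuntimeError("couldn't get a connection after 30.00 sec")


async def endpoint(trace_id: str) -> str:
    await safe_log_features(failing_log_features, trace_id)
    return "ok"  # the optimization result is still returned
```

Inside a `TaskGroup`, this matters even more, because one unhandled exception cancels the sibling tasks.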
mohammed ahmed
c4222a4aeb
Merge pull request #2508 from codeflash-ai/fix/js-import-resolution-detect-export-style
Fix JS/TS import resolution to detect export style from source code
2026-04-02 10:51:10 +02:00
claude[bot]
76c605c6d9 style: auto-fix linting issues 2026-04-01 17:22:46 +00:00
mohammed ahmed
868a9d5d37
Merge pull request #2511 from codeflash-ai/codeflash/optimize-pr2508-2026-04-01T17.18.19
Speed up function `_resolve_import` by 1,027% in PR #2508 (`fix/js-import-resolution-detect-export-style`)
2026-04-01 19:20:42 +02:00
codeflash-ai[bot]
dd518c18aa
Optimize _resolve_import
The optimization hoisted the 70-element `reserved_words` set out of `_is_valid_js_identifier` into a module-level `frozenset`, eliminating 1677 repeated set constructions that took 1.79 ms according to the line profiler (42% of that function's time). More significantly, `_detect_export_style` previously compiled six regex patterns on every invocation via f-string interpolation with `escaped_id`; the optimized version pre-compiles generic patterns once at module load and uses `finditer` plus manual identifier comparison, cutting the function's runtime from 3.17 s to 14.7 ms across 1146 calls. That 99.5% reduction accounts for nearly all of the 10× speedup. Test annotations confirm the largest gains occur in the `test_large_scale_many_class_methods_with_alternating_export_styles` case (107 ms → 4.66 ms), where repeated export detection dominated.
2026-04-01 17:18:23 +00:00
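Both changes follow the same principle: build constants once at import time. A sketch, with a deliberately partial reserved-word list and a simplified ASCII-only identifier pattern (both assumptions; the real code has about 70 reserved words and six export patterns):

```python
import re

# Built once at module load instead of on every call.
RESERVED_WORDS = frozenset({"class", "const", "default", "export", "function", "import", "return"})
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*\Z")

# Generic pattern: capture any exported name, then compare in Python,
# rather than compiling a new pattern per identifier.
NAMED_EXPORT = re.compile(r"export\s+(?:class|function|const)\s+([A-Za-z_$][\w$]*)")


def is_valid_js_identifier(name: str) -> bool:
    return bool(IDENTIFIER.match(name)) and name not in RESERVED_WORDS


def has_named_export(source: str, identifier: str) -> bool:
    return any(m.group(1) == identifier for m in NAMED_EXPORT.finditer(source))
```

Comparing the captured name with `==` also avoids the need to escape the identifier, which the per-call f-string patterns required.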
claude[bot]
534d0317b1 style: auto-fix linting issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 17:10:38 +00:00
ali
bb747096c8
fix existing unit tests 2026-04-01 19:07:25 +02:00
claude[bot]
83b37c7337 style: auto-fix linting issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 16:26:09 +00:00
Codeflash Bot
9ec5eb7c5a Fix JS/TS import resolution to detect export style from source code
When generating tests for class methods (e.g., ModulesContainer.getById),
the test generator was incorrectly assuming default import style, generating:
  import ModulesContainer from '...'

This caused "Cannot find module" errors when:
1. The class was not exported at all
2. The class used named export (export class X) instead of default export

This fix:
- Adds _detect_export_style() to parse source code and detect actual export style
- Modifies _resolve_import() to use detected export style:
  - 'export default class X' → default import
  - 'export class X' → named import
  - No export → named import (test will fail, surfacing the issue)
- Adds comprehensive unit tests for all scenarios

Affected traces: 12332328-80e8-4bde-bdd6-c76ac373675a, 73ccd4c6-a4f7-467a-8356-5199e9d9b877, 989dcbda-bc27-40b7-aed0-0ab51fd00e6d, and others with ERR_MODULE_NOT_FOUND

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-01 16:23:17 +00:00
Kevin Turcios
0abc6bf1e3
async: parallelize endpoint epilogue DB writes (#2490)
## Summary

Parallelize independent DB writes at the end of 4 endpoints using
`asyncio.TaskGroup`. With psycopg3 connection pooling (#2489), each task
gets its own connection from the pool.

### Endpoints optimized

| Endpoint | Before | After |
|----------|--------|-------|
| **Refinement** | `log_features` then `update_optimization_cost` | `TaskGroup` (concurrent) |
| **Explanations** | `update_optimization_cost` inside inner fn | Moved to handler, `TaskGroup` with `log_features` |
| **Optimization review** | `update_optimization_cost` inside inner fn | Moved to handler, `TaskGroup` with `update_optimization_features_review` |
| **Ranker** | `update_optimization_cost` inside inner fn | Moved to handler, `TaskGroup` with `log_features` |

Each endpoint saves ~87ms (one DB round-trip) by overlapping two
independent writes.

### Comprehensive audit

All 13 endpoints were audited — no remaining async antipatterns found:
- No blocking calls in async paths
- No `await`-in-loop patterns
- LLM clients already use connection reuse
- All other endpoints have at most 1 DB write in the epilogue

## Test plan

- [x] All 538 tests passing
- [ ] Verify under load in staging

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
2026-04-01 06:15:16 -05:00
mohammed ahmed
2887b34d02
chore: clean up codeflash JS workflow (#2499)
## Summary
- Normalize quote style to double quotes for YAML consistency
- Remove redundant `jest-junit` runtime install step (already in
devDependencies)
- Simplify codeflash CLI flags: `--all --verbose --yes` → `--yes`

## Test plan
- [ ] Verify workflow runs successfully on a test PR touching
`js/cf-api/` or `js/cf-webapp/`
- [ ] Confirm `npm ci` installs jest-junit from package-lock without the
extra install step

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Kevin Turcios <106575910+KRRT7@users.noreply.github.com>
2026-04-01 04:23:57 -05:00
mohammed ahmed
8d987de65c
Fix TypeScript validator to support JSX/TSX syntax (#2503)
## Summary

The TypeScript validator was rejecting valid JSX/TSX syntax, causing
optimization runs to fail on React components with JSX.

## Problem

The validator was using `tree_sitter_typescript.language_typescript()`
which doesn't parse JSX syntax. This caused validation failures for
`.tsx` files containing JSX elements like:
- `<div className={...} />`
- `{...rest}` (spread props)
- Any JSX tags

## Solution

Changed to use `tree_sitter_typescript.language_tsx()` instead. Since
TSX is a superset of TypeScript, this supports both:
- Plain TypeScript code
- TypeScript with JSX (TSX)

## Testing

Added three new test cases:
- `test_tsx_simple_jsx` - Tests basic JSX elements
- `test_tsx_nested_jsx` - Tests nested JSX
- `test_tsx_with_props_spread` - Tests spread props in JSX

All existing tests continue to pass.

## Impact

This fixes validation errors for all React/JSX components. Affected
trace IDs from logs:
- 5bedfbb7-ccc0-4fdd-b208-60b8b860750c
- 39892d42-774f-4921-80fc-2ee42ff8ae1c
- 80b818b6-e784-4ff8-abda-c3ce6b25422f
- 9b76943e-1a93-45fa-84b9-aae7d6305f79
- d1bac014-d622-4772-90ea-0f9ff88e32dd

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Codeflash Bot <codeflash-bot@codeflash.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-01 04:20:41 -05:00
mohammed ahmed
de4a22d549
Merge pull request #2502 from codeflash-ai/fix/postgres-pool-exhaustion-thread-sensitive
Fix PostgreSQL connection pool exhaustion in log_features
2026-04-01 02:11:26 +02:00
claude[bot]
8629ac756e fix: resolve mypy type errors in test file
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 23:54:53 +00:00
claude[bot]
f530a1e562 style: auto-fix linting issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 23:53:58 +00:00
Codeflash Bot
8734f5a0f8 Fix PostgreSQL connection pool exhaustion in log_features
Bug: PostgreSQL connection pool timeout (30 seconds)
Root cause: log_features uses @sync_to_async(thread_sensitive=False), causing
each call to grab a separate database connection from the pool. When multiple
optimization requests run concurrently, the pool (max_size=10) exhausts.

Error seen: psycopg_pool.PoolTimeout: couldn't get a connection after 30.00 sec

Fix: Change thread_sensitive=False to thread_sensitive=True. This ensures Django
properly reuses connections across async/sync boundaries instead of allocating
a new connection for each call.

Affected trace IDs from logs:
- a0d8dab6-6524-47dc-9c82-5fa92e6390fb
- 62f5c35b-7161-4ab0-958a-4865231f5188
- ddc0e882-f914-49e4-a2ac-2d5f19a17507
- eaeb0cbe-6474-4808-9092-42f837dd52cf

Testing:
- Added test_log_features_concurrency.py to verify thread_sensitive=True
- Verified reproduction script now passes without pool exhaustion
- All existing tests pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-31 23:49:51 +00:00
mohammed ahmed
1908325dc8
feat: add rerun trace support to aiservice endpoints (#2493)
## Summary
- Adds `rerun_trace_id` field to all request schemas (`OptimizeSchema`,
`OptimizeSchemaLP`, `TestGenSchema`, `RefinementRequestSchema`,
`CodeRepairRequestSchema`)
- Creates `core/shared/replay.py` with shared rerun logic that queries
`optimization_features` and returns stored results
- Adds early-return short-circuit to `/optimize`,
`/optimize-line-profiler`, `/testgen`, `/refinement`, `/code_repair` —
bypasses LLM calls when `rerun_trace_id` is provided
- Filters results by `optimizations_origin.source` (OPTIMIZE,
OPTIMIZE_LP, REFINE, REPAIR) and matches by parent optimization ID for
refinement/repair

## Test plan
- [ ] Run optimization normally to populate `optimization_features` with
a trace_id
- [ ] Rerun with `codeflash --rerun <trace_id>` against local server
- [ ] Verify each endpoint returns stored results without LLM calls
- [ ] Verify backward compatibility — requests without `rerun_trace_id`
behave unchanged

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Sarthak Agarwal <sarthak.saga@gmail.com>
2026-03-29 18:16:13 +05:30
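The early-return short-circuit might be pictured as below. The row layout, function names, and the `call_llm` parameter are all hypothetical; only the `rerun_trace_id` field and the source labels such as `OPTIMIZE` come from the PR description.

```python
from typing import Callable, Dict, List


def stored_results(rows: List[Dict[str, str]], trace_id: str, source: str) -> List[str]:
    """Return stored code for one trace, filtered by where it originated."""
    return [r["code"] for r in rows if r["trace_id"] == trace_id and r["source"] == source]


def optimize(request: Dict[str, str], rows: List[Dict[str, str]], call_llm: Callable) -> List[str]:
    trace_id = request.get("rerun_trace_id")
    if trace_id is not None:
        # Replay path: no LLM call is made
        return stored_results(rows, trace_id, "OPTIMIZE")
    return call_llm(request)
```

Requests without `rerun_trace_id` fall through to the normal path, which is how backward compatibility is preserved.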
mohammed ahmed
2612a56994
Merge pull request #2497 from codeflash-ai/chore/make-health-check-public-in-cfapi
Chore: expose healthcheck endpoint publicly for cf-api
2026-03-29 09:41:48 +02:00
mohammed ahmed
dac9989e43
Merge branch 'main' into chore/make-health-check-public-in-cfapi 2026-03-29 09:28:05 +02:00
mohammed ahmed
6aca6a83ad
Merge pull request #2498 from codeflash-ai/mohammedahmed18-patch-1
chore: packages read access for gh action
2026-03-29 09:27:29 +02:00