Kevin Turcios df90110fe8
fix: prevent log_features from 500ing optimization endpoints (#2518)
## Summary

- **`thread_sensitive=False`** on `sync_to_async` so concurrent
`log_features` calls get their own threads instead of serializing
through one (was `True`, causing a bottleneck)
- **Raised DB pool `max_size` from 10 to 100** — prod Postgres allows
859 connections, giving plenty of headroom
- **Added `safe_log_features` wrapper** that catches errors via Sentry
instead of propagating — used at all 9 TaskGroup and bare-await call
sites so a logging failure can't crash an otherwise successful
optimization endpoint
- **Kept `transaction.atomic` + `select_for_update`** for correctness
(Django doesn't support async transactions yet, and removing these
causes lost-update races on dict-merge fields)
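The threading difference in the first bullet can be illustrated with a standard-library analogy. This is only a sketch: it does not use `asgiref` or Django. A single-worker `ThreadPoolExecutor` stands in for the one shared thread used when `thread_sensitive=True`, a multi-worker pool stands in for `thread_sensitive=False`, and `blocking_write`, `run_calls`, and `distinct_threads` are hypothetical names invented for the illustration.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_write(seen: set) -> None:
    """Hypothetical stand-in for a blocking DB write; records its thread."""
    seen.add(threading.get_ident())
    time.sleep(0.05)  # simulate time spent waiting on the database


async def run_calls(executor: ThreadPoolExecutor, n: int = 4) -> int:
    """Issue n concurrent calls and return how many threads served them."""
    loop = asyncio.get_running_loop()
    seen: set = set()
    await asyncio.gather(
        *(loop.run_in_executor(executor, blocking_write, seen) for _ in range(n))
    )
    return len(seen)


def distinct_threads(max_workers: int) -> int:
    """Run the concurrent calls on a pool of the given size."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return asyncio.run(run_calls(executor))
```

With one worker, every call queues behind the previous one on the same thread; with several workers, the calls overlap, which is why each one also needs its own DB connection from the pool.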

## Root cause

`log_features` uses `@sync_to_async` + `@transaction.atomic` because
Django lacks async transaction support. The previous fix for pool
exhaustion changed `thread_sensitive=False` to `True`, which serialized
all calls through a single thread — fixing pool exhaustion but creating
a throughput bottleneck that caused 500s under load. Additionally, 6
call sites used `asyncio.TaskGroup` where any `log_features` exception
would propagate and crash the entire endpoint.

## Test plan

- [x] `tests/log_features/test_log_features_concurrency.py` — verifies
`thread_sensitive=False` and `safe_log_features` is async
- [x] `ruff check` passes on all changed files
- [ ] Deploy to staging and verify no 500s under concurrent optimization
requests
2026-04-02 06:51:20 -05:00
.claude fix: rename skill files to be Windows-compatible 2026-02-17 05:01:30 +00:00
.codex fix: rename skill files to be Windows-compatible 2026-02-17 05:01:30 +00:00
.gemini fix: rename skill files to be Windows-compatible 2026-02-17 05:01:30 +00:00
.github async: parallelize endpoint epilogue DB writes (#2490) 2026-04-01 06:15:16 -05:00
.idea more cleanup 2026-01-28 22:23:54 +02:00
.tessl chore: switch tessl to managed mode (#2491) 2026-03-27 07:21:14 -05:00
.vscode Revert "CF-1041 observability v2 " need more changes and testing (#2375) 2026-02-06 01:18:17 +05:30
cli/code-to-optimize codeflash-omni-java (#2335) 2026-02-13 23:26:55 +05:30
deployment/onprem-simple local setup (#1898) 2025-11-17 12:35:09 -08:00
django fix: prevent log_features from 500ing optimization endpoints (#2518) 2026-04-02 06:51:20 -05:00
experiments move sqlalchemy, gen_inspired_tests, and mistral 2025-12-30 15:22:39 -05:00
js make healthcheck public in cfapi 2026-03-29 09:09:48 +02:00
tiles test: add evals for all three Tessl tiles 2026-02-14 22:25:30 -05:00
.dockerignore local setup (#1898) 2025-11-17 12:35:09 -08:00
.editorconfig consistency in formatting across ide & js projs (#1499) 2025-03-04 23:52:45 +00:00
.gitattributes chore: add gh-aw duplicate code detector workflow (#2418) 2026-02-14 18:14:55 -05:00
.gitignore chore: switch tessl to managed mode (#2491) 2026-03-27 07:21:14 -05:00
.gitmodules add the new submodule at cli 2025-02-13 01:05:51 -05:00
.mcp.json feat: add Tessl tiles for codeflash-internal (rules, docs, skills) 2026-02-14 22:16:33 -05:00
.pre-commit-config.yaml add pre-k GHA 2025-12-30 01:17:05 -05:00
.prettierrc Rename some auth0 mgmt things 2023-12-10 17:40:07 -08:00
AGENTS.md feat: add Tessl tiles for codeflash-internal (rules, docs, skills) 2026-02-14 22:16:33 -05:00
CLAUDE.md feat: add Tessl tiles for codeflash-internal (rules, docs, skills) 2026-02-14 22:16:33 -05:00
lefthook.yml saga4 misc fixes (#2018) 2025-11-14 20:23:58 -08:00
mypy.ini Byesian analysis implementation 2025-01-17 17:44:24 -08:00
package-lock.json Initial js support in aiservice 2026-01-14 22:15:27 -08:00
package.json Initial js support in aiservice 2026-01-14 22:15:27 -08:00
README.md Update README.md 2024-12-27 11:55:26 -08:00
secretlint.config.js add secret scanner and monorepo hook (#1201) 2024-11-09 14:23:39 +00:00
tessl.json chore: switch tessl to managed mode (#2491) 2026-03-27 07:21:14 -05:00

CodeFlash MonoRepo

Here are the projects that are part of the CodeFlash MonoRepo:

  • CodeFlash Client - /cli/
  • CodeFlash Python Django ai service - /django/aiservice
  • CodeFlash NodeJS CF API - /js/cf-api
  • CodeFlash Webapp - /js/cf-webapp

Project Setup

Prerequisites

  • Node.js and npm: Ensure Node.js is installed and npm is set up, so that the pre-commit hook (Lefthook) can be installed.
  • Python and Mamba: Ensure Python is installed and Mamba is set up.

After cloning, run npm install at the root level to install all the dependencies.

Glossary

Optimization

  • Codeflash Optimizer - The overarching technology that performs code optimization.
  • Function to Optimize - The target function that we want to optimize.
  • Optimization Candidate - Generated code that we think might be an optimization of the function to optimize.
  • Helper function - A function that is called by, and lies on the code path of, the function to optimize.
  • Read-Write Context - The part of the code context provided to the LLM that it can modify. Also known as Code To Optimize.
  • Read-Only Context - The part of the context that is only provided as info to the LLM. It is not expected to be modified.

Test generation

  • Verification - The system that verifies whether the optimization candidate has the same functional behavior as the function to optimize.
  • Existing Tests - All the existing tests that are present in a repo.
  • Generated Test - The tests that we create for the user using the LLM.
  • Tracer - Our technology that collects and dumps the input arguments and other info for a Python executable.
  • Replay test - This test reruns all the inputs for a function to optimize that were collected by the tracer.
  • Inspired Regression tests - Newly generated tests that were "inspired" by existing tests. These are new test cases that the LLM generates after learning how the code works from the existing test cases and the function to optimize.
  • Comparator - Our function that compares any two Python objects and returns True if they are equal and False if they are not equal.
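The idea behind the Comparator can be sketched in a few lines. This is a hypothetical illustration, not the actual CodeFlash implementation: the strict type check, the NaN handling, and the set of container types handled are all assumptions made for the example.

```python
import math


def comparator(a, b) -> bool:
    """Sketch of a recursive equality check for two Python objects."""
    # Assumption: objects of different types are never considered equal.
    if type(a) is not type(b):
        return False
    # Assumption: two NaN floats count as equal, unlike with plain ==.
    if isinstance(a, float):
        return (math.isnan(a) and math.isnan(b)) or a == b
    # Sequences: same length and pairwise-equal elements.
    if isinstance(a, (list, tuple)):
        return len(a) == len(b) and all(comparator(x, y) for x, y in zip(a, b))
    # Mappings: same keys and recursively equal values.
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(comparator(a[k], b[k]) for k in a)
    # Everything else falls back to the object's own equality.
    return bool(a == b)
```

In verification, a check of this kind would be applied to the return values of the function to optimize and of an optimization candidate for the same inputs.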

Infra and Systems

  • CF API - The JavaScript web service that currently serves the GitHub App.
  • AI Service - The Python Django service that serves the AI endpoints.
  • Webapp - The React web application built with Next.js. Users can generate an API key and perform similar tasks here.
  • PostHog - Our third-party event tracking and product analytics tool.
  • Sentry - Our code crash telemetry service that helps us understand how codeflash fails.