Mirror of https://github.com/codeflash-ai/codeflash-internal.git (synced 2026-05-04 18:25:18 +00:00)
The optimization adds an early-exit check in `calculate_llm_cost` that returns zero immediately when all rate fields (`input_cost`, `cached_input_cost`, `output_cost`) are zero, before extracting token counts via `getattr` calls. Line profiling confirms the hot path: the original spent 70.7% of function time (580 ms) in the final return statement's arithmetic, yet 99.3% of calls (949/956) were for zero-cost models, where the token extraction was wasted work. The optimized version short-circuits these cases in 1.9 ms total, cutting `calculate_llm_cost` from 821 ms to 29 ms (a 96.5% reduction). This cascades to `LLMClient.call`, where cost calculation dropped from 50.5% to 4.3% of method time, yielding an 80% throughput gain (6,165 → 11,097 ops/sec) despite a 37% regression in the concurrency ratio, caused by spending proportionally more time in non-yielding sync code once the async bottleneck was eliminated.
CodeFlash MonoRepo
Here are the projects that are part of the CodeFlash MonoRepo:
- CodeFlash Client - /cli/
- CodeFlash Python Django AI service - /django/aiservice
- CodeFlash NodeJS CF API - /js/cf-api
- CodeFlash Webapp - /js/cf-webapp
Project Setup
Prerequisites
- Node.js and npm: Ensure Node.js is installed and npm is set up, for installing the pre-commit hooks (Lefthook).
- Python and Mamba: Ensure Python is installed and Mamba is set up.
After cloning, run npm install to install all dependencies at the root level.
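The setup steps above, as a sketch. The explicit lefthook install step is an assumption for completeness; npm install alone may already register the hooks via an npm lifecycle script.

```shell
# From the repository root, after cloning:
npm install            # install all root-level dependencies (includes Lefthook)
npx lefthook install   # register the Git pre-commit hooks, if not done automatically
```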
Glossary
Optimization
- Codeflash Optimizer - The overarching technology that solves code optimization.
- Function to Optimize - The target function that we want to optimize.
- Optimization Candidate - Generated code that we think might be an optimization of the code to optimize.
- Helper function - A function that is called by, and is under the code path of, the function to optimize.
- Read-Write Context - The part of the code context provided to the LLM that it can modify. Also known as the Code to Optimize.
- Read-Only Context - The part of the context that is provided only as information to the LLM. It is not expected to be modified.
Test generation
- Verification - System to verify if the optimization candidate has the same functional behavior as the function to optimize.
- Existing Tests - All the existing tests that are present in a repo.
- Generated Test - The tests that we create for the user using the LLM.
- Tracer - Our technology that collects and dumps the input arguments and other info for a Python executable.
- Replay test - This test reruns all the inputs for a function to optimize that were collected by the tracer.
- Inspired Regression Tests - Newly generated tests "inspired" by existing tests: new test cases produced by the LLM after it studies the existing test cases and the function to optimize to understand how the code behaves.
- Comparator - Our function that compares any two Python objects and returns True if they are equal, False otherwise.
Infra and Systems
- CF API - The JavaScript web service that currently serves the GitHub App.
- AI Service - The Python Django service that serves the AI endpoints.
- Webapp - The React web application, written in Next.js. Users can generate API keys etc. here.
- PostHog - Our events tracking and product analytics 3rd party tool.
- Sentry - Our crash telemetry service that helps us understand how Codeflash fails.