codeflash-internal

Author	SHA1	Message	Date
claude[bot]	ae7110491c	fix: add type ignore for Django ORM field type mismatch Update type hints for `add_months_safe` and `get_next_subscription_period` to accept both datetime.datetime and datetime.date, and add ty:ignore comment for Django ORM field type that ty cannot infer correctly. Co-authored-by: Aseem Saxena <aseembits93@users.noreply.github.com>	2026-02-24 10:37:33 +00:00
aseembits93	7f824ce101	fix: eliminate redundant DB queries in middleware and unblock LLM responses Auth now attaches fetched organization/subscription to the request so TrackUsageMiddleware reuses them instead of re-querying. RateLimitMiddleware caches restricted_paths at init and uses async cache methods. LLM call recording is fire-and-forget via asyncio.create_task to avoid blocking responses on DB writes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 20:43:18 +05:30
aseembits93	d4867ef18e	refactor: make line profiler JIT handling consistent with regular optimizer Move JIT instructions appending from the per-call level (optimize_python_code_line_profiler_single) to the endpoint level (optimize endpoint), matching the regular optimizer's pattern. This removes the is_numerical_code parameter threading through the call chain. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 19:54:03 +05:30
aseembits93	0b523fc367	fix: enforce direct JIT decorator in optimizer prompt for numerical code When is_numerical_code is true, the LLM sometimes outputs conditional fallback paths (try/except, if/else) instead of applying the JIT decorator directly. Add explicit output format instructions to prevent this behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 19:49:24 +05:30
Kevin Turcios	033d14ea87	Merge branch 'main' into testgen-jit-iter	2026-02-23 08:56:11 +00:00
Kevin Turcios	f14ff077a6	Merge branch 'main' into reduce-recompilations	2026-02-23 08:55:29 +00:00
Kevin Turcios	05aecd6fbd	Merge pull request #2437 from codeflash-ai/misc-changes fix: improve ranker scoring consistency and local-caching bias	2026-02-23 08:55:18 +00:00
Kevin Turcios	40ff909b03	fix: add DATABASE_URL and DJANGO_SETTINGS_MODULE to pr-review workflow Coverage analysis in the Claude pr-review job needs these env vars to run pytest, matching how django-unit-tests and codeflash-aiservice workflows configure them.	2026-02-23 03:43:33 -05:00
claude[bot]	bf4e38c301	fix: add cast to satisfy ty type checker for list covariance The ty type checker correctly flags that list[str] is not a subtype of list[str \| None] due to list invariance. Added explicit cast. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-23 08:42:24 +00:00
Kevin Turcios	16e043883a	style: auto-format ranker and test_markdown_utils	2026-02-23 03:39:38 -05:00
Kevin Turcios	85a1c8b183	fix: derive ranker ranking from structured scores instead of LLM array The JSON parsing path returned the LLM's explicit ranking array, which sometimes contradicted its own per-dimension scores. Use _scores_to_ranking() to compute the ranking from weighted scores when available, falling back to the LLM ranking only when scores are absent.	2026-02-23 03:37:42 -05:00
Kevin Turcios	20ee6d5b62	fix: penalize local variable caching of globals in ranker prompt The ranker LLM was rewarding candidates that cache global variables into locals as a performance win. Add an explicit rule: this is only relevant on Python ≤3.10; on 3.11+ LOAD_GLOBAL uses adaptive specialization and is nearly as fast as LOAD_FAST.	2026-02-23 03:37:21 -05:00
Kevin Turcios	c95a36cf38	fix: handle nested code fences in extract_code_block The non-greedy regex in FIRST_CODE_BLOCK_PATTERN stopped at the first ``` occurrence, even inside triple-quoted strings or nested code fence blocks. This truncated the extracted code and lost test functions when LLMs embedded function definitions using ```python:filepath syntax. Switch to greedy matching and require the closing ``` to be alone on its line so intermediate backticks are skipped.	2026-02-23 03:36:50 -05:00
Kevin Turcios	ca71d0c8a0	refactor: remove constructor notes preprocessing from testgen pipeline Full class source is now included in the client-side testgen context, making the server-side constructor signature extraction redundant.	2026-02-23 03:36:50 -05:00
Kevin Turcios	bfd9f2cd04	fix: respect test_index when creating optimization_features row The get_or_create defaults passed test lists without positional indexing, so when a higher test_index created the row first its content landed at index 0 and was overwritten by the lower index update, losing a test.	2026-02-23 03:36:50 -05:00
Kevin Turcios	6346d0992a	chore: rename repo path env vars to match standard names CODEFLASH_INTERNAL_REPO_PATH → AISERVICE_DIR, CODEFLASH_CLI_REPO_PATH → CODEFLASH_DIR	2026-02-23 03:36:50 -05:00
Kevin Turcios	af3185edff	fix: handle non-numeric patch suffixes and support Python 3.15	2026-02-23 03:36:50 -05:00
Sarthak Agarwal	2cb3d51ddb	fix issue with closed and merged PRs raising suggestion (#2436 )	2026-02-21 01:23:55 +05:30
Aseem Saxena	852274e2be	Merge branch 'main' into reduce-recompilations	2026-02-21 00:59:24 +05:30
aseembits93	85c5a2ec82	reduce rcompilations in the tests	2026-02-21 00:57:52 +05:30
Aseem Saxena	8f6d1d0602	fix: improve JIT testgen prompt to avoid error-checking tests Add explicit guidance to avoid generating tests that check for specific exception types, since JIT compilers (numba, torch.compile) produce different error types than uncompiled code. This ensures generated tests work consistently for both compiled and uncompiled versions. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-20 18:59:04 +00:00
Aseem Saxena	5553b01bc1	Merge branch 'main' into testgen-jit-iter	2026-02-21 00:06:44 +05:30
claude[bot]	4fa972edd3	refactor: remove unused TORCH_TENSOR_FUNCTIONS constant Co-authored-by: Aseem Saxena <aseembits93@users.noreply.github.com> Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-20 18:33:41 +00:00
Sarthak Agarwal	eb5f4b460e	Migrate to AWS bedrock (#2430 ) AWS_ACCESS_KEY_ID= AWS_SECRET_ACCESS_KEY= AWS_REGION=us-east-1 Will require these for boto3 authentication	2026-02-20 23:52:48 +05:30
claude[bot]	46da033b05	style: fix ruff formatting and add mypy type annotation	2026-02-20 18:09:05 +00:00
Aseem Saxena	7e1b2a3ade	investigate	2026-02-20 18:03:28 +00:00
Kevin Turcios	7005156190	Merge pull request #2427 from codeflash-ai/class-constructor-notes feat: add constructor notes for non-dataclass classes	2026-02-19 01:46:55 +00:00
Kevin Turcios	b5af1ca353	Merge branch 'main' into class-constructor-notes	2026-02-19 01:46:45 +00:00
Aseem Saxena	e336a91c93	update model id	2026-02-17 07:08:27 -08:00
aseembits93	730c01d047	feat: switch Claude workflows from Foundry to AWS Bedrock Replace Anthropic Foundry authentication with AWS Bedrock OIDC in both claude.yml and duplicate-code-detector.yml workflows. Changes: - Replace use_foundry with use_bedrock - Add aws-actions/configure-aws-credentials@v4 OIDC step - Remove ANTHROPIC_FOUNDRY_API_KEY/BASE_URL env vars - Update model identifiers to Bedrock format Requires AWS_ROLE_TO_ASSUME secret to be configured in the repo. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 19:01:51 +05:30
Kevin Turcios	a69f67f68f	Merge pull request #2428 from codeflash-ai/fix-windows-skill-filenames fix: rename skill files to be Windows-compatible	2026-02-17 05:06:29 +00:00
Kevin Turcios	a21eb7aba2	fix: rename skill files to be Windows-compatible Renamed skill files from using colons to dashes (e.g., tessl:add-api-endpoint → tessl-add-api-endpoint) to fix checkout issues on Windows filesystems which don't allow colons in filenames. Skills will continue to work as the files contain relative paths to .tessl directory and don't reference their own filenames.	2026-02-17 05:01:30 +00:00
Sarthak Agarwal	e22e5d1f8b	Add codeflash optimization workflow for cf-api and cf-webapp (#2411 ) Co-authored-by: Kevin Turcios <106575910+KRRT7@users.noreply.github.com>	2026-02-16 19:48:15 +05:30
claude[bot]	1bb1407c6b	fix: resolve type checker errors	2026-02-15 12:33:05 +00:00
Kevin Turcios	d6a3c6254f	feat: add constructor notes for non-dataclass classes with __init__ The LLM prompt preprocessing now highlights __init__ signatures for regular classes, not just @dataclass ones, reducing brute-force constructor guessing and pytest.skip() fallbacks in generated tests.	2026-02-15 07:29:05 -05:00
Kevin Turcios	38eda0c2d6	Merge pull request #2426 from codeflash-ai/observability-chat-codebase-browsing feat: observability chat codebase browsing and model attribution fix	2026-02-15 06:50:59 -05:00
Kevin Turcios	e5d70443db	fix: use positional insertion in log_features to preserve model attribution log_features() appended test results in call-completion order, causing model attribution swaps when LLM responses arrived out of order. Pass test_index through and use positional insertion instead of append.	2026-02-15 03:58:05 -05:00
Kevin Turcios	496033539e	fix: use flex column layout and SSR-safe localStorage for split pane Replace sticky positioning + ResizeObserver height calc with a flex column layout (h-screen container, flex-1 panel group) that reliably fills the viewport. Drop useDefaultLayout hook (not SSR-safe) in favor of manual localStorage persistence inside useEffect.	2026-02-15 03:58:04 -05:00
Kevin Turcios	bf826a6e0f	feat: add resizable split pane and expandable tool results to observability chat Replace the fixed 480px chat overlay with a draggable split-pane layout using react-resizable-panels, and make tool rounds expandable to show the actual data the agent retrieved (code, errors, LLM call details).	2026-02-15 03:58:04 -05:00
Kevin Turcios	a3f9c655f9	fix: guarantee text response when agent loop produces only thinking blocks Remove MAX_TOOL_ROUNDS cap so the model decides when to stop calling tools. Add a safety net that makes a final tool-free API call if the loop ends without emitting any visible text, fixing empty assistant bubbles. Clean up redundant comments.	2026-02-15 03:58:04 -05:00
Kevin Turcios	870968e7a7	refactor: restructure system prompt for Claude Opus 4.6 best practices - Move trace data to top of prompt (long-context best practice: data before instructions improves quality ~30%) - Wrap sections in XML tags (<trace_data>, <role>, <domain_knowledge>, <guidelines>, <use_parallel_tool_calls>) for better parseability - Remove aggressive language (MUST, CRITICAL, HARD REQUIREMENT) that causes overtriggering on Opus 4.6 - Replace rigid 4-step investigation workflow with general guidelines to let adaptive thinking handle reasoning strategy - Remove duplicate content (tool reference section, two checklists) - Add <use_parallel_tool_calls> block per Anthropic's recommended pattern - Tone down tool descriptions from directive to descriptive - Net reduction: 49 fewer lines in system prompt	2026-02-15 03:58:04 -05:00
Kevin Turcios	ae2bff113d	fix: prevent context blowup by redacting thinking blocks between rounds Thinking blocks from previous tool rounds (10-50KB each) were accumulating in conversation history, causing Azure AI Foundry to hang after 4+ rounds. Redact thinking content before each API call while preserving required block structure. Also adds per-round timeout safety net and status indicators between rounds.	2026-02-15 03:58:04 -05:00
Kevin Turcios	fbcc283e97	perf: unify agent loop and pre-build lookup maps for O(1) tool calls Eliminate redundant API call by extracting text from the loop's final response directly instead of making a separate streaming call. Pre-build candidatesBySource, candidatesById, and testModelMap in indexTraceData() to replace repeated O(n) linear searches in tool calls and prompt building. Combine cost/token aggregation into a single pass.	2026-02-15 03:58:04 -05:00
Kevin Turcios	b09262ccbc	feat: add tool activity display and fix streaming timeout in observability chat Restructure agent loop to use stream()+finalMessage() for all API calls, fixing the SDK's non-streaming timeout error with max_tokens 32k. Add parallel tool execution, tool activity bubbles in the frontend, and restructure the system prompt for better investigation behavior.	2026-02-15 03:58:04 -05:00
Kevin Turcios	51372ca0ad	feat: add debugging workflow and response checklist to observability chat prompt Guide the chat agent to use the new tools proactively: a DEBUGGING TOOLS section with structured guidance for get_llm_call_detail and codebase browsing, a 4-step workflow (OBSERVE → INVESTIGATE → LOCATE → RECOMMEND), and a RESPONSE CHECKLIST at the end of the prompt requiring the agent to cite real file paths before responding.	2026-02-15 03:58:04 -05:00
Kevin Turcios	782ee508de	feat: add codebase browsing and LLM call inspection to observability chat Give the observability chat agent four new tools: get_llm_call_detail (full prompt/response for any LLM call), read_file, search_code, and list_directory for navigating the codeflash-internal and codeflash CLI repos. This lets the agent trace problems end-to-end from trace data through actual prompts to pipeline source code. - Add id to IndexedTraceData.llmCalls so the agent can reference calls - Make resolveToolCall async (Prisma + fs + child_process) - Make processToolUseResponse async to match - Bump MAX_TOOL_ROUNDS from 5 to 15 for multi-step code browsing - Add CODEFLASH_INTERNAL_REPO_PATH / CODEFLASH_CLI_REPO_PATH env vars - Path traversal protection, file size caps, search result limits	2026-02-15 03:58:04 -05:00
Kevin Turcios	eecd3ba4ce	Merge pull request #2425 from codeflash-ai/tessl-json-update chore: update tessl config and add npm tiles	2026-02-15 03:57:09 -05:00
Kevin Turcios	6933fe07ac	chore: add npm tessl tiles from tessl install	2026-02-15 03:56:05 -05:00
Kevin Turcios	9ab71ad672	chore: add .next/ to gitignore	2026-02-15 03:55:12 -05:00
Kevin Turcios	b6dc71421a	chore: update tessl.json with npm tile entries	2026-02-15 03:54:13 -05:00

... 3 4 5 6 7 ...

6455 commits