codeflash-internal

mirror of https://github.com/codeflash-ai/codeflash-internal.git synced 2026-05-04 18:25:18 +00:00

Author	SHA1	Message	Date
Kevin Turcios	12c6113f7e	Update context_helpers.py	2026-03-22 03:56:26 -05:00
Kevin Turcios	387c909c9e	fix codeflash optimizing python backend (#2483 ) Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-22 03:50:30 -05:00
Kevin Turcios	28c9acc877	refactor: aiservice deep dive — LLM client, dedup, async, cleanup (#2482 ) ## Summary Comprehensive refactoring of the aiservice Django backend focusing on code quality, deduplication, and correctness: - LLM client extraction: Extract `LLMClient` class with lazy client init, centralized error handling, and event loop detection - Centralize retry logic: `@stamina.retry` on `call_anthropic`/`call_openai` for transient errors (rate limits, timeouts, 500s), removing scattered retry decorators from testgen files - Deduplicate helpers: Consolidate `extract_code_and_explanation` into shared `context_helpers.py`, unify `normalize__code` into `normalize_c_style_code` - Eliminate double DB queries: Auth middleware `afirst()` then `aupdate()` by PK, middleware caches org/subscription - Parallelize Java optimizer: Use `asyncio.TaskGroup` for independent LLM calls - Lazy logging: Convert all f-string logging to lazy `%s` formatting across 11 files - Cleanup: Remove unused `PipelineError`/`ValidationError`, fix `seach_and_replace.py` typo, replace `print()` with `logging.debug()` in middleware - Sentry*: Reduce sampling 1.0 → 0.1/0.01, fix auth `settings.DEBUG` check, sanitize ranker errors ## Test plan - [x] All existing pytest tests pass (`uv run pytest`) - [x] Ruff lint/format clean - [x] No behavioral changes — pure refactoring --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-22 01:53:32 -05:00
Aseem Saxena	c5e8b56c6f	Merge pull request #2317 from codeflash-ai/codeflash/optimize-checkForValidAPIKey-mkwv868t ⚡️ Speed up function `checkForValidAPIKey` by 30%	2026-03-18 12:13:42 -07:00
Aseem Saxena	d44ca16d27	Merge branch 'main' into codeflash/optimize-checkForValidAPIKey-mkwv868t	2026-03-18 12:10:27 -07:00
Aseem Saxena	1dde1f0e16	Merge pull request #2323 from codeflash-ai/add/close_pr_end_point new endpoint for close pr	2026-03-18 12:09:13 -07:00
Aseem Saxena	960401e2d4	Merge branch 'main' into add/close_pr_end_point	2026-03-18 12:08:35 -07:00
HeshamHM28	8f74cf42e2	Fix Unauthorized check for CLI login page (#2480 ) # Pull Request Checklist ## Description - [ ] Description of PR: Clear and concise description of what this PR accomplishes - [ ] Breaking Changes: Document any breaking changes (if applicable) - [ ] Related Issues: Link to any related issues or tickets ## Testing - [ ] Test cases Attached: All relevant test cases have been added/updated - [ ] Manual Testing: Manual testing completed for the changes ## Monitoring & Debugging - [ ] Logging in place: Appropriate logging has been added for debugging user issues - [ ] Sentry will be able to catch errors: Error handling ensures Sentry can capture and report errors - [ ] Avoid Dev based/Prisma logging: No development-only or Prisma-specific logging in production code ## Configuration - [ ] Env variables newly added: Any new environment variables are documented in .env.example file or mentioned in description --- ## Additional Notes <!-- Add any additional context, screenshots, or notes for reviewers here -->	2026-03-17 16:37:18 -07:00
Sarthak Agarwal	8f41556b01	fix to mobile view sidebar and login msg (#2481 ) # Pull Request Checklist ## Description - [ ] Description of PR: Clear and concise description of what this PR accomplishes - [ ] Breaking Changes: Document any breaking changes (if applicable) - [ ] Related Issues: Link to any related issues or tickets ## Testing - [ ] Test cases Attached: All relevant test cases have been added/updated - [ ] Manual Testing: Manual testing completed for the changes ## Monitoring & Debugging - [ ] Logging in place: Appropriate logging has been added for debugging user issues - [ ] Sentry will be able to catch errors: Error handling ensures Sentry can capture and report errors - [ ] Avoid Dev based/Prisma logging: No development-only or Prisma-specific logging in production code ## Configuration - [ ] Env variables newly added: Any new environment variables are documented in .env.example file or mentioned in description --- ## Additional Notes <!-- Add any additional context, screenshots, or notes for reviewers here -->	2026-03-18 04:57:23 +05:30
HeshamHM28	ea23cf06a6	fix: skip Python validation for Java/JS in optimize-line-profiler endpoint (#2478 ) ## Summary - Fix `/optimize-line-profiler` endpoint rejecting Java/JS/TS requests with `"Invalid Python version"` error by moving `parse_python_version()` and Python syntax validation inside `if is_python:` block - Fix code extraction regex in Java and JS/TS line profiler optimizers to handle LLM responses with ```` ```java:FileName.java ```` format (optional `:filename` suffix) --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: HeshamHM28 <HeshamHM28@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-10 18:31:08 -07:00
Sarthak Agarwal	7deb16819e	[Fix] Suppress slack for codeflash employees (#2466 ) Co-authored-by: Aseem Saxena <aseem.bits@gmail.com> Co-authored-by: Kevin Turcios <106575910+KRRT7@users.noreply.github.com>	2026-03-08 02:54:32 +05:30
Kevin Turcios	d74da02e57	fix: return display-version tests as generated_tests in testgen response (#2477 ) ## Summary - The `/testgen` response's `generated_tests` field contained the assert-removed version with `codeflash_output` assignments - When the CLI's testgen review fell back to this field (instead of `raw_generated_tests`), the review LLM flagged every test as a "no-op assignment" - Now returns the display version (asserts kept, no instrumentation) as `generated_tests`, matching what the repair endpoint already does - Also applies isort to the display source for consistency	2026-03-07 13:21:09 -06:00
Kevin Turcios	42b8eed7b4	fix: greedy code extraction with retry & fix unawaited coroutine in Java optimizer (#2476 ) ## Summary - Use greedy code extraction and retry on syntax errors in testgen repair - Fix broken `asyncio.to_thread(log_features)` in Java optimizer — `log_features` is `@sync_to_async` so calling it via `to_thread` created an unawaited coroutine (`RuntimeWarning: coroutine 'SyncToAsync.__call__' was never awaited`) and silently skipped logging. Replaced with `await log_features(...)` using correct keyword arguments. ## Test plan - [ ] Verify testgen repair handles syntax errors with retry - [ ] Verify Java optimization requests no longer emit `SyncToAsync` RuntimeWarning - [ ] Verify Java optimization features are correctly logged to DB	2026-03-07 03:18:43 -05:00
Kevin Turcios	0ca3a2ab07	fix: use greedy code extraction and retry on syntax errors in repair (#2475 ) ## Summary - Switch `extract_code_block_with_context` (non-greedy `.?`) → `extract_code_block` (greedy `.`) for repair code extraction — the non-greedy regex matched the first closing fence, truncating code when the LLM included explanatory snippets before the full file (root cause of 82% of repair failures) - Add `ast.parse` validation before CST parsing for fast syntax checking - Retry the LLM once with the specific syntax error appended to the conversation when validation fails ## Test plan - [x] Existing tests pass - [ ] Run end-to-end optimization to verify repairs succeed	2026-03-06 06:24:31 -05:00
Kevin Turcios	07edfaa0bd	fix: testgen prompt improvements (#2474 ) ## Summary - Add multiline string literal constraint to testgen and repair prompts — LLM was consistently generating unterminated string literals by splitting strings across lines without triple quotes - Deduplicate anthropic/markdown branches in testgen prompt templates — single flow with inline `{% if is_xml %}` wrappers instead of duplicated content ## Test plan - [x] Verified templates render correctly for both anthropic and openai model types (sync and async) - [x] All block overrides from child templates work with the unified block names	2026-03-06 10:54:50 +00:00
Kevin Turcios	434fb7df77	feat: improve testgen review & repair quality (#2473 ) ## Summary - Pass coverage details (unexecuted lines, threshold) to review and repair prompts so the LLM can identify low-coverage tests - Accept previous repair errors in the repair endpoint and include them in the prompt for retry cycles - Parallelize per-test review LLM calls with `asyncio.TaskGroup` - Conditionally include codeflash env var context (`CODEFLASH_TRACER_DISABLE`, etc.) in repair prompts when the function under test references them ## Test plan - [x] Tested locally with codeflash CLI against `Tracer.__enter__` — review, repair, and retry cycles all work - [x] Coverage details and previous errors appear correctly in prompts - [x] Review parallelization reduces latency from sequential ~60s per test to concurrent	2026-03-06 10:23:55 +00:00
Kevin Turcios	14c0b3acca	fix: handle syntactically invalid LLM output in testgen repair (#2472 ) ## Summary - Catch `ParserSyntaxError` when parsing LLM-repaired code instead of letting it bubble to the generic 500 handler - Reduces Sentry noise from expected LLM failures - The CLI already handles non-200 responses gracefully (returns `None`, continues)	2026-03-06 07:32:30 +00:00
Kevin Turcios	4edd183d82	perf: use Haiku model for testgen repair (#2471 ) ## Summary - Switch testgen repair endpoint from `EXECUTE_MODEL` (GPT-5-Mini) to `HAIKU_MODEL` (Haiku 4.5) - Matches the review endpoint which already uses Haiku - Repair is a structured task (splice functions, fix assertions) that doesn't need a frontier model - Should reduce latency (was timing out at 90s in CI) and cost	2026-03-06 07:10:44 +00:00
Kevin Turcios	8d1dfd9bdb	Merge pull request #2465 from codeflash-ai/testgen-review-repair feat: per-function test review + repair endpoints	2026-03-05 22:37:21 +00:00
claude[bot]	6c7377a71f	fix: resolve duplicate kwargs and missing HttpError import in testgen_repair	2026-03-05 22:14:28 +00:00
Kevin Turcios	641e609bda	Merge branch 'main' into testgen-review-repair	2026-03-05 22:09:37 +00:00
Kevin Turcios	de109c6e12	Update django/aiservice/core/shared/testgen_review/repair.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2026-03-05 17:09:28 -05:00
Kevin Turcios	737a270801	Update django/aiservice/core/shared/testgen_review/repair.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2026-03-05 17:09:17 -05:00
Kevin Turcios	0e9a8a5959	Merge pull request #2469 from codeflash-ai/fix-markdown-code-path-lookup fix: clarify multi-file prompt to identify target file	2026-03-05 13:13:23 +00:00
Kevin Turcios	d9a963d305	fix: resolve contradicting response format instructions in multi-file prompt	2026-03-05 07:59:39 -05:00
Kevin Turcios	9a979439f1	fix: clarify multi-file prompt to identify target file and reduce context noise Tell the LLM the first file is the optimization target and remaining files are context only. Allow omitting unchanged context files from the response.	2026-03-05 05:59:41 -05:00
Kevin Turcios	8106d53e32	Merge remote-tracking branch 'origin/testgen-review-repair' into testgen-review-repair	2026-03-04 14:30:01 -05:00
Kevin Turcios	1532a66278	feat: include coverage info in test review and improve review prompt Accept coverage_summary in the review schema and pass it to the prompt. Add two new review criteria: low coverage detection and constructor/ dependency error patterns. Coverage percentage is shown in the user prompt so the reviewer can flag tests that don't exercise the function.	2026-03-04 14:14:19 -05:00
claude[bot]	f31b428a72	style: auto-fix linting issues	2026-03-04 09:15:27 +00:00
Kevin Turcios	ff35883ce6	Merge remote-tracking branch 'origin/testgen-review-repair' into testgen-review-repair	2026-03-04 04:13:24 -05:00
Kevin Turcios	644ded986f	Merge remote-tracking branch 'origin/main' into testgen-review-repair	2026-03-04 04:10:56 -05:00
Kevin Turcios	c2a67e8137	feat: pass test failure messages to review endpoint for better context Include runtime error messages from behavioral test failures in the review request. Failed function verdicts now include the specific error message. The review prompt shows error details so the AI can see patterns like type validation failures.	2026-03-04 04:09:27 -05:00
Kevin Turcios	fce866c96f	fix: splice only flagged functions from LLM repair into original test source Instead of replacing the entire test file with the LLM's output, parse both the original and repaired sources as CST, extract only the flagged function nodes from the repair output, and surgically replace them in the original. Unflagged functions are preserved exactly as-is.	2026-03-04 03:26:03 -05:00
Kevin Turcios	33be205d88	feat: run postprocessing pipeline on repaired tests before instrumentation Repaired tests from the LLM now go through the same postprocessing pipeline as initial generation (import fixing, loop limiting, unused definition removal) before instrumentation. Returns the display version (with asserts) as generated_tests for client-side display.	2026-03-04 03:20:09 -05:00
claude[bot]	8fe3171934	fix: resolve mypy type errors in generate.py and postprocess_pipeline.py	2026-03-04 08:19:57 +00:00
Kevin Turcios	2899eae4da	feat: return display-ready test source with asserts in testgen response Split postprocessing_testgen_pipeline to capture the test source before assert removal — fully cleaned (imports, loops, definitions) but with original asserts intact. Return it as raw_generated_tests in the TestGenResponseSchema so the CLI can display the human-readable version.	2026-03-04 03:16:30 -05:00
Kevin Turcios	96284e4805	Merge pull request #2467 from codeflash-ai/fix-js-async-testgen-flaky-tests fix: reduce flaky generated tests for JS async functions	2026-03-04 06:37:43 +00:00
Kevin Turcios	40f3236645	refactor: simplify template selection with string composition	2026-03-04 01:13:06 -05:00
Kevin Turcios	c2f9b17969	Merge remote-tracking branch 'origin/main' into fix-js-async-testgen-flaky-tests	2026-03-04 01:09:00 -05:00
Aseem Saxena	56ac044a86	Merge pull request #2364 from codeflash-ai/match-testdiff-schema bug: mismatch in cli and internal schema for code repair	2026-03-04 05:16:49 +05:30
claude[bot]	38ca8824d6	fix: resolve mypy type errors in code_repair_context	2026-03-03 23:29:13 +00:00
Aseem Saxena	16253b3d63	Merge branch 'main' into match-testdiff-schema	2026-03-04 04:56:29 +05:30
Sarthak Agarwal	cc32654b7f	mocha prompts in backend (#2468 ) Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>	2026-03-04 04:09:10 +05:30
HeshamHM28	44fc7dc8e8	feat: Add support for specifying target Java version in test generation (#2445 )	2026-03-03 22:03:29 +00:00
Aseem Saxena	29e91e1c3d	Merge branch 'main' into match-testdiff-schema	2026-03-03 07:28:08 +05:30
Aseem Saxena	94fc60bb13	Merge branch 'main' into fix-js-async-testgen-flaky-tests	2026-03-03 07:27:24 +05:30
Saurabh Misra	e8f1589107	Merge pull request #2429 from codeflash-ai/cf-aws-bedrock-claude-workflows feat: switch Claude workflows from Foundry to AWS Bedrock	2026-03-02 17:48:04 -08:00
aseembits93	76a81b4381	chore: switch CI Claude model to Sonnet 4.6 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 06:47:20 +05:30
aseembits93	26e4936659	keep the non foundry env vars	2026-03-03 06:03:06 +05:30
Aseem Saxena	9e5e61e53d	Apply suggestion	2026-03-02 16:27:35 -08:00

1 2 3 4 5 ...

6350 commits