Commit graph

1366 commits

Author SHA1 Message Date
Kevin Turcios
c13835963c docs: restructure CLAUDE.md files into modular rules
Slim down CLAUDE.md files and move content into path-scoped
.claude/rules/ files to reduce context bloat.
2026-02-14 19:36:21 -05:00
Kevin Turcios
4c3deeb7b8
Restructure CLAUDE.md files and add path-scoped rules for monorepo (#2417)
## Summary

- Restructure CLAUDE.md hierarchy so Claude Code auto-discovers
project-specific instructions
- Delete dead `AGENTS.md` files (referenced non-existent
`.tessl/RULES.md`)
- Rename `django/aiservice/AGENTS.md` → `CLAUDE.md` for auto-discovery
- Create `js/CLAUDE.md` with package commands and gotchas
- Move PR review guidelines to `.claude/rules/pr-review.md` (auto-loaded
rule)
- Move prek workflow to `.claude/skills/fix-prek.md` (on-demand skill)
- Add path-scoped rules for Python and Next.js patterns
- Add domain glossary, service architecture diagram, and per-package
gotchas

## Test plan

- Verify `CLAUDE.md` files exist at root, `django/aiservice/`, and `js/`
- Verify no remaining references to `AGENTS.md` or `.tessl/`
- Verify `.claude/rules/` and `.claude/skills/` files are committed
2026-02-14 17:13:09 -05:00
Kevin Turcios
e26a8ea486
Reorganize top-level feature modules under core/ (#2416)
## Summary

- Move `log_features/` → `core/log_features/` (Django app with
`managed=False` models, no DB impact)
- Move `ranker/`, `workflow_gen/`, `adaptive_optimizer/` →
`core/languages/python/` (Python-focused API modules)
- Update all imports across the codebase (19 files)

## Test plan

- [x] All 548 tests pass
- [x] No stale top-level imports (`from log_features.`, `from ranker.`,
etc.)
- [x] `log_features` AppConfig preserves `label = "log_features"` for
Django app registry compatibility
2026-02-14 17:07:40 -05:00
Kevin Turcios
6caf7469c6
Decouple language modules and remove stale cross-module code (#2415)
## Summary

- Extract testgen and optimizer API routers from
`core/languages/python/` into `core/shared/` with lazy imports,
eliminating cross-module coupling between language modules
- Delete stale JavaScript prompt files left in the Python module after
migration to `js_ts/`
- Remove backward-compat fallback paths for prompt files that already
exist at their new locations
- Remove unused `is_multi_context_any()` and its cross-language imports
- Remove unused `BEGIN_PATCH`/`END_PATCH` constants and stale TODO

## Test plan

- [ ] Verify testgen endpoint dispatches correctly for Python, JS/TS,
and Java
- [ ] Verify optimizer endpoint dispatches correctly for all languages
- [ ] Run existing testgen and optimizer tests
2026-02-14 00:09:44 -05:00
Kevin Turcios
2614393793
Add test_index to LLM call context for observability chat (#2414)
## Summary

- Pass test_index through LLM call context so observability chat can
attribute responses to specific test generation calls
- Fix SSE streaming to send keepalive pings from the start

CF-504
2026-02-13 23:49:20 -05:00
Sarthak Agarwal
c721723971
remove demo test loops (#2412) 2026-02-14 00:43:09 +05:30
Saurabh Misra
198c0c1a4e
codeflash-omni-java (#2335)
# Pull Request Checklist

## Description
- [ ] **Breaking Changes**: Document any breaking changes (if
applicable)
- [ ] **Description of PR**: Clear and concise description of what this
PR accomplishes
- [ ] **Related Issues**: Link to any related issues or tickets

## Testing
- [ ] **Test cases Attached**: All relevant test cases have been
added/updated
- [ ] **Manual Testing**: Manual testing completed for the changes

## Monitoring & Debugging
- [ ] **Logging in place**: Appropriate logging has been added for
debugging user issues
- [ ] **Sentry will be able to catch errors**: Error handling ensures
Sentry can capture and report errors
- [ ] **Avoid Dev based/Prisma logging**: No development-only or
Prisma-specific logging in production code

## Configuration
- [ ] **Env variables newly added**: Any new environment variables are
documented in .env.example file or mentioned in description
---

## Additional Notes
<!-- Add any additional context, screenshots, or notes for reviewers
here -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: HeshamHM28 <HeshamMohamedFathy@outlook.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-39-200.ec2.internal>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Kevin Turcios <turcioskevinr@gmail.com>
Co-authored-by: Kevin Turcios <106575910+KRRT7@users.noreply.github.com>
2026-02-13 23:26:55 +05:30
Kevin Turcios
ad26be10b8
Fix JS/TS cross-imports from Python module (#2396)
## Problem

The JS/TS language handler (`core/languages/js_ts/`) was importing
models, schemas, config, prompts, and helpers directly from the Python
language handler. This created a confusing architectural dependency and
risked serving wrong language-specific prompt content.

## What Changed

- Created `core/shared/` for genuinely language-agnostic code (optimizer
schemas, models, config, testgen models, context helpers)
- Moved JS/TS-specific prompts and context helpers into
`core/languages/js_ts/`
- Updated all consumers (20+ files) to import from the correct locations
- Removed backwards-compat re-exports from the Python module

## Result

- **Before:** 11 imports from `core.languages.python` in
`core/languages/js_ts/`
- **After:** 0
2026-02-12 22:34:38 -05:00
Kevin Turcios
0df421eccb
Add chat interface to observability timeline (#2395)
## Summary
- Chat panel on the observability timeline that uses Claude to answer
questions about optimization traces
- Tool-based context retrieval (fetches candidates, tests, errors on
demand instead of stuffing everything upfront)
- Uses `@anthropic-ai/sdk` via Azure AI Foundry
- Strengthened testgen prompts to ban mocks/fakes for test inputs
2026-02-12 20:45:33 -05:00
Kevin Turcios
e28642cf22
Fix FTO display showing wrong function for methods with common names (#2391)
Store qualified function name (e.g., HttpInterface.__init__) and
file_path in testgen metadata instead of bare function_name (__init__).
Update the frontend parser to handle qualified names by splitting into
class + method and searching within the correct class using both
tree-sitter and regex. Prioritize the file matching filePath before
searching all files.

# Pull Request Checklist

## Description
- [ ] **Description of PR**: Clear and concise description of what this
PR accomplishes
- [ ] **Breaking Changes**: Document any breaking changes (if
applicable)
- [ ] **Related Issues**: Link to any related issues or tickets

## Testing
- [ ] **Test cases Attached**: All relevant test cases have been
added/updated
- [ ] **Manual Testing**: Manual testing completed for the changes

## Monitoring & Debugging
- [ ] **Logging in place**: Appropriate logging has been added for
debugging user issues
- [ ] **Sentry will be able to catch errors**: Error handling ensures
Sentry can capture and report errors
- [ ] **Avoid Dev based/Prisma logging**: No development-only or
Prisma-specific logging in production code

## Configuration
- [ ] **Env variables newly added**: Any new environment variables are
documented in .env.example file or mentioned in description
---

## Additional Notes
<!-- Add any additional context, screenshots, or notes for reviewers
here -->
2026-02-12 00:30:33 -05:00
Kevin Turcios
db973a0487
fix: relax testgen assertion rule to allow imports from function depe… (#2388)
…ndencies

The old rule ("NOT in libraries such as numpy, pandas etc.") forced LLMs
to reinvent helpers like np.allclose using slow / inaccurate Python
loops. The new rule allows assertions from packages already imported by
the function under test.

# Pull Request Checklist

## Description
- [ ] **Description of PR**: Clear and concise description of what this
PR accomplishes
- [ ] **Breaking Changes**: Document any breaking changes (if
applicable)
- [ ] **Related Issues**: Link to any related issues or tickets

## Testing
- [ ] **Test cases Attached**: All relevant test cases have been
added/updated
- [ ] **Manual Testing**: Manual testing completed for the changes

## Monitoring & Debugging
- [ ] **Logging in place**: Appropriate logging has been added for
debugging user issues
- [ ] **Sentry will be able to catch errors**: Error handling ensures
Sentry can capture and report errors
- [ ] **Avoid Dev based/Prisma logging**: No development-only or
Prisma-specific logging in production code

## Configuration
- [ ] **Env variables newly added**: Any new environment variables are
documented in .env.example file or mentioned in description
---

## Additional Notes
<!-- Add any additional context, screenshots, or notes for reviewers
here -->
2026-02-09 15:05:19 -05:00
Kevin Turcios
629442cc5e
Restructure aiservice to language-first architecture (#2383)
## Summary
- Reorganizes `django/aiservice/` from feature-first layout (separate
`optimizer/`, `testgen/`, `code_repair/` dirs) to language-first layout
under `core/languages/{python,js_ts}/`
- Adds handler/registry/dispatcher pattern for routing requests to
language-specific implementations
- All existing module code preserved via `git mv` for history tracking;
no logic changes to existing modules

## What changed
- New `core/` app with registry, dispatcher, protocols, and error
hierarchy
- `PythonHandler` and `JSTypeScriptHandler` delegate to existing module
functions
- All imports updated across the codebase (views, tests,
adaptive_optimizer, etc.)
- Integration tests for handler registration and dispatch
- 155 files changed, ~880 additions / ~207 deletions (mostly import path
updates and moves)

## Test plan
- [ ] `python manage.py check` passes
- [ ] Integration tests in
`tests/integration/test_handler_integration.py` pass
- [ ] Existing test suite passes with updated import paths
- [ ] Ruff and ty clean on all new infrastructure files

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-02-09 09:15:50 -05:00
Kevin Turcios
b9d318279c
feat: observability improvements and testgen prompt modernization (#2382)
## Summary
- Rewrite testgen system prompts from constraint-heavy to positive-first
structure with chain-of-thought instructions
- Simplify LLM message structure from `[system, user, user, user]` to
`[system, user]` by absorbing plan_content guidelines into system
prompts
- Observability UI: add search to LLM debug dialog, expand timeline view
- Fix data capture: raw LLM responses, all user messages in prompt
column, nested code fences, empty notes handling

## Test plan
- [ ] Verify testgen produces valid test suites with the new prompt
structure
- [ ] Verify observability timeline displays LLM prompts/responses
correctly
- [ ] Check that search works in the LLM debug dialog
2026-02-09 01:20:59 -05:00
Kevin Turcios
752e2504e4
Restructure and improve refinement prompt (#2379)
## Summary
- Restructure the refinement system prompt into clear numbered sections
(Preserve Behavior, Minimize Diff, Revert Anti-Patterns, Maintain
Readability) with an explicit 6-step refinement process
- Extract inline prompt strings into separate markdown files
(`refinement_system_prompt.md`, `refinement_user_prompt.md`), matching
the convention used by other optimizer prompts
- Add `AuthenticatedRequest` type hint to `refine()` endpoint and fix
grammar in tool use section

## Test plan
- [ ] Verify refinement endpoint still works end-to-end with a test
optimization candidate
- [ ] Confirm prompt content is loaded correctly from markdown files at
startup

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-02-08 02:10:20 -05:00
Kevin Turcios
47053591f4
observability v2 toggle (#2378) 2026-02-07 15:50:12 -05:00
Kevin Turcios
f03a06f4e1
Reintroduce enriched obs_context for testgen LLM calls (#2377)
## Summary
- Re-adds the enriched observability context from CF-1041 that was
reverted
- Passes `module_path`, `test_module_path`, `helper_function_names`,
`is_async`, and `function_to_optimize` details to `call_llm` in testgen

## Test plan
- [ ] Verify testgen LLM calls include the enriched context
- [ ] Confirm no regressions in test generation flow
2026-02-07 10:33:13 -05:00
Sarthak Agarwal
98fb2d1579
Revert "CF-1041 observability v2 " need more changes and testing (#2375)
Reverts codeflash-ai/codeflash-internal#2329
2026-02-06 01:18:17 +05:30
Kevin Turcios
07d33edd9f
CF-1041 observability v2 (#2329)
introducing this due to pain points in V1, not a complete rewrite, based
off v1

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
2026-02-05 14:08:02 -05:00
Sarthak Agarwal
08fd1a8787
adding validation for ts in refiner and testgen (#2372)
1. languages/js_ts/testgen.py:
- Updated parse_and_validate_js_output to accept a language parameter
- Uses validate_typescript_syntax when language="typescript", otherwise
uses validate_javascript_syntax
- Updated generate_and_validate_js_test_code to accept and pass the
language parameter
- Updated the call chain to pass language through to the validation
2. optimizer/context_utils/refiner_context.py:
- Added import for validate_typescript_syntax
- Fixed is_valid_refinement method to use correct validator based on
language
- Fixed validate_code_syntax in SingleRefinerContext class
- Fixed validate_code_syntax in MultiRefinerContext class
3. tests/optimizer/test_javascript_validator.py:
- Added test_typescript_type_assertion_valid_in_ts - verifies as unknown
as number is valid TypeScript
- Added test_typescript_type_assertion_invalid_in_js - verifies as
unknown as number is INVALID JavaScript (this would have caught the
original bug)
- Added test_typescript_generic_valid_in_ts - verifies generics are
valid TypeScript
- Added test_typescript_generic_invalid_in_js - verifies generics are
INVALID JavaScript
Files Already Correct (no changes needed):
- languages/js_ts/optimizer.py - already correctly checks language
- languages/js_ts/optimizer_lp.py - already correctly checks language
- optimizer/optimizer_line_profiler.py - already correctly checks
language

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-02-04 22:54:44 +00:00
Sarthak Agarwal
eb8ad603ff
vitest related changes to prompt (#2366) 2026-02-03 03:29:36 +05:30
Sarthak Agarwal
b48a8d9a43
Add vitest support in backend (#2363) 2026-02-02 20:51:52 +05:30
Sarthak Agarwal
cbfebf8ee4 fix(js-testgen): escape curly braces in prompt template
The JavaScript test generation prompt contained `{fn}` as part of
example code showing import syntax. However, Python's `.format()`
method interprets this as a placeholder and tries to substitute it,
causing a KeyError.

Fixed by escaping the curly braces as `{{fn}}` so they render as
literal `{fn}` in the final prompt.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 03:50:05 +05:30
Saurabh Misra
70360436bd fix: strip file extensions from JS/TS import paths in generated tests
LLMs often add .js extensions to TypeScript import paths (e.g.,
`import { func } from '../module.js'`), but TypeScript/Jest module
resolution doesn't require explicit extensions. This causes
"Cannot find module" errors.

This change adds `strip_js_extensions()` function that removes
.js/.ts/.tsx/.jsx/.mjs/.mts extensions from relative import paths
in generated tests. The function handles:
- ES module imports: import { x } from '../path.js'
- CommonJS requires: require('../path.js')
- Jest mocks: jest.mock('../path.js'), jest.doMock(), etc.

External package imports (lodash, react, etc.) are preserved.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 04:22:07 +00:00
Saurabh Misra
b801254d13 fix: strengthen import path extension guidance in prompts
Add more explicit instructions to prevent LLMs from adding .js/.ts
extensions to import paths. The previous guidance was being ignored
by some models.

- Add dedicated "CRITICAL: IMPORT PATH RULES" section with examples
- Show both WRONG and CORRECT patterns explicitly
- Remind to copy the provided import statement exactly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 02:35:21 +00:00
Saurabh Misra
d59c48426e fix: merge prompt extension fixes and LLM client improvements
- Cherry-pick: Remove .js extension guidance from prompts (from fix/js-import-extension-prompt)
- Add get_llm_client() to create fresh clients per request (fixes event loop issues)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 02:32:58 +00:00
Saurabh Misra
8461f71668 fix: use JavaScript identifier regex instead of Python isidentifier()
Python's str.isidentifier() validates Python identifiers, not JavaScript
identifiers. This caused valid JS identifiers like '$handler' to be
rejected (since $ is not valid in Python identifiers).

Changed to use a regex pattern that matches JavaScript identifier rules:
- Can start with letter, underscore, or $
- Can contain letters, digits, underscores, or $

Added tests for $ identifiers to ensure they are correctly handled.
2026-01-31 02:21:02 +00:00
Kevin Turcios
a394db3382 formatting 2026-01-30 20:00:11 -05:00
Saurabh Misra
addbaad370
Merge branch 'main' into fix/class-method-import-syntax 2026-01-30 16:36:03 -08:00
Saurabh Misra
09e6a1710f Address review: add validation for edge cases in import generation
- Add _is_valid_js_identifier() to check for reserved words (module, exports, prototype, etc.)
- Only use class import pattern for single-dot names where class name is valid identifier
- Fall back to module import for:
  - Multiple dots (e.g., Constructor.prototype.method)
  - Reserved words (e.g., module.exports)
- Add comprehensive tests for edge cases
2026-01-31 00:35:10 +00:00
Saurabh Misra
b2fb58eba6 Fix invalid JavaScript import syntax for class methods
When generating test imports for class methods like `Validator.validateRequest`,
the previous code produced invalid JavaScript:
  const { Validator.validateRequest } = require('../middlewares/Validator');

This is invalid because dots are not allowed in destructuring patterns.

The fix:
- Add _generate_import_statement() function to detect class methods (names with dots)
- For class methods: generate `const ClassName = require('...')`
- For simple functions: keep destructuring `const { funcName } = require('...')`
- Update prompt templates to use {import_statement} placeholder

Includes unit tests for the new import generation logic.
2026-01-31 00:35:10 +00:00
Saurabh Misra
289827e5cb
Merge pull request #2337 from codeflash-ai/fix/improve-typescript-validation-error-messages
fix: improve TypeScript/JavaScript validation error messages
2026-01-30 16:03:02 -08:00
Saurabh Misra
d255a29203
Update django/aiservice/aiservice/validators/javascript_validator.py
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
2026-01-30 16:00:05 -08:00
Saurabh Misra
8800614d1c Add unit tests for TypeScript/JavaScript validator error reporting
Tests for:
- Error location reporting with line numbers and code snippets
- Markdown code block parsing with various scenarios
- Multiple code blocks with mixed valid/invalid content
- Real-world TypeScript patterns (async, try-catch, template literals)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 23:53:26 +00:00
Saurabh Misra
07ae9db684 fix: improve TypeScript/JavaScript validation error messages
Add better error diagnostics for TypeScript/JavaScript syntax validation:

- Add line numbers and code snippets to error messages
- Log warnings when markdown parsing finds no code blocks
- Show the actual problematic code in error logs
- Help debug "Invalid syntax" errors by showing exact location

This helps diagnose issues where the API rejects code that tree-sitter
parses correctly on the client side by providing more context in the
error messages and logs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 23:47:39 +00:00
aseembits93
af2935f4f2 0-index finally 2026-01-30 14:42:28 -08:00
Kevin Turcios
c1a25b33e5
Merge branch 'main' into ranker-multidim-scoring 2026-01-30 22:16:08 +00:00
ali
99a7a32b32
safer caching 2026-01-30 19:52:18 +02:00
ali
879aa93967
fix validating js/ts code with markdown syntax 2026-01-30 19:44:31 +02:00
Kevin Turcios
0f5d578d37
Merge branch 'main' into ranker-multidim-scoring 2026-01-29 22:45:18 +00:00
HeshamHM28
c24f350719
Fix Prevent log code for paid org in the optimization feature "AI service " (#2325)
Fixes Cf-1038
2026-01-29 19:28:30 +00:00
Kevin Turcios
04197195e8
Store instrumented performance tests in feature logging (#2330)
## Summary
- Add `instrumented_perf_test` field to `OptimizationFeatures` model
- Update `log_features` function to accept and store performance
instrumented tests

---------

Co-authored-by: Sarthak Agarwal <sarthak.saga@gmail.com>
2026-01-29 03:09:47 -05:00
aseembits93
f1b6fbf737 adding back the instructions 2026-01-28 16:13:28 -08:00
aseembits93
7386dd20b5 1-indexed ranking everywhere 2026-01-28 16:02:24 -08:00
aseembits93
71d397753d Merge remote-tracking branch 'origin/main' into ranker-multidim-scoring 2026-01-28 15:36:30 -08:00
aseembits93
215e6ad390 fixed merge issues 2026-01-28 15:33:11 -08:00
ali
c19d9f4450
fix unit tests 2026-01-28 23:20:48 +02:00
ali
f0480fac39
use treesitter for validating js & ts code syntax 2026-01-28 23:15:30 +02:00
ali
db3f269b37
linting and formatting 2026-01-28 22:41:27 +02:00
ali
ec97ebd4e7
more cleanup 2026-01-28 22:23:54 +02:00
ali
31091350c9
cleanup 2026-01-28 22:19:40 +02:00