The get_or_create defaults passed test lists without positional
indexing. When a higher test_index created the row first, its content
landed at index 0 and was then overwritten by the lower-index update,
losing a test.
The LLM prompt preprocessing now highlights __init__ signatures for
regular classes, not just @dataclass ones, reducing brute-force
constructor guessing and pytest.skip() fallbacks in generated tests.
log_features() appended test results in call-completion order, causing
model attribution swaps when LLM responses arrived out of order. Pass
test_index through and use positional insertion instead of append.
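
A minimal sketch of positional insertion versus append, assuming results are held in a plain list. The helper name `place_at_index` and the `None` filler are hypothetical and are not taken from the actual `log_features()` code:

```python
def place_at_index(items, index, value, filler=None):
    """Return a copy of items with value stored at index, padding if needed."""
    # Pad so that `index` is a valid position even when later indices arrive first.
    padded = list(items) + [filler] * max(0, index + 1 - len(items))
    padded[index] = value
    return padded


# Responses arrive out of order: test_index 1 completes before test_index 0.
tests = []
tests = place_at_index(tests, 1, "test_b")
tests = place_at_index(tests, 0, "test_a")
```

With append, the same arrival order would have stored `test_b` first, which is the kind of swap described above.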
## Summary
- Restructure CLAUDE.md hierarchy so Claude Code auto-discovers
project-specific instructions
- Delete dead `AGENTS.md` files (referenced non-existent
`.tessl/RULES.md`)
- Rename `django/aiservice/AGENTS.md` → `CLAUDE.md` for auto-discovery
- Create `js/CLAUDE.md` with package commands and gotchas
- Move PR review guidelines to `.claude/rules/pr-review.md` (auto-loaded
rule)
- Move prek workflow to `.claude/skills/fix-prek.md` (on-demand skill)
- Add path-scoped rules for Python and Next.js patterns
- Add domain glossary, service architecture diagram, and per-package
gotchas
## Test plan
- Verify `CLAUDE.md` files exist at root, `django/aiservice/`, and `js/`
- Verify no remaining references to `AGENTS.md` or `.tessl/`
- Verify `.claude/rules/` and `.claude/skills/` files are committed
## Summary
- Extract testgen and optimizer API routers from
`core/languages/python/` into `core/shared/` with lazy imports,
eliminating cross-module coupling between language modules
- Delete stale JavaScript prompt files left in the Python module after
migration to `js_ts/`
- Remove backward-compat fallback paths for prompt files that already
exist at their new locations
- Remove unused `is_multi_context_any()` and its cross-language imports
- Remove unused `BEGIN_PATCH`/`END_PATCH` constants and stale TODO
## Test plan
- [ ] Verify testgen endpoint dispatches correctly for Python, JS/TS,
and Java
- [ ] Verify optimizer endpoint dispatches correctly for all languages
- [ ] Run existing testgen and optimizer tests
## Summary
- Pass test_index through LLM call context so observability chat can
attribute responses to specific test generation calls
- Fix SSE streaming to send keepalive pings from the start
CF-504
# Pull Request Checklist
## Description
- [ ] **Breaking Changes**: Document any breaking changes (if
applicable)
- [ ] **Description of PR**: Clear and concise description of what this
PR accomplishes
- [ ] **Related Issues**: Link to any related issues or tickets
## Testing
- [ ] **Test cases Attached**: All relevant test cases have been
added/updated
- [ ] **Manual Testing**: Manual testing completed for the changes
## Monitoring & Debugging
- [ ] **Logging in place**: Appropriate logging has been added for
debugging user issues
- [ ] **Sentry will be able to catch errors**: Error handling ensures
Sentry can capture and report errors
- [ ] **Avoid Dev based/Prisma logging**: No development-only or
Prisma-specific logging in production code
## Configuration
- [ ] **Env variables newly added**: Any new environment variables are
documented in .env.example file or mentioned in description
---
## Additional Notes
<!-- Add any additional context, screenshots, or notes for reviewers
here -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: HeshamHM28 <HeshamMohamedFathy@outlook.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-39-200.ec2.internal>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Kevin Turcios <turcioskevinr@gmail.com>
Co-authored-by: Kevin Turcios <106575910+KRRT7@users.noreply.github.com>
## Problem
The JS/TS language handler (`core/languages/js_ts/`) was importing
models, schemas, config, prompts, and helpers directly from the Python
language handler. This created a confusing architectural dependency and
risked serving wrong language-specific prompt content.
## What Changed
- Created `core/shared/` for genuinely language-agnostic code (optimizer
schemas, models, config, testgen models, context helpers)
- Moved JS/TS-specific prompts and context helpers into
`core/languages/js_ts/`
- Updated all consumers (20+ files) to import from the correct locations
- Removed backwards-compat re-exports from the Python module
## Result
- **Before:** 11 imports from `core.languages.python` in
`core/languages/js_ts/`
- **After:** 0
## Summary
- Chat panel on the observability timeline that uses Claude to answer
questions about optimization traces
- Tool-based context retrieval (fetches candidates, tests, errors on
demand instead of stuffing everything upfront)
- Uses `@anthropic-ai/sdk` via Azure AI Foundry
- Strengthened testgen prompts to ban mocks/fakes for test inputs
Store qualified function name (e.g., HttpInterface.__init__) and
file_path in testgen metadata instead of bare function_name (__init__).
Update the frontend parser to handle qualified names by splitting into
class + method and searching within the correct class using both
tree-sitter and regex. Prioritize the file matching filePath before
searching all files.
…ndencies
The old rule ("NOT in libraries such as numpy, pandas etc.") forced LLMs
to reinvent helpers like np.allclose using slow or inaccurate Python
loops. The new rule allows assertions from packages already imported by
the function under test.
## Summary
- Reorganizes `django/aiservice/` from feature-first layout (separate
`optimizer/`, `testgen/`, `code_repair/` dirs) to language-first layout
under `core/languages/{python,js_ts}/`
- Adds handler/registry/dispatcher pattern for routing requests to
language-specific implementations
- All existing module code preserved via `git mv` for history tracking;
no logic changes to existing modules
## What changed
- New `core/` app with registry, dispatcher, protocols, and error
hierarchy
- `PythonHandler` and `JSTypeScriptHandler` delegate to existing module
functions
- All imports updated across the codebase (views, tests,
adaptive_optimizer, etc.)
- Integration tests for handler registration and dispatch
- 155 files changed, ~880 additions / ~207 deletions (mostly import path
updates and moves)
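
A rough sketch of how a handler registry and dispatcher of this kind can fit together. Every name here (`LanguageHandler`, `register`, `dispatch`, `UnsupportedLanguageError`, `EchoHandler`) is hypothetical and chosen for illustration; the actual `core/` app may be structured differently:

```python
from typing import Protocol


class LanguageHandler(Protocol):
    """Interface each language-specific handler is assumed to satisfy."""

    def generate_tests(self, source: str) -> str: ...


class UnsupportedLanguageError(Exception):
    """Raised when no handler is registered for the requested language."""


_REGISTRY: dict[str, LanguageHandler] = {}


def register(language: str, handler: LanguageHandler) -> None:
    _REGISTRY[language] = handler


def dispatch(language: str, source: str) -> str:
    # Look up the handler and delegate; unknown languages fail loudly.
    try:
        handler = _REGISTRY[language]
    except KeyError:
        raise UnsupportedLanguageError(language) from None
    return handler.generate_tests(source)


class EchoHandler:
    """Stand-in handler used only to exercise the sketch."""

    def generate_tests(self, source: str) -> str:
        return f"# tests for {source}"


register("python", EchoHandler())
result = dispatch("python", "f")
```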
## Test plan
- [ ] `python manage.py check` passes
- [ ] Integration tests in
`tests/integration/test_handler_integration.py` pass
- [ ] Existing test suite passes with updated import paths
- [ ] Ruff and ty clean on all new infrastructure files
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
## Summary
- Rewrite testgen system prompts from constraint-heavy to positive-first
structure with chain-of-thought instructions
- Simplify LLM message structure from `[system, user, user, user]` to
`[system, user]` by absorbing plan_content guidelines into system
prompts
- Observability UI: add search to LLM debug dialog, expand timeline view
- Fix data capture: raw LLM responses, all user messages in prompt
column, nested code fences, empty notes handling
## Test plan
- [ ] Verify testgen produces valid test suites with the new prompt
structure
- [ ] Verify observability timeline displays LLM prompts/responses
correctly
- [ ] Check that search works in the LLM debug dialog
## Summary
- Restructure the refinement system prompt into clear numbered sections
(Preserve Behavior, Minimize Diff, Revert Anti-Patterns, Maintain
Readability) with an explicit 6-step refinement process
- Extract inline prompt strings into separate markdown files
(`refinement_system_prompt.md`, `refinement_user_prompt.md`), matching
the convention used by other optimizer prompts
- Add `AuthenticatedRequest` type hint to `refine()` endpoint and fix
grammar in tool use section
## Test plan
- [ ] Verify refinement endpoint still works end-to-end with a test
optimization candidate
- [ ] Confirm prompt content is loaded correctly from markdown files at
startup
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
## Summary
- Re-adds the enriched observability context from CF-1041 that was
reverted
- Passes `module_path`, `test_module_path`, `helper_function_names`,
`is_async`, and `function_to_optimize` details to `call_llm` in testgen
## Test plan
- [ ] Verify testgen LLM calls include the enriched context
- [ ] Confirm no regressions in test generation flow
Introducing this because of pain points in V1. It is not a complete
rewrite; it is based on V1.
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
1. languages/js_ts/testgen.py:
   - Updated parse_and_validate_js_output to accept a language parameter
   - Uses validate_typescript_syntax when language="typescript", otherwise
     uses validate_javascript_syntax
   - Updated generate_and_validate_js_test_code to accept and pass the
     language parameter
   - Updated the call chain to pass language through to the validation
2. optimizer/context_utils/refiner_context.py:
   - Added import for validate_typescript_syntax
   - Fixed is_valid_refinement method to use correct validator based on
     language
   - Fixed validate_code_syntax in SingleRefinerContext class
   - Fixed validate_code_syntax in MultiRefinerContext class
3. tests/optimizer/test_javascript_validator.py:
   - Added test_typescript_type_assertion_valid_in_ts - verifies
     `as unknown as number` is valid TypeScript
   - Added test_typescript_type_assertion_invalid_in_js - verifies
     `as unknown as number` is INVALID JavaScript (this would have caught
     the original bug)
   - Added test_typescript_generic_valid_in_ts - verifies generics are
     valid TypeScript
   - Added test_typescript_generic_invalid_in_js - verifies generics are
     INVALID JavaScript
Files Already Correct (no changes needed):
- languages/js_ts/optimizer.py - already correctly checks language
- languages/js_ts/optimizer_lp.py - already correctly checks language
- optimizer/optimizer_line_profiler.py - already correctly checks
language
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
The JavaScript test generation prompt contained `{fn}` as part of
example code showing import syntax. However, Python's `.format()`
method interprets this as a placeholder and tries to substitute it,
causing a KeyError.
Fixed by escaping the curly braces as `{{fn}}` so they render as
literal `{fn}` in the final prompt.
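
A small illustration of the failure and the fix using `str.format()`. The template text and the `module_path` placeholder are made up for the example and are not the real prompt:

```python
# Unescaped braces: .format() treats {fn} as a field and cannot find it.
template_broken = "import {fn} from '{module_path}'"
# Doubled braces render as literal { and } in the output.
template_fixed = "import {{fn}} from '{module_path}'"

try:
    template_broken.format(module_path="../utils")
    raised_key_error = False
except KeyError:
    raised_key_error = True

rendered = template_fixed.format(module_path="../utils")
```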
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
LLMs often add .js extensions to TypeScript import paths (e.g.,
`import { func } from '../module.js'`), but TypeScript/Jest module
resolution doesn't require explicit extensions. The added extensions
cause "Cannot find module" errors.
This change adds a `strip_js_extensions()` function that removes
.js/.ts/.tsx/.jsx/.mjs/.mts extensions from relative import paths
in generated tests. The function handles:
- ES module imports: import { x } from '../path.js'
- CommonJS requires: require('../path.js')
- Jest mocks: jest.mock('../path.js'), jest.doMock(), etc.
External package imports (lodash, react, etc.) are preserved.
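
One way such a function could be written with a single regular expression. This is a sketch under assumptions, not the committed implementation: it only covers `from`, `require(` and `jest.<name>(` contexts and only paths starting with `./` or `../`:

```python
import re

_EXT = r"\.(?:js|ts|tsx|jsx|mjs|mts)"
_IMPORT_PATH = re.compile(
    r"""(\bfrom\s+|\brequire\(\s*|\bjest\.\w+\(\s*)"""  # 1: import context
    r"""(['"])"""                                        # 2: opening quote
    r"""(\.{1,2}/[^'"]*?)"""                             # 3: relative path
    + _EXT
    + r"""\2"""                                          # matching close quote
)


def strip_js_extensions(source: str) -> str:
    """Drop the file extension from relative import paths in source."""
    return _IMPORT_PATH.sub(r"\1\2\3\2", source)
```

Bare package names such as `lodash` never match because the path group requires a leading `./` or `../`.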
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add more explicit instructions to prevent LLMs from adding .js/.ts
extensions to import paths. The previous guidance was being ignored
by some models.
- Add dedicated "CRITICAL: IMPORT PATH RULES" section with examples
- Show both WRONG and CORRECT patterns explicitly
- Remind to copy the provided import statement exactly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Python's str.isidentifier() validates Python identifiers, not JavaScript
identifiers. This caused valid JS identifiers like '$handler' to be
rejected (since $ is not valid in Python identifiers).
Changed to use a regex pattern that matches JavaScript identifier rules:
- Can start with letter, underscore, or $
- Can contain letters, digits, underscores, or $
Added tests for $ identifiers to ensure they are correctly handled.
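
A sketch of the difference, assuming an ASCII-only pattern. Real JavaScript also allows many Unicode characters in identifiers, which this simplified regex ignores, and the name `is_js_identifier` is hypothetical:

```python
import re

# Start: letter, underscore, or $. Rest: letters, digits, underscores, or $.
_JS_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_js_identifier(name: str) -> bool:
    """Return True if name matches the simplified JS identifier pattern."""
    return _JS_IDENTIFIER.fullmatch(name) is not None


python_view = "$handler".isidentifier()   # Python rejects the leading $
js_view = is_js_identifier("$handler")    # the regex accepts it
```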
- Add _is_valid_js_identifier() to check for reserved words (module, exports, prototype, etc.)
- Only use class import pattern for single-dot names where class name is valid identifier
- Fall back to module import for:
  - Multiple dots (e.g., Constructor.prototype.method)
  - Reserved words (e.g., module.exports)
- Add comprehensive tests for edge cases
When generating test imports for class methods like `Validator.validateRequest`,
the previous code produced invalid JavaScript:
    const { Validator.validateRequest } = require('../middlewares/Validator');
This is invalid because dots are not allowed in destructuring patterns.
The fix:
- Add _generate_import_statement() function to detect class methods (names with dots)
- For class methods: generate `const ClassName = require('...')`
- For simple functions: keep destructuring `const { funcName } = require('...')`
- Update prompt templates to use {import_statement} placeholder
Includes unit tests for the new import generation logic.
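
The branching described above might look roughly like the following. The function name `build_import_statement`, the reserved-word subset, and the shape of the fallback (`const mod = require(...)`) are assumptions made for illustration; the source does not show what the module-import fallback emits:

```python
import re

_IDENT = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
_RESERVED = {"module", "exports", "prototype"}  # illustrative subset only


def build_import_statement(name: str, module_path: str) -> str:
    """Pick an import form based on how the target name is qualified."""
    parts = name.split(".")
    if len(parts) == 1:
        # Simple function: destructure it from the module.
        return f"const {{ {name} }} = require('{module_path}');"
    class_name = parts[0]
    if (
        len(parts) == 2
        and _IDENT.fullmatch(class_name)
        and class_name not in _RESERVED
    ):
        # Class method: import the class, call the method on it in the test.
        return f"const {class_name} = require('{module_path}');"
    # Multiple dots or reserved words: assumed whole-module fallback.
    return f"const mod = require('{module_path}');"
```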
Add better error diagnostics for TypeScript/JavaScript syntax validation:
- Add line numbers and code snippets to error messages
- Log warnings when markdown parsing finds no code blocks
- Show the actual problematic code in error logs
- Help debug "Invalid syntax" errors by showing exact location
By providing more context in error messages and logs, this helps
diagnose cases where the API rejects code that tree-sitter parses
correctly on the client side.
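
A sketch of an error message carrying a line number and a code snippet. The helper `format_syntax_error` and its output layout are hypothetical and do not reproduce the actual log format:

```python
def format_syntax_error(source: str, line: int, message: str, context: int = 1) -> str:
    """Build a message showing the failing line with surrounding context."""
    lines = source.splitlines()
    start = max(1, line - context)
    end = min(len(lines), line + context)
    snippet = []
    for number in range(start, end + 1):
        marker = ">" if number == line else " "  # flag the offending line
        snippet.append(f"{marker} {number:>4} | {lines[number - 1]}")
    return f"{message} at line {line}\n" + "\n".join(snippet)


report = format_syntax_error("const a = 1;\nconst b = ;\nconst c = 3;", 2, "Invalid syntax")
```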
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
## Summary
- Add `instrumented_perf_test` field to `OptimizationFeatures` model
- Update `log_features` function to accept and store performance
instrumented tests
---------
Co-authored-by: Sarthak Agarwal <sarthak.saga@gmail.com>