codeflash

Author	SHA1	Message	Date
Mohamed Ashraf	8d42ed93dd	test(05-02): add concurrency-aware assertion removal tests - 14 new tests in TestConcurrencyPatterns class - synchronized blocks/methods preserved after transformation - volatile field reads, AtomicInteger ops preserved - ConcurrentHashMap, Thread.sleep, wait/notify patterns preserved - ReentrantLock, CountDownLatch patterns preserved - Real-world TokenBucket and CircularBuffer patterns validated - AssertJ assertion on synchronized method call validated - Total: 71 tests (57 existing + 14 new), all passing	2026-02-06 13:27:41 +00:00
misrasaurabh1	e958e4e9f4	optimize for performance	2026-02-06 00:08:36 -08:00
misrasaurabh1	81523f3593	Merge remote-tracking branch 'origin/omni-java' into omni-java	2026-02-05 23:57:26 -08:00
misrasaurabh1	0ff54b5043	better unit test discovery java	2026-02-05 23:57:13 -08:00
mashraf-222	a9f3e80bd5	Merge pull request #1398 from codeflash-ai/fix/java-tracer-routing fix: route Java/JavaScript/TypeScript to Optimizer instead of Python tracer	2026-02-06 01:33:12 +02:00
Mohamed Ashraf	fdb2668f7d	fix: route Java/JavaScript/TypeScript to Optimizer instead of Python tracer Java, JavaScript, and TypeScript files were incorrectly being routed through the Python tracing module when running `codeflash optimize --file <file>`, causing a FileNotFoundError when the tracer attempted to execute CLI args as Python scripts. This fix adds language detection at the start of tracer.py main() function. When a non-Python file is detected (Java, JS, TS), the function: 1. Detects the file language using get_language_support() 2. Parses and processes args properly with process_pyproject_config() 3. Routes directly to optimizer.run_with_args() instead of Python tracing Java and JS/TS use their own test runners (Maven/JUnit, Jest) and should never go through Python tracing. This fix unblocks all Java E2E optimization flows. Issue: Java optimization failed with "FileNotFoundError: '--file'" from tracing_new_process.py:855 Root cause: tracer.py had no language check before invoking Python-specific tracing subprocess Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-05 23:26:16 +00:00
mashraf-222	e88730e6ce	Merge pull request #1341 from codeflash-ai/fix/java-tests-project-rootdir fix: set tests_project_rootdir to tests_root for Java projects (Bug #2)	2026-02-05 18:30:14 +02:00
mashraf-222	73220782c3	Merge pull request #1345 from codeflash-ai/debug/java-test-filter fix: prevent Maven running all tests + fix TestFile type annotation	2026-02-05 18:29:21 +02:00
Kevin Turcios	95cc60397d	Merge branch 'main' into omni-java	2026-02-04 03:22:37 -05:00
Kevin Turcios	1c0d8da090	Merge pull request #1339 from codeflash-ai/coverage-no-files Skip when no gen tests and no existing tests	2026-02-04 00:29:06 -05:00
claude[bot]	daf570b969	style: auto-fix formatting issues	2026-02-04 05:21:45 +00:00
Kevin Turcios	0b13beb9b0	Merge branch 'main' into coverage-no-files	2026-02-04 00:20:39 -05:00
Kevin Turcios	bade48513a	chore: fix ruff lint issues after merges	2026-02-04 00:18:15 -05:00
Kevin Turcios	bbbc7ebe63	Merge #1362 : Speed up ReferenceFinder._find_reexports by 14%	2026-02-04 00:17:55 -05:00
Kevin Turcios	3dedc59cba	Merge #1357 : Speed up PrComment.to_json by 46%	2026-02-04 00:17:51 -05:00
Kevin Turcios	4a850d35fe	Merge #1353 : Speed up extract_imports_for_class by 429%	2026-02-04 00:17:47 -05:00
Kevin Turcios	5a704084db	Merge #1352 : Speed up get_external_base_class_inits by 344%	2026-02-04 00:17:43 -05:00
Kevin Turcios	60bd77675d	Merge #1343 : Speed up _collect_numerical_imports by 159%	2026-02-04 00:17:38 -05:00
Kevin Turcios	2b4af2fd06	revert: undo "a or b" pattern changes Reverts the automatic conversion from "a if a else b" to "a or b" pattern that was applied by ruff. The FURB110 rule remains disabled to prevent future automatic conversions.	2026-02-04 00:08:48 -05:00
Kevin Turcios	dfe073a592	Merge pull request #1370 from codeflash-ai/claude-workflow-perms feat: secure Claude workflow and add merge permissions	2026-02-03 23:57:44 -05:00
claude[bot]	5cb780a890	refactor: revert "or" pattern in files that didn't originally use it Reverted the "x or y" pattern back to "x if x else y" in 4 files that didn't originally use the "or" pattern, maintaining consistency with their original coding style. Files reverted: - codeflash/code_utils/codeflash_wrap_decorator.py - codeflash/github/PrComment.py - codeflash/result/explanation.py - codeflash/verification/codeflash_capture.py The other 20 files already used the "or" pattern and were kept as-is. Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>	2026-02-04 04:55:21 +00:00
Kevin Turcios	cb9248e022	feat: add merge/close permissions and secure workflow - Add git merge/fetch/checkout/branch to allowed tools - Add gh pr merge/close to allowed tools - Add allowed_bots for claude[bot] to trigger pr-review - Restrict @claude mentions to maintainers only (OWNER/MEMBER/COLLABORATOR) - Block fork PRs from triggering pr-review and claude-mention	2026-02-03 23:54:43 -05:00
Kevin Turcios	e0ec03b3c8	Merge branch 'main' into coverage-no-files	2026-02-03 23:42:21 -05:00
Kevin Turcios	dd0cca94d3	Merge pull request #1369 from codeflash-ai/test-claude-perms fix: skip pr-review when triggered by claude bot	2026-02-03 23:42:12 -05:00
Kevin Turcios	02b9bcb872	Merge branch 'main' into test-claude-perms	2026-02-03 23:41:59 -05:00
Kevin Turcios	831d296052	fix: skip pr-review when triggered by claude bot	2026-02-03 23:40:45 -05:00
claude[bot]	c0e8a98ca5	chore: disable FURB110 lint rule that enforces 'or' pattern The codebase prefers explicit 'a if a else b' over 'a or b' pattern. Disabled FURB110 (if-exp-instead-of-or-operator) rule to prevent automatic conversion by the linter. Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>	2026-02-04 04:39:20 +00:00
Kevin Turcios	07e577f751	Merge branch 'main' into coverage-no-files	2026-02-03 23:23:49 -05:00
Kevin Turcios	7f5b49fd49	Merge pull request #1367 from codeflash-ai/test-claude-perms agentic Claude for us	2026-02-03 23:21:07 -05:00
Kevin Turcios	8c40b3b099	Merge branch 'main' into test-claude-perms	2026-02-03 23:07:43 -05:00
Kevin Turcios	e1069ea7be	chore: update lockfile	2026-02-03 23:02:49 -05:00
Kevin Turcios	d5ec877a78	feat: add coverage analysis to PR review workflow - Run tests with coverage on changed files - Compare coverage between PR and main branch - New files require ≥75% test coverage - Modified files must have changed lines covered - Flag coverage regressions in PR comment	2026-02-03 22:57:56 -05:00
Saurabh Misra	610e63c168	Merge pull request #1348 from codeflash-ai/feature/java-verbose-logging [feat] Java verbose logging	2026-02-03 19:51:42 -08:00
Kevin Turcios	6289c5325a	feat: improve Claude PR review workflow - Consolidate claude-code-review.yml into claude.yml with two jobs - Add auto-fix for safe linting issues (formatting, imports) before review - Use --from-ref origin/main to only check changed files - Add smart re-review logic that resolves fixed comments - Add inline comment support via MCP tool with 5-7 comment limit	2026-02-03 22:51:32 -05:00
Saurabh Misra	99b8b8e5f0	Merge branch 'omni-java' into feature/java-verbose-logging	2026-02-03 19:51:24 -08:00
Saurabh Misra	dbb26dfee8	Merge pull request #1337 from codeflash-ai/fix/java-test-timeout-issue fix: increase Java test timeout from 15s to 120s	2026-02-03 19:50:25 -08:00
codeflash-ai[bot]	0b055ccc53	Optimize ReferenceFinder._find_reexports The optimization achieves a 14% runtime improvement (702μs → 614μs) by adding an inexpensive pre-check before performing expensive tree-sitter parsing operations. Key optimization: The code now checks if the `export_name` exists anywhere in the `source_code` string before calling `analyzer.find_exports()`. This simple substring check (`if export_name not in source_code`) acts as a fast filter to skip files that definitely don't contain re-exports of the target function. Why this is faster: 1. Avoids expensive parsing: The line profiler shows `analyzer.find_exports()` consumes 45.4% of original runtime (1.37ms of 3.01ms total). The optimized version reduces this to 39.8% of a smaller total (1.02ms of 2.56ms), with 6 out of 25 calls completely avoided. 2. String containment is O(n) with highly optimized C implementation in Python, while tree-sitter parsing involves building and traversing an AST, making it orders of magnitude more expensive. 3. Cascading savings: When the pre-check fails, we also skip `source_code.splitlines()` (3.2% of original runtime) and all subsequent loop iterations. Impact: The profiler shows that in the test dataset, 6 out of 25 files (24%) don't contain the export name and can short-circuit immediately. For codebases with many files that import/re-export from various sources, this ratio could be even higher, making the optimization particularly valuable when searching across large projects. Trade-offs: This is a purely beneficial optimization with no downsides - the string check has negligible overhead compared to tree parsing, and it only returns early when the result would have been an empty list anyway.	2026-02-04 01:44:45 +00:00
claude[bot]	9e81b2be46	style: apply linting and formatting fixes - Fixed 89 linting issues (imports, type annotations, code style) - Formatted 22 files with ruff - Updated auto-generated version.py Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>	2026-02-04 01:33:31 +00:00
claude[bot]	2ad731d3d6	style: fix linting and formatting issues in function_optimizer.py - Fix quote formatting (15 instances) - Remove unused import - Prefix unused concolic_tests variable with underscore - Apply code formatting Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>	2026-02-04 01:29:34 +00:00
codeflash-ai[bot]	0459bd340e	Optimize PrComment.to_json This optimization achieves a 45% runtime improvement (729μs → 500μs) through strategic performance enhancements focused on eliminating repeated dictionary creation and reducing loop overhead. ## Key Optimizations ### 1. Module-level Dictionary Caching (`TestType.to_name()`) The original code reconstructed a 5-element dictionary on every call to `to_name()`. The optimized version uses a module-level `_TO_NAME_MAP` constant that's created once at import time. Line profiler data shows this change reduced `to_name()` execution time from 1.27ms to 184μs (~85% faster), as the dictionary construction overhead (610μs, 48% of function time) is eliminated entirely. ### 2. Dict Comprehension Initialization (`get_test_pass_fail_report_by_type()`) Replaced a loop that iteratively built the report dictionary with a single dict comprehension: `{tt: {"passed": 0, "failed": 0} for tt in TestType}`. This reduces the loop overhead from iterating over all `TestType` enum members and performing multiple dictionary assignments to a single comprehension operation, cutting function time from 403μs to 376μs. ### 3. Early Continue Pattern (`get_test_pass_fail_report_by_type()`) Changed nested if-else logic to use early `continue` when `loop_index != 1`, reducing indentation and eliminating redundant condition checks for test results that don't meet the filter criteria. ### 4. Filtered Report Table Construction (`PrComment.to_json()`) Instead of using a dict comprehension with a filter inside, the code now builds `report_table` with an explicit loop that checks `if name:` before insertion. This avoids creating intermediate tuples for the comprehension and provides clearer filtering logic. The profiler shows `to_json()` improved from 5.23ms to 3.36ms (~36% faster). ## Test Case Performance The annotated tests demonstrate consistent improvements across all scenarios: - Simple cases: 74-97% faster (27.4μs → 15.7μs, 19.9μs → 10.4μs) - Large data cases: 81-92% faster (maintaining performance even with 100+ benchmark details) - Edge cases: 9-15% faster (even extreme runtime values benefit) The optimizations are particularly effective for the common use case where `to_name()` is called multiple times during report generation, and `get_test_pass_fail_report_by_type()` initializes its data structures. Since these functions are used in PR comment generation, the speedup directly improves CI/CD feedback loop performance.	2026-02-04 01:28:15 +00:00
HeshamHM28	7a7bf329cf	refactor: use DEBUG_MODE from console.py for verbose logging - Remove duplicate is_verbose_mode() function - Import and reuse existing DEBUG_MODE from console.py - Update all verbose logging functions to use DEBUG_MODE consistently - Make language parameter required in log_instrumented_test Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 03:24:14 +02:00
Kevin Turcios	575c760cd9	Merge pull request #1333 from codeflash-ai/test-claude-perms test Claude perms	2026-02-03 20:19:52 -05:00
codeflash-ai[bot]	c889a8f75d	Optimize extract_imports_for_class The optimized code achieves a 428% runtime speedup (2.33ms → 441μs) by replacing the expensive `ast.walk(class_node)` traversal with direct iteration over `class_node.body`. ## Key Optimization Original approach: Used `ast.walk(class_node)` which recursively visits every node in the AST subtree, including all nested function definitions, their arguments, return types, and deeply nested expression nodes. For a typical class with methods, this traverses ~2500 nodes. Optimized approach: Iterates only `class_node.body`, which contains just the direct children of the class (typically 200-400 nodes for the same class). This is sufficient because: - Type annotations for fields are in `class_node.body` as `ast.AnnAssign` nodes - Field assignments with `field()` calls are in `class_node.body` as `ast.Assign` nodes - Base classes and decorators are already extracted separately before the loop The line profiler confirms this: the original's `ast.walk()` loop consumed 66% of total runtime (12.76ms out of 19.3ms), while the optimized version's direct iteration takes only 2.3% (112μs out of 4.96ms). ## Additional Refinement The optimized code also improves the `field()` detection by changing from checking `ast.Call` nodes anywhere in the tree to specifically checking `ast.Assign` nodes where the value is a `Call` with a `Name` func. This more accurately targets dataclass field assignments and uses `elif` to avoid redundant checks. ## Test Case Performance The optimization excels across all test categories: - Simple classes (2-3 fields): 186-436% faster - Complex annotations (nested generics): 335-591% faster - Large-scale tests (50+ fields, 200 imports): 495-949% faster The performance gain scales with class complexity because larger classes have more nested nodes that `ast.walk()` unnecessarily traverses, while the optimized version still only iterates the direct body elements. ## Impact on Workloads Based on function_references, `extract_imports_for_class` is called from: 1. Test suite replay tests - indicating it's in a performance-critical testing path 2. `get_code_optimization_context` - suggesting it's used during code analysis/optimization workflows Since the function extracts context for optimization decisions, the 428% speedup directly reduces latency in code analysis pipelines, making the optimization particularly valuable for CI/CD systems or developer tooling that analyzes many classes.	2026-02-04 01:09:50 +00:00
codeflash-ai[bot]	3ee339f075	Optimize get_external_base_class_inits This optimization achieves a 343% speedup (88.7ms → 20.0ms) by eliminating redundant expensive operations through strategic caching and deduplication. ## Key Optimizations 1. Deduplication of External Base Classes - Changed from list to set (`external_bases_set`) to automatically deduplicate base class entries - Prevents processing the same (base_name, module_name) pair multiple times - Removed the need for the `extracted` tracking set and subsequent membership checks 2. Module Project Check Caching - Added `is_project_cache` to memoize `_is_project_module()` results per module - This is critical because the profiler shows `_is_project_module()` consumed 79% of original runtime (265ms out of 336ms) - Each call involves expensive `importlib.util.find_spec()` and `path_belongs_to_site_packages()` operations - In the optimized version, this drops to just 16.7% (11.9ms) since most modules are checked only once 3. Module Import Caching - Added `imported_module_cache` to avoid repeated `importlib.import_module()` calls - When multiple classes inherit from the same base, the module is imported only once - Reduces import overhead from 4.84ms to 2.12ms in the line profiler ## Performance Impact by Test Case The optimization particularly excels when: - Multiple classes inherit from the same base: `test_multiple_classes_same_base_extracted_once` shows 565% speedup (19.0ms → 2.86ms) - Large codebases with many classes: `test_large_single_code_string` (500 classes) shows 1113% speedup (45.7ms → 3.77ms) - Many different external bases: `test_many_classes_single_external_base` (100 classes) shows 949% speedup (9.19ms → 875μs) These improvements directly benefit production workloads since `function_references` shows this function is called from `get_code_optimization_context`, which is part of the code analysis pipeline. When analyzing projects with extensive class hierarchies that inherit from external libraries (like web frameworks, ORMs, or data processing libraries), the optimization prevents redundant module introspection and imports, making the code context extraction phase significantly faster.	2026-02-04 01:03:50 +00:00
Kevin Turcios	9f4776eb2e	chore: migrate from pre-commit to prek Replace pre-commit with prek (faster Rust-based alternative) for linting. - Add prek to dev dependencies - Replace pre-commit workflow with prek workflow using setup-uv@v6 - Update Claude workflow allowed tools to use prek	2026-02-03 19:56:58 -05:00
Mohamed Ashraf	aa718c88f6	chore: remove documentation markdown files from PR	2026-02-04 00:48:05 +00:00
Mohamed Ashraf	3e8dfb8061	fix: prevent Maven running all tests + fix TestFile type annotation Bug #3: Maven Runs All Tests Instead of Specific Tests - Added validation in _run_maven_tests() to raise ValueError when test filter is empty - Added detailed error logging in _build_test_filter() to track why tests are skipped - Added warnings when TestFile objects have None paths - Prevents silent failure where Maven runs ALL tests instead of target tests Bug #4: Incorrect Type Annotation in TestFile Model - Fixed benchmarking_file_path: Path = None -> Optional[Path] = None - Original annotation caused Pydantic validation errors when path was None - This was preventing proper testing and validation of None paths Changes: - codeflash/languages/java/test_runner.py: Added validation and logging - codeflash/models/models.py: Fixed type annotation - codeflash/discovery/discover_unit_tests.py: Added Bug #2 fix (tests_project_rootdir) - tests/test_java_test_filter_validation.py: 4 comprehensive test cases Tests: - test_build_test_filter_with_none_benchmarking_paths: Verifies None paths handled correctly - test_build_test_filter_with_valid_paths: Verifies valid paths work - test_run_maven_tests_raises_on_empty_filter: Verifies validation catches empty filter - test_run_maven_tests_succeeds_with_valid_filter: Verifies normal case works All 4 tests passing ✓ Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-04 00:48:05 +00:00
Mohamed Ashraf	a23d0ca7d1	fix: set tests_project_rootdir to tests_root for Java Applying Bug #2 fix to this branch for testing. Java needs tests_project_rootdir set to actual test directory (src/test/java) instead of project root for test file resolution.	2026-02-04 00:48:05 +00:00
HeshamHM28	2c48e9c9a9	feat: Add verbose logging for existing instrumented tests	2026-02-04 02:36:28 +02:00
aseembits93	92c54b7a6f	move earlier to avoid more work	2026-02-03 16:34:07 -08:00

6098 commits