codeflash-agent

mirror of https://github.com/codeflash-ai/codeflash-agent.git synced 2026-05-04 18:25:19 +00:00

Author	SHA1	Message	Date
Kevin Turcios	a4276d658a	Refine engagement report and case study for executive review - Hero metrics: -89% cost, -52% peak memory, flat scaling, -12.9% latency - Add lightspeed canvas animation via assets/lightspeed.js for Plotly Cloud - Add platform-libs CI/CD migration to timeline (Phase 1b) with PR links - Update next-engagement card with POC branch and PR references - Replace RSS with peak memory in user-facing copy - Add flat memory scaling to case study results table	2026-04-16 17:51:54 -05:00
Kevin Turcios	380bd59503	Add iterative-discovery narrative and missing findings across all reports Weave "optimizations reveal deeper issues" framing into engagement report executive summary, case study, and optimization README. Add O(N²) text extraction fix, per-request RSS creep (24→17 MB), and memray profiling data that were previously undocumented.	2026-04-16 15:02:39 -05:00
Kevin Turcios	3c705d4e2d	Rewrite Unstructured case study for public-facing clarity Apply research-backed case study structure: headline anchoring on biggest numbers, customer-as-hero framing, loss aversion, narrative arc, methodology for developer credibility. Collapse PR inventory to category summary, ~1,100 words in optimal range.	2026-04-16 14:40:05 -05:00
Kevin Turcios	6d05aea09c	Revamp engagement report layout and timeline for executive clarity - Move Infrastructure Cost Impact above hero metrics and tab toggle - Extract shared above-fold content into _above_fold_content() for /jpc parity - Replace plotly Gantt chart with pure-HTML vertical timeline - Fix cross-browser flex layout (explicit flex: 1 1 0%, minWidth: 0) - Remove redundant "The Results" and "How This Was Tested" sections - Rename Engineering Team → Engineering Details - Rename Peak RSS → Peak Memory Usage - Update timeline dates: 1-week buffer after Phase 1, cascade phases - Rename section headers: Vertical Optimization Roadmap, Proposed Next Engagement	2026-04-16 14:31:32 -05:00
Kevin Turcios	aa259b4652	Update uv.lock for security audit app dependencies	2026-04-16 06:19:28 -05:00
Kevin Turcios	3e63326876	Add standalone security audit app for Plotly Cloud deployment Separate deployment at https://19727fbf-a6a0-45ac-968f-680035ab6b3b.plotly.app with its own pyproject.toml, lockfile, and plotly-cloud.toml config.	2026-04-16 06:18:33 -05:00
Kevin Turcios	514c1e28c9	Tailor security report for Lawrence, add UX improvements and talking points - Rewrite executive summary to reference his PR #1465 lockfile fix and existing tooling (Renovate, Anchore, Chainguard) - Reorder findings by category priority (supply chain > container > CI/CD) to lead with what matters most to the audience - Add animated parallelogram background matching codeflash.ai aesthetic - 6 research-backed UX changes: severity icons (WCAG 1.4.1), title-first cards (F-pattern), loss-framed 85% CTA, distinct status colors, card opacity for figure-ground separation - Correct SEC-021 from 67% to 97% mutable Action pins per VM verification (only 2 of 96 SHA-pinned in core-product) - Add talking-points-lawrence.md with profile, pain points, pitch strategy	2026-04-16 06:01:52 -05:00
Kevin Turcios	8c42f27eed	Add 4-tab navigation to security audit report Split the 39-finding wall into tabbed views matching the engagement report pattern: Summary, Critical & High (21), Medium & Low (18), and By Category with both category and repository breakdowns.	2026-04-16 05:05:32 -05:00
Kevin Turcios	3dc58775e3	Consolidate report into 4-tab view and clean up for production - Replace Executive Brief with JPC Summary as default tab (Executive Summary) - Add Timeline as 4th tab; standalone /jpc and /timeline routes preserved - Remove dead code: build_exec_view, make_k8s_chart, unused latency vars - Extract _logo_lockup helper, _TAB_BTN_STYLE constants to reduce duplication - Use app.layout as function, env-configurable debug/port, update docstring	2026-04-16 04:48:16 -05:00
Kevin Turcios	c22c5babd1	Organize screenshots by date and session - 2026-04-15: exec restructure, team view, engagements - 2026-04-16-methodology: methodology notes across all views - 2026-04-16-jpc: standalone JPC summary and route verification - 2026-04-16-timeline: timeline iterations (reordering, date fixes, chart tuning)	2026-04-16 03:49:15 -05:00
Kevin Turcios	c3e7dba47b	Add report screenshots to reports/unstructured/screenshots/	2026-04-16 03:48:14 -05:00
Kevin Turcios	b20c05a799	Add /timeline route with proposed engagement roadmap - Gantt chart with 5 phases: Core-Product (completed), DevEx & CI/CD, Platform API, Security Hardening (concurrent with DevEx), Cost Discovery - Phase detail cards with duration, dates, deliverables, dependencies - DevEx as Phase 2 (POC already done, sets up faster CI for Phase 3) - Security runs concurrent with Phase 2 (uv workspace enables lockfile) - Investment summary with ~5 month total timeline - Fixed x-axis range and removed rangeslider for clean proportional bars	2026-04-16 03:46:50 -05:00
Kevin Turcios	90091ccc12	Add /jpc standalone summary route and methodology notes - Add build_jpc_view() with clean standalone layout at /jpc for JPC (no tabs, no hero — just the document that "stands on its own") - Add URL routing via dcc.Location: / serves full report, /jpc serves summary - Add methodology notes to exec view (How This Was Tested annotations) - Add methodology notes to detail view (7-entry "why" card) - Enrich team view Memory + Standalone vs. Cumulative explanations	2026-04-16 03:07:33 -05:00
Kevin Turcios	2da186d4df	Apply learnings to team + detail views, remove redundancy Team view: - Add Engineering Impact Summary at top (4 metrics: memory, density, latency, idle vCPU) with pointer to sections below - Remove Production Context card (redundant with Impact Summary) - Trim memory table to only metrics not shown in chart (RSS per request, K8s allocation) — chart already shows pre/post/delta - Fix "10-page scan" → "10-page scanned document" in methodology Detail view: - Add intro callout explaining this is the raw data backing the other two views	2026-04-16 02:46:01 -05:00
Kevin Turcios	c1b603afc4	Fix technical terminology in exec brief - "CFS quota" → "1-CPU limit" (CFS is implementation detail, too technical for exec audience) - "jemalloc" → "jemalloc, opt-in for 1-CPU pods" (missed instance) - "requests 1 CPU / 32 GB RAM resource requests" → "per pod" (double "requests" was grammatically broken) - "10-page scan" → "10-page scanned document" (consistent with workload profiles section)	2026-04-16 02:41:17 -05:00
Kevin Turcios	2c3aad4325	Restructure exec view: enablement-first flow for JPC audience Reorder based on persuasion research (Three-Talk Model, Prospect Theory, Kotter): 1. "The Engagement" — collaborative shared context (team talk) 2. "What This Enables" — loss-framed enablement: 9.2x pod density, 41 idle vCPUs now available, -12.9% latency for agentic API 3. "The Results" — before/after proof of execution 4. Infrastructure Cost Impact (anchored on $100K/mo) 5. Workload Profiles + Methodology (credibility) 6. Delivered + Proposed Next Engagements Key shift: lead with what the work unlocks (feature velocity, platform capacity, API speed) rather than the technical achievement (memory reduction). Cost savings is proof of execution, not the headline.	2026-04-16 02:36:29 -05:00
Kevin Turcios	6143c38d78	Move workload profile explanations into Executive Brief The 1p/10p/16p benchmark rationale belongs in the exec view — JPC needs to understand that page count != workload before seeing the numbers. Added "Benchmark Workload Profiles" section before "How This Was Tested" with the three profiles and the data punchline (#1505 at -32.6% on 1 page vs -7.4% on 16 pages).	2026-04-16 02:32:35 -05:00
Kevin Turcios	eeebf6eec2	Add workload profile explanations to latency benchmark table The 1p/10p/16p column headers weren't self-explanatory. Added a "Benchmark Workload Profiles" card above the latency table in the Detail view explaining that each document tests a distinct workload shape (table-dense, scanned, mixed), not just different page counts. Also added annotation below the table calling out that #1505 has 4x the impact on the 1-page doc vs. the 16-page doc — letting the data demonstrate that per-document cost depends on content, not page count.	2026-04-16 02:27:00 -05:00
Kevin Turcios	ddb4cf8258	Update engagement report: reframe for JPC audience, fix technical inaccuracies - Reframe Future Engagements → Proposed Next Engagements based on Crag meeting: lead with Platform API speed/stability, add Infrastructure Cost Discovery ($100K/mo), remove Codeflash product pitch - Add Broader Context callout after cost section (core-product = ~10% of total Azure spend) - Fix Knative terminology throughout: "Knative pods" → "pods with a 1-CPU resource request" (CFS quota, not Knative config) - Fix CPU detection description: three-tier logic (cgroup v2 cpu.max → sched_getaffinity → os.cpu_count, take minimum) - Clarify jemalloc is opt-in (MALLOC_IMPL=jemalloc), 1-CPU serial OCR only; multi-CPU pods should use glibc default due to ~50 MB/process arena overhead	2026-04-16 02:11:27 -05:00
Kevin Turcios	9102d14a00	continue	2026-04-16 02:00:33 -05:00
Kevin Turcios	e65b8a3564	Add security audit report and infrastructure cost analysis Standalone security report (security_report.py) covering 6 supply chain and build pipeline findings from the performance engagement. Add infra cost section to exec view showing $10K → $1.1K/mo projection based on D48s_v5 node packing at 4 GB vs 32 GB per pod.	2026-04-15 18:22:07 -05:00
Kevin Turcios	f8281a24a0	Update engagement_report.py	2026-04-15 13:27:43 -05:00
Kevin Turcios	49a7d586d4	Update engagement_report.py	2026-04-15 13:26:04 -05:00
Kevin Turcios	87a906e704	Update Unstructured engagement report (#25 ) * Update engagement report: add logos, grid theme, scope to core-product - Add Codeflash x Unstructured logo lockup in hero and footer - Apply roadmap grid pattern (48px, 5% opacity) and zinc-900 background - Update cards to rounded-2xl with semi-transparent zinc-900/50 bg - Remove all platform-libs, CI/CD, and security audit sections - Remove stacked optimizations PR #1500 from open PRs - Update data to latest FastAPI endpoint measurements - Filter PR tables to core-product only * Add methodology section to team view, fix DataTable type safety Add benchmark environment, measurement protocol, and production context cards to the top of the Engineering Team view. Split TABLE_STYLE into individually typed constants (TABLE_HEADER, TABLE_CELL, TABLE_DATA, TABLE_DATA_CONDITIONAL, TABLE_WRAP) so DataTable kwargs pass ty and mypy strict checks. * Add engagement report screenshot assets * Add PRs from unstructured, unstructured-inference, unstructured-od-models Expand report scope beyond core-product: 14 new merged PRs and 2 new open PRs across 3 additional repos. Update PR counts (24 merged, 5 in progress), add Repo column to detail view tables, update subtitle and meta description. * Make PR numbers clickable links in detail view tables Use DataTable markdown columns with link_target=_blank so PR numbers link to their GitHub PRs. Add REPO_BASES mapping for per-repo URL resolution. Override default purple link color with blue (#60a5fa) to stay readable on the dark background. * main * Add Future Engagements section with notes panels to exec view Prominent banner heading, four numbered cards (CI/CD, Security, Runtime, Product Integration) each with a right-hand Notes panel for discussion points. Refactored _next_card helper to accept optional notes parameter.	2026-04-15 13:11:28 -05:00
Kevin Turcios	7e00007569	Improve deep optimizer: profiling script + failure modes + dist fix (#24 ) * Exclude dev docs from plugin dist builds README.md, ARCHITECTURE.md, and ROADMAP.md are development docs that shouldn't ship in the assembled plugin distributions. * Improve deep optimizer: fix profiling script, add failure mode awareness Profiling script: Accept source root and command as CLI args instead of hardcoding `src` and requiring manual `# === RUN TARGET HERE ===` edits. The agent now copies the script from references and runs it with the project's actual source root and test command. Failure modes: Wire failure-modes.md into the on-demand reference table and stuck recovery checklist so the agent consults it when workflows break (deadlocks, silent failures, context loss, stale results). * Fix ruff lint errors in unified profiling script Refactor main() into parse_args(), profile_command(), and report_results() to fix C901 (complexity) and PLR0915 (too many statements). Also fix S306 (mktemp → NamedTemporaryFile), PLW1510 (explicit check=False), and add noqa for intentional os.path usage (PTH112) and subprocess with CLI args (S603).	2026-04-15 04:11:52 -05:00
Kevin Turcios	20f6c59f05	Lint and format entire repo, not just packages (#23 ) Remove .codeflash/ from ruff extend-exclude, add per-file ignores for .codeflash/, scripts/, evals/, and plugin/ (benchmark/script patterns like print, eval, magic values). Remove shebangs. Widen pre-commit hooks to check the full repo.	2026-04-15 03:16:15 -05:00
Kevin Turcios	33faedf427	Add Unstructured report, rewrite statusline, format evals/scripts (#20 ) * Add Unstructured engagement report as uv workspace member Three-tier Plotly Dash app (Executive Brief, Engineering Team, Full Detail) with data in JSON, theme constants in theme.py, and Dash production improvements (Google Fonts, clientside callbacks, meta tags). Also: add .playwright-mcp/ to .gitignore, add reports/* ruff overrides, remove tracked .codeflash/observability/read-tracker. * Rewrite statusline to derive context from git state Detects active area from changed files (reports, packages, plugin, .codeflash, case-studies, evals), falls back to branch name convention (perf/, feat/, fix/), shows dirty indicator. Uses whoami for cross-platform user detection. Add pre-push lint rule to commit guidelines * Exclude .codeflash/ from ruff linting Benchmark and profiling scripts in .codeflash/ are scratch work, not package source. Excluding them prevents CI failures from ad-hoc scripts. * Run ruff format across packages, scripts, evals, and plugin refs * Fix github-app async test failures in CI Add asyncio_mode = "auto" to root pytest config so async tests are detected when running from the repo root via uv run pytest packages/.	2026-04-15 03:06:16 -05:00
Kevin Turcios	2caaf6af7c	Fix CI: mypy errors, ruff formatting, switch to prek (#22 ) * Fix mypy errors and apply ruff formatting across packages Fix ast.FunctionDef calls missing type_params for Python 3.12+, correct type: ignore error codes in _comparator and _plugin, and run ruff format on all package source and test files. * Switch CI to prek for lint/typecheck checks Use j178/prek-action for consistent lint+typecheck (ruff check, ruff format, interrogate, mypy) matching local pre-commit config. Keep test as a separate parallel job for test-env support.	2026-04-15 02:52:47 -05:00
Kevin Turcios	a1710f7f92	Adopt shared CI workflow (#21 ) Replace packages-ci.yml and github-app-tests.yml with a single ci.yml that calls the shared ci-python-uv reusable workflow. Lint, typecheck, and test run as parallel jobs. Version check stays local (needs fetch-depth: 0 + PR-only conditional).	2026-04-15 02:36:17 -05:00
Kevin Turcios	7d86202524	Update metaflow README with actual results and PR status (#19 ) Replace placeholder text ("No optimizations applied yet", empty PR table) with: - CAS lz4 compression results (7-18x on realistic ML payloads) - Upstream PR status (Netflix/metaflow#3090, open) - Open questions on dependency management and forward compat - Methodology, remaining targets, and lessons learned	2026-04-14 23:41:55 -05:00
Kevin Turcios	1734199e85	Add metaflow and core-product case studies, rename pypa to python (#18 ) - Rename case-studies/pypa/ → case-studies/python/ to match .codeflash/ convention - Add case-studies/netflix/metaflow/summary.md (7-18x lz4 vs gzip) - Add case-studies/unstructured/core-product/summary.md (14.6% latency, 2.1 GB memory) - Update main README results table with all five case studies	2026-04-14 23:31:49 -05:00
Kevin Turcios	09ba9b44b2	Add typeagent-py case study (#17 ) - Add case-studies/microsoft/typeagent/summary.md with results, lessons learned (failed vector search experiment, maintainer alignment), and takeaways for codeflash - Update upstream PR statuses: #235 merged, #236 closed (rejected), #232 blocked on #230 - Add typeagent to main README results table	2026-04-14 23:25:29 -05:00
Kevin Turcios	6dd3b02168	Restructure typeagent README: separate failed vector search experiment (#16 ) Move vector search benchmarks out of main results into a Lessons Learned section. The 3.7x-14.2x numbers were real but on a non-bottleneck — maintainer confirmed model API calls and SQL dominate real latency. Results section now only shows legitimate wins: import time (1.16x), indexing pipeline (1.14-1.16x), and query batching (2.10-2.62x).	2026-04-14 23:21:53 -05:00
Kevin Turcios	cc29a27289	Migrate .codeflash/ to {teammember}/{org}/{project}/ format (#15 ) Add team member dimension to case study paths so multiple contributors can track optimization data independently. Derives member from git config user.name in session-start hooks. - Move all case studies under .codeflash/krrt7/ - Rename pypa/pip → python/pip (org grouping) - Update session-start hooks, docs, scripts, and references	2026-04-14 23:04:34 -05:00
Kevin Turcios	4a65f17bfb	Set up CODEOWNERS for Go and Java language overlays (#14 )	2026-04-14 19:18:05 -05:00
Kevin Turcios	361bb899e2	Move Go overlay to plugin/languages/go/ (#13 ) * Move Go plugin overlay from languages/go/ to plugin/languages/go/ Aligns Go with the Java/Python/JavaScript convention where all language overlays live under plugin/languages/<lang>/. The Makefile already discovers from plugin/languages/* so Go is now included in builds. * Remove accidental read-tracker changes * Ignore .codeflash/observability/ in gitignore	2026-04-14 19:14:57 -05:00
m-ali-24	044b2f190a	[FEAT] golang agents (#11 ) * go base * missing javascript --------- Co-authored-by: ali <--global>	2026-04-14 18:55:36 -05:00
mashraf-222	270cb56cee	Feat/java language support (#12 ) * Add Java/Kotlin detection to top-level language router Adds pom.xml, build.gradle, build.gradle.kts, settings.gradle, and settings.gradle.kts as markers that route to the codeflash-java router. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add Java/Kotlin agent definitions for all optimization domains 10 agents covering the full optimization pipeline: - codeflash-java: router/team lead for domain detection - codeflash-java-setup: environment detection (build tool, JDK, profiling tools) - codeflash-java-deep: cross-domain optimizer (default) - codeflash-java-cpu: data structures, algorithms, JIT deopt, JMH benchmarks - codeflash-java-memory: heap/GC tuning, escape analysis, leak detection - codeflash-java-async: virtual threads, lock contention, CompletableFuture - codeflash-java-structure: class loading, JPMS, startup time, circular deps - codeflash-java-scan: quick cross-domain diagnosis via JFR/jdeps/GC logs - codeflash-java-ci: GitHub webhook handler for Java PRs - codeflash-java-pr-prep: JMH benchmarks and PR body templates Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add Java domain reference guides for all optimization domains 6 guides covering deep domain knowledge for agent consumption: - data-structures: collection selection, autoboxing, JIT patterns, sorting - memory: JVM heap layout, GC algorithms and tuning, escape analysis, leaks - async: virtual threads, structured concurrency, lock hierarchy, contention - structure: class loading, JPMS, CDS/AppCDS, ServiceLoader, Spring startup - database: JPA N+1, HikariCP, pagination, batch operations, EXPLAIN plans - native: JNI, Panama FFM API, GraalVM native-image, Vector API Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add Java optimization skills: session launcher and JFR profiling - codeflash-optimize: session launcher with start/resume/status/scan/review - jfr-profiling: quick-action JFR profiling in 
cpu/alloc/wall modes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Slim Java agents to match Go's concise ~175-line pattern Move inline code examples, antipattern encyclopedias, JMH templates, and deep-dive sections from agent prompts into reference guides. Agents now contain only: target tables, one-liner antipatterns, reasoning checklists, profiling commands, and keep/discard trees. Line counts (before → after): cpu: 636 → 181 memory: 878 → 193 async: 578 → 165 structure: 532 → 167 deep: 507 → 186 scan: 440 → 163 Average: 595 → 176 (vs Go's 175) Adds to data-structures/guide.md: - Collection contract traps table - Reflection → MethodHandle migration pattern - JMH benchmark template Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix Makefile build: use rsync merge and portable sed -i Two bugs in the build target: 1. cp -R created nested dirs (agents/agents/, references/references/) instead of merging language overlay into shared base. Fix: rsync -a. 2. sed -i '' is macOS-only; fails silently on Linux. Fix: sed -i.bak (works on both macOS and Linux), then delete .bak files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add HANDOFF.md session lifecycle to Java agents Java agents could read HANDOFF.md on resume but never wrote or updated it. A session that hit plateau would lose all context — what was tried, what worked, why it stopped, what to do next. 
Changes: - Deep agent: init HANDOFF.md on fresh start, record after each experiment, write Stop Reason + learnings.md on session end - Domain agents (CPU, memory, async, structure): record to HANDOFF.md after each keep/discard, write session-end state - Handoff template: make language-agnostic (was Python-specific), add Session status, Strategy & Decisions, and Stop Reason fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Close 11 gaps between Java and Python plugins Add missing sections to Java deep agent: experiment loop depth (12 steps), library boundary breaking, Phase 0 environment setup, CI mode, pre-submit review, adversarial review, team orchestration, cross-domain results schema, and structured progress reporting. Add polymorphic dispatch safety to CPU agent and data-structures guide. Add diff hygiene to CPU agent. Add native reference to router. Create two new reference files: library-replacement.md (Guava/Commons/ Jackson/Joda replacement tables) and team-orchestration.md (full dispatch and merge protocol). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-14 18:49:41 -05:00
Kevin Turcios	043bf45415	Ignore .lprof and .prof binary files, update read-tracker	2026-04-14 18:42:38 -05:00
Kevin Turcios	9830b7b4a1	Track .codeflash/ data: unignore observability and add krrt7/odoo case study	2026-04-14 18:40:08 -05:00
Kevin Turcios	3b59d97647	squash	2026-04-13 14:12:17 -05:00
Kevin Turcios	cee3987d7b	cleanup	2026-04-06 05:58:13 -05:00
Kevin Turcios	ebb9658dfd	Merge main-teammate branch	2026-04-03 17:36:50 -05:00
Kevin Turcios	0cda0d907c	fix: align marketplace version with plugin.json and recursive .DS_Store ignore - marketplace.json metadata.version 1.0.0 → 0.1.0 to match plugin.json - .gitignore .DS_Store → **/.DS_Store for nested directories	2026-03-27 11:39:34 -05:00
Kevin Turcios	7fab0082c0	Merge pull request #6 from codeflash-ai/feat/tool-configs feat: improve skill and eval system	2026-03-27 11:31:30 -05:00
Kevin Turcios	37efa524d7	feat: improve skill, eval system, and tessl config - Optimize codeflash-optimize SKILL.md (review score 17% → 98%, eval 87% → 100%) - Fix frontmatter (allowed-tools format, argument-hint under metadata) - Lead description with concrete actions, explicit agent launch parameters - Add multi-run variance detection to eval system (--runs N flag) - score.py aggregate command: min/max/avg/stddev per criterion, flaky detection - check-regression.sh defaults to 3 runs for reliable regression detection - Add per-criterion regression tracking to baseline-scores.json (v3) - Reports exactly which criteria regressed, not just total score drops - Rename evals/ → codeflash-evals/ to avoid tessl directory conflicts - Switch tessl to managed mode, gitignore vendored tiles and symlinks	2026-03-27 11:30:17 -05:00
Kevin Turcios	999e08fb5e	Merge pull request #5 from codeflash-ai/fix/session-analysis-improvements fix: session-analysis improvements from 89 real-world sessions	2026-03-27 10:17:44 -05:00
Kevin Turcios	61c393e7ed	ci: add actions:read permission for CI status checks The claude-code-action MCP server requires 'actions: read' to enable CI status check functionality. Without it, the server is skipped with a warning.	2026-03-27 10:16:16 -05:00
Kevin Turcios	24ffa83bbf	merge: resolve conflicts with main (guard, git history, stuck recovery) Merge origin/main which added guard commands, git history review step, stuck state recovery, batched setup questions, and config audit steps. Resolved 5 conflicts by keeping both: - Our git-add-specific-files + pre-commit rules applied to the new renumbered commit steps (15 instead of 12, etc.) - Upstream's Record, Config audit, Guard steps preserved - Router keeps both AUTONOMOUS MODE and batch-questions rules - Router start steps merged: our branch verification + multi-repo detection integrated into upstream's batched-questions flow	2026-03-27 10:15:10 -05:00
Kevin Turcios	ce02fdee29	fix: add .codeflash/ gitignore and session cleanup workflow - Setup agent now ensures .codeflash/ is in .gitignore before writing session state files (prevents accidental commits of profiling artifacts) - Router agent gets a Cleanup section: preserves learnings.md and results.tsv across sessions, deletes transient files (HANDOFF.md, setup.md, conventions.md, bench scripts), removes agent-memory dir	2026-03-27 10:09:51 -05:00
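
Commit ddb4cf8258 above describes CPU detection as three-tier logic (cgroup v2 cpu.max → sched_getaffinity → os.cpu_count, take minimum). A minimal Python sketch of that idea follows. It is not the code from the repository: the function names, the default path /sys/fs/cgroup/cpu.max, and the choice to round a fractional quota up to a whole CPU are assumptions made for illustration.

```python
import math
import os
from pathlib import Path
from typing import Optional


def cpus_from_cpu_max(text: str) -> Optional[int]:
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>".

    Returns None when no quota is set or the content is not parseable.
    """
    parts = text.split()
    if len(parts) != 2 or parts[0] == "max":
        return None
    try:
        quota, period = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    if quota <= 0 or period <= 0:
        return None
    # Assumption: a fractional quota (e.g. 2.5 CPUs) is rounded up.
    return max(1, math.ceil(quota / period))


def effective_cpu_count(cpu_max_path: str = "/sys/fs/cgroup/cpu.max") -> int:
    """Return the minimum of the limits that can be detected (hypothetical helper)."""
    candidates = []

    # Tier 1: cgroup v2 CPU quota, if the file exists and sets a limit.
    try:
        limit = cpus_from_cpu_max(Path(cpu_max_path).read_text())
    except OSError:
        limit = None
    if limit is not None:
        candidates.append(limit)

    # Tier 2: CPU affinity mask (not available on every platform).
    if hasattr(os, "sched_getaffinity"):
        candidates.append(len(os.sched_getaffinity(0)))

    # Tier 3: logical CPU count reported by the OS.
    count = os.cpu_count()
    if count:
        candidates.append(count)

    return min(candidates) if candidates else 1
```

Taking the minimum means the most restrictive limit wins, so a pod with a 1-CPU quota on a large node would report 1 rather than the node's full core count.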

66 commits