codeflash-agent

mirror of https://github.com/codeflash-ai/codeflash-agent.git synced 2026-05-04 18:25:19 +00:00

Author	SHA1	Message	Date
Kevin Turcios	e7fdae0db6	Add blackbox benchmark VM infra D2s_v5 (non-burstable, 2 vCPU, 8 GB) with cloud-init provisioning, CPU-pinned benchmarks, and A/B comparison scripts.	2026-04-29 03:12:19 -05:00
Kevin Turcios	ffadf16147	chore: add standup dashboard with CI audit integration (#36) Dash app at .codeflash/standups/ for weekly eng meetings. Pulls live PR data across 4 org repos, renders markdown standup notes, integrates CI audit report with corrected billing numbers from real GitHub API data. Deployed to Plotly Cloud.	2026-04-23 18:52:33 -05:00
Kevin Turcios	3ee9c22c8e	fix: resolve all ruff lint errors across repo (#38) * fix: resolve all ruff lint errors across repo Auto-fixed 31 errors (unused imports, formatting, simplifications). Manually fixed 14 remaining: - EXE001: removed shebangs from non-executable bench scripts - C417: replaced map(lambda) with generator expression - C901/PLR0915: extracted _write_and_instrument_tests from generate_ai_tests - C901/PLR0912: extracted _parse_toml_addopts and _ini_section_name from modify_addopts - RUF001/RUF002: replaced ambiguous Unicode chars (en dash, multiplication sign) - FBT002: made boolean params keyword-only in report functions - E402: moved `import re` to top of file in security reports * fix: resolve pre-existing mypy errors across packages - _testgen.py: annotate `generated` as `str` to avoid no-any-return - _test_runner.py: use str() for TimeoutExpired stdout/stderr (bytes|str), remove unused type: ignore on proc.kill() - _candidate_eval.py: annotate `speedup` as `float` to avoid no-any-return from lazy-loaded performance_gain	2026-04-23 10:22:42 -05:00
Kevin Turcios	c492164fbf	Add codeflash org CI audit case study and interactive Dash report Case study in .codeflash/krrt7/codeflash-ai/ci-audit/ with README, status, and raw data (fork activity, PRs merged). Interactive Dash report in reports/codeflash-ci-audit/ with two tabs: Executive Summary (hero metrics, cost impact charts, before/after) and Full Detail (fork breakdown, findings table, PR inventory, methodology). Key numbers: 71% fewer workflow runs, ~$12K/yr in Enterprise overage savings, 200+ forks disabled, 11 PRs merged across 2 repos.	2026-04-23 03:56:04 -05:00
Kevin Turcios	0901db9fee	Update coveragepy status after E2E validation session	2026-04-21 21:19:24 -05:00
Kevin Turcios	edfdd231e0	Use attrs fork with deferred inspect import Point attrs dependency at local fork (KRRT7/attrs perf/defer-inspect-import) which defers the ~12ms inspect import until first class build. Temporary override until upstream merges python-attrs/attrs#1547. Also adds attrs optimization case study data (VM infra, status).	2026-04-21 02:27:50 -05:00
Kevin Turcios	b42417532d	Add optimization project scaffolding for plotly/plotly.py	2026-04-16 23:57:06 -05:00
Kevin Turcios	380bd59503	Add iterative-discovery narrative and missing findings across all reports Weave "optimizations reveal deeper issues" framing into engagement report executive summary, case study, and optimization README. Add O(N²) text extraction fix, per-request RSS creep (24→17 MB), and memray profiling data that were previously undocumented.	2026-04-16 15:02:39 -05:00
Kevin Turcios	20f6c59f05	Lint and format entire repo, not just packages (#23) Remove .codeflash/ from ruff extend-exclude, add per-file ignores for .codeflash/, scripts/, evals/, and plugin/ (benchmark/script patterns like print, eval, magic values). Remove shebangs. Widen pre-commit hooks to check the full repo.	2026-04-15 03:16:15 -05:00
Kevin Turcios	33faedf427	Add Unstructured report, rewrite statusline, format evals/scripts (#20) * Add Unstructured engagement report as uv workspace member Three-tier Plotly Dash app (Executive Brief, Engineering Team, Full Detail) with data in JSON, theme constants in theme.py, and Dash production improvements (Google Fonts, clientside callbacks, meta tags). Also: add .playwright-mcp/ to .gitignore, add reports/* ruff overrides, remove tracked .codeflash/observability/read-tracker. * Rewrite statusline to derive context from git state Detects active area from changed files (reports, packages, plugin, .codeflash, case-studies, evals), falls back to branch name convention (perf/, feat/, fix/), shows dirty indicator. Uses whoami for cross-platform user detection. Add pre-push lint rule to commit guidelines * Exclude .codeflash/ from ruff linting Benchmark and profiling scripts in .codeflash/ are scratch work, not package source. Excluding them prevents CI failures from ad-hoc scripts. * Run ruff format across packages, scripts, evals, and plugin refs * Fix github-app async test failures in CI Add asyncio_mode = "auto" to root pytest config so async tests are detected when running from the repo root via uv run pytest packages/.	2026-04-15 03:06:16 -05:00
Kevin Turcios	7d86202524	Update metaflow README with actual results and PR status (#19) Replace placeholder text ("No optimizations applied yet", empty PR table) with: - CAS lz4 compression results (7-18x on realistic ML payloads) - Upstream PR status (Netflix/metaflow#3090, open) - Open questions on dependency management and forward compat - Methodology, remaining targets, and lessons learned	2026-04-14 23:41:55 -05:00
Kevin Turcios	09ba9b44b2	Add typeagent-py case study (#17) - Add case-studies/microsoft/typeagent/summary.md with results, lessons learned (failed vector search experiment, maintainer alignment), and takeaways for codeflash - Update upstream PR statuses: #235 merged, #236 closed (rejected), #232 blocked on #230 - Add typeagent to main README results table	2026-04-14 23:25:29 -05:00
Kevin Turcios	6dd3b02168	Restructure typeagent README: separate failed vector search experiment (#16) Move vector search benchmarks out of main results into a Lessons Learned section. The 3.7x-14.2x numbers were real but on a non-bottleneck: the maintainer confirmed model API calls and SQL dominate real latency. Results section now only shows legitimate wins: import time (1.16x), indexing pipeline (1.14-1.16x), and query batching (2.10-2.62x).	2026-04-14 23:21:53 -05:00
Kevin Turcios	cc29a27289	Migrate .codeflash/ to {teammember}/{org}/{project}/ format (#15) Add team member dimension to case study paths so multiple contributors can track optimization data independently. Derives member from git config user.name in session-start hooks. - Move all case studies under .codeflash/krrt7/ - Rename pypa/pip → python/pip (org grouping) - Update session-start hooks, docs, scripts, and references	2026-04-14 23:04:34 -05:00
m-ali-24	044b2f190a	[FEAT] golang agents (#11) * go base * missing javascript --------- Co-authored-by: ali <--global>	2026-04-14 18:55:36 -05:00
Kevin Turcios	043bf45415	Ignore .lprof and .prof binary files, update read-tracker	2026-04-14 18:42:38 -05:00
Kevin Turcios	9830b7b4a1	Track .codeflash/ data: unignore observability and add krrt7/odoo case study	2026-04-14 18:40:08 -05:00
Kevin Turcios	3b59d97647	squash	2026-04-13 14:12:17 -05:00

18 commits