Commit graph

34 commits

Author SHA1 Message Date
Kevin Turcios
ffadf16147
chore: add standup dashboard with CI audit integration (#36)
Dash app at .codeflash/standups/ for weekly eng meetings. Pulls live PR data across 4 org repos, renders markdown standup notes, integrates CI audit report with corrected billing numbers from real GitHub API data. Deployed to Plotly Cloud.
2026-04-23 18:52:33 -05:00
Kevin Turcios
3ee9c22c8e
fix: resolve all ruff lint errors across repo (#38)
* fix: resolve all ruff lint errors across repo

Auto-fixed 31 errors (unused imports, formatting, simplifications).
Manually fixed 14 remaining:
- EXE001: removed shebangs from non-executable bench scripts
- C417: replaced map(lambda) with generator expression
- C901/PLR0915: extracted _write_and_instrument_tests from generate_ai_tests
- C901/PLR0912: extracted _parse_toml_addopts and _ini_section_name from modify_addopts
- RUF001/RUF002: replaced ambiguous Unicode chars (en dash, multiplication sign)
- FBT002: made boolean params keyword-only in report functions
- E402: moved `import re` to top of file in security reports

* fix: resolve pre-existing mypy errors across packages

- _testgen.py: annotate `generated` as `str` to avoid no-any-return
- _test_runner.py: use str() for TimeoutExpired stdout/stderr (bytes|str),
  remove unused type: ignore on proc.kill()
- _candidate_eval.py: annotate `speedup` as `float` to avoid no-any-return
  from lazy-loaded performance_gain
2026-04-23 10:22:42 -05:00
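Two of the manual fixes listed in the commit above (C417 and FBT002) follow a common pattern. The following is a minimal sketch of that pattern only; the function names are hypothetical and are not taken from the repository.

```python
from collections.abc import Iterable


def total_squares_before(values: Iterable[int]) -> int:
    # C417: map() combined with a lambda is flagged as an unnecessary map.
    return sum(map(lambda v: v * v, values))


def total_squares_after(values: Iterable[int]) -> int:
    # Fix: a generator expression does the same work without the lambda.
    return sum(v * v for v in values)


def render_report(rows: list[str], *, verbose: bool = False) -> str:
    # FBT002 fix: the bare `*` makes `verbose` keyword-only, so call sites
    # must write render_report(rows, verbose=True).
    header = f"{len(rows)} rows"
    return header + ("\n" + "\n".join(rows) if verbose else "")
```

With the keyword-only signature, a positional call such as `render_report(rows, True)` raises `TypeError`, which is what makes the boolean's meaning visible at every call site.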
Kevin Turcios
9e893675c9 Add Plotly Cloud deployment config for CI audit report 2026-04-23 03:59:35 -05:00
Kevin Turcios
c492164fbf Add codeflash org CI audit case study and interactive Dash report
Case study in .codeflash/krrt7/codeflash-ai/ci-audit/ with README,
status, and raw data (fork activity, PRs merged).

Interactive Dash report in reports/codeflash-ci-audit/ with two tabs:
Executive Summary (hero metrics, cost impact charts, before/after) and
Full Detail (fork breakdown, findings table, PR inventory, methodology).

Key numbers: 71% fewer workflow runs, ~$12K/yr in Enterprise overage
savings, 200+ forks disabled, 11 PRs merged across 2 repos.
2026-04-23 03:56:04 -05:00
Kevin Turcios
e3e74c3f2e Add missing pyproject.toml for codeflash-ci-audit workspace member 2026-04-23 03:34:03 -05:00
misrasaurabh1
cf67686b29 add codeflash-agent case study md 2026-04-19 19:59:23 -07:00
misrasaurabh1
d33e82b647 Merge remote-tracking branch 'origin/main'
# Conflicts:
#	reports/unstructured/engagement_report.py
2026-04-19 19:58:37 -07:00
misrasaurabh1
42debefd12 remove lightspeed canvas animation from report 2026-04-19 19:57:59 -07:00
Kevin Turcios
d63bb51800 Revert timeline phase duration back to 6 weeks 2026-04-19 03:28:46 -05:00
Kevin Turcios
c9a8e9b1ea Fix engagement duration from 6 weeks to 2 months 2026-04-19 03:28:06 -05:00
Kevin Turcios
0e248b865f Remove lightspeed animations 2026-04-19 03:24:41 -05:00
Kevin Turcios
a4276d658a Refine engagement report and case study for executive review
- Hero metrics: -89% cost, -52% peak memory, flat scaling, -12.9% latency
- Add lightspeed canvas animation via assets/lightspeed.js for Plotly Cloud
- Add platform-libs CI/CD migration to timeline (Phase 1b) with PR links
- Update next-engagement card with POC branch and PR references
- Replace RSS with peak memory in user-facing copy
- Add flat memory scaling to case study results table
2026-04-16 17:51:54 -05:00
Kevin Turcios
380bd59503 Add iterative-discovery narrative and missing findings across all reports
Weave "optimizations reveal deeper issues" framing into engagement report
executive summary, case study, and optimization README. Add O(N²) text
extraction fix, per-request RSS creep (24→17 MB), and memray profiling
data that were previously undocumented.
2026-04-16 15:02:39 -05:00
Kevin Turcios
6d05aea09c Revamp engagement report layout and timeline for executive clarity
- Move Infrastructure Cost Impact above hero metrics and tab toggle
- Extract shared above-fold content into _above_fold_content() for /jpc parity
- Replace plotly Gantt chart with pure-HTML vertical timeline
- Fix cross-browser flex layout (explicit flex: 1 1 0%, minWidth: 0)
- Remove redundant "The Results" and "How This Was Tested" sections
- Rename Engineering Team → Engineering Details
- Rename Peak RSS → Peak Memory Usage
- Update timeline dates: 1-week buffer after Phase 1, cascade phases
- Rename section headers: Vertical Optimization Roadmap, Proposed Next Engagement
2026-04-16 14:31:32 -05:00
Kevin Turcios
3e63326876 Add standalone security audit app for Plotly Cloud deployment
Separate deployment at https://19727fbf-a6a0-45ac-968f-680035ab6b3b.plotly.app
with its own pyproject.toml, lockfile, and plotly-cloud.toml config.
2026-04-16 06:18:33 -05:00
Kevin Turcios
514c1e28c9 Tailor security report for Lawrence, add UX improvements and talking points
- Rewrite executive summary to reference his PR #1465 lockfile fix and
  existing tooling (Renovate, Anchore, Chainguard)
- Reorder findings by category priority (supply chain > container > CI/CD)
  to lead with what matters most to the audience
- Add animated parallelogram background matching codeflash.ai aesthetic
- 6 research-backed UX changes: severity icons (WCAG 1.4.1), title-first
  cards (F-pattern), loss-framed 85% CTA, distinct status colors, card
  opacity for figure-ground separation
- Correct SEC-021 from 67% to 97% mutable Action pins per VM verification
  (only 2 of 96 SHA-pinned in core-product)
- Add talking-points-lawrence.md with profile, pain points, pitch strategy
2026-04-16 06:01:52 -05:00
Kevin Turcios
8c42f27eed Add 4-tab navigation to security audit report
Split the 39-finding wall into tabbed views matching the engagement
report pattern: Summary, Critical & High (21), Medium & Low (18),
and By Category with both category and repository breakdowns.
2026-04-16 05:05:32 -05:00
Kevin Turcios
3dc58775e3 Consolidate report into 4-tab view and clean up for production
- Replace Executive Brief with JPC Summary as default tab (Executive Summary)
- Add Timeline as 4th tab; standalone /jpc and /timeline routes preserved
- Remove dead code: build_exec_view, make_k8s_chart, unused latency vars
- Extract _logo_lockup helper, _TAB_BTN_STYLE constants to reduce duplication
- Use app.layout as function, env-configurable debug/port, update docstring
2026-04-16 04:48:16 -05:00
Kevin Turcios
c22c5babd1 Organize screenshots by date and session
- 2026-04-15: exec restructure, team view, engagements
- 2026-04-16-methodology: methodology notes across all views
- 2026-04-16-jpc: standalone JPC summary and route verification
- 2026-04-16-timeline: timeline iterations (reordering, date fixes, chart tuning)
2026-04-16 03:49:15 -05:00
Kevin Turcios
c3e7dba47b Add report screenshots to reports/unstructured/screenshots/ 2026-04-16 03:48:14 -05:00
Kevin Turcios
b20c05a799 Add /timeline route with proposed engagement roadmap
- Gantt chart with 5 phases: Core-Product (completed), DevEx & CI/CD,
  Platform API, Security Hardening (concurrent with DevEx), Cost Discovery
- Phase detail cards with duration, dates, deliverables, dependencies
- DevEx as Phase 2 (POC already done, sets up faster CI for Phase 3)
- Security runs concurrent with Phase 2 (uv workspace enables lockfile)
- Investment summary with ~5 month total timeline
- Fixed x-axis range and removed rangeslider for clean proportional bars
2026-04-16 03:46:50 -05:00
Kevin Turcios
90091ccc12 Add /jpc standalone summary route and methodology notes
- Add build_jpc_view() with clean standalone layout at /jpc for JPC
  (no tabs, no hero — just the document that "stands on its own")
- Add URL routing via dcc.Location: / serves full report, /jpc serves summary
- Add methodology notes to exec view (How This Was Tested annotations)
- Add methodology notes to detail view (7-entry "why" card)
- Enrich team view Memory + Standalone vs. Cumulative explanations
2026-04-16 03:07:33 -05:00
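The path-based routing described in the commit above (`/` serves the full report, `/jpc` serves the summary) can be sketched as a plain dispatch function. This is an illustration under assumptions: `build_jpc_view` is named in the commit message, while `build_full_report`, `ROUTES`, `render_page`, and the fallback for unknown paths are hypothetical, and both view builders are stand-ins that return strings instead of Dash components.

```python
from __future__ import annotations

from collections.abc import Callable


def build_full_report() -> str:
    # Hypothetical stand-in for the tabbed full report layout.
    return "full report"


def build_jpc_view() -> str:
    # Stand-in for the standalone summary layout named in the commit.
    return "jpc summary"


ROUTES: dict[str, Callable[[], str]] = {
    "/": build_full_report,
    "/jpc": build_jpc_view,
}


def render_page(pathname: str | None) -> str:
    # Defensive: treat a missing pathname as "/" and ignore a trailing slash.
    key = (pathname or "/").rstrip("/") or "/"
    # Assumption: unknown paths fall back to the full report.
    return ROUTES.get(key, build_full_report)()
```

In a Dash app, a function like `render_page` would typically be the body of a callback that takes the `pathname` property of a `dcc.Location` component as input and returns the children of a page container.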
Kevin Turcios
2da186d4df Apply learnings to team + detail views, remove redundancy
Team view:
- Add Engineering Impact Summary at top (4 metrics: memory, density,
  latency, idle vCPU) with pointer to sections below
- Remove Production Context card (redundant with Impact Summary)
- Trim memory table to only metrics not shown in chart (RSS per
  request, K8s allocation) — chart already shows pre/post/delta
- Fix "10-page scan" → "10-page scanned document" in methodology

Detail view:
- Add intro callout explaining this is the raw data backing the
  other two views
2026-04-16 02:46:01 -05:00
Kevin Turcios
c1b603afc4 Fix technical terminology in exec brief
- "CFS quota" → "1-CPU limit" (CFS is implementation detail, too
  technical for exec audience)
- "jemalloc" → "jemalloc, opt-in for 1-CPU pods" (missed instance)
- "requests 1 CPU / 32 GB RAM resource requests" → "per pod" (double
  "requests" was grammatically broken)
- "10-page scan" → "10-page scanned document" (consistent with
  workload profiles section)
2026-04-16 02:41:17 -05:00
Kevin Turcios
2c3aad4325 Restructure exec view: enablement-first flow for JPC audience
Reorder based on persuasion research (Three-Talk Model, Prospect
Theory, Kotter):

1. "The Engagement" — collaborative shared context (team talk)
2. "What This Enables" — loss-framed enablement: 9.2x pod density,
   41 idle vCPUs now available, -12.9% latency for agentic API
3. "The Results" — before/after proof of execution
4. Infrastructure Cost Impact (anchored on $100K/mo)
5. Workload Profiles + Methodology (credibility)
6. Delivered + Proposed Next Engagements

Key shift: lead with what the work unlocks (feature velocity,
platform capacity, API speed) rather than the technical achievement
(memory reduction). Cost savings is proof of execution, not the
headline.
2026-04-16 02:36:29 -05:00
Kevin Turcios
6143c38d78 Move workload profile explanations into Executive Brief
The 1p/10p/16p benchmark rationale belongs in the exec view — JPC
needs to understand that page count != workload before seeing the
numbers. Added "Benchmark Workload Profiles" section before "How This
Was Tested" with the three profiles and the data punchline (#1505 at
-32.6% on 1 page vs -7.4% on 16 pages).
2026-04-16 02:32:35 -05:00
Kevin Turcios
eeebf6eec2 Add workload profile explanations to latency benchmark table
The 1p/10p/16p column headers weren't self-explanatory. Added a
"Benchmark Workload Profiles" card above the latency table in the
Detail view explaining that each document tests a distinct workload
shape (table-dense, scanned, mixed), not just different page counts.

Also added annotation below the table calling out that #1505 has 4x
the impact on the 1-page doc vs. the 16-page doc — letting the data
demonstrate that per-document cost depends on content, not page count.
2026-04-16 02:27:00 -05:00
Kevin Turcios
ddb4cf8258 Update engagement report: reframe for JPC audience, fix technical inaccuracies
- Reframe Future Engagements → Proposed Next Engagements based on
  Crag meeting: lead with Platform API speed/stability, add
  Infrastructure Cost Discovery ($100K/mo), remove Codeflash product
  pitch
- Add Broader Context callout after cost section (core-product = ~10%
  of total Azure spend)
- Fix Knative terminology throughout: "Knative pods" → "pods with a
  1-CPU resource request" (CFS quota, not Knative config)
- Fix CPU detection description: three-tier logic (cgroup v2 cpu.max →
  sched_getaffinity → os.cpu_count, take minimum)
- Clarify jemalloc is opt-in (MALLOC_IMPL=jemalloc), 1-CPU serial OCR
  only; multi-CPU pods should use glibc default due to ~50 MB/process
  arena overhead
2026-04-16 02:11:27 -05:00
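The three-tier CPU detection named in the commit above (cgroup v2 `cpu.max`, then `sched_getaffinity`, then `os.cpu_count`, taking the minimum) can be sketched as follows. This is not the project's implementation: the function names, the default file path argument, and the choice to round a fractional quota up are assumptions made for illustration.

```python
from __future__ import annotations

import math
import os
from pathlib import Path


def read_cgroup_v2_cpu_limit(path: str = "/sys/fs/cgroup/cpu.max") -> int | None:
    # cpu.max holds "<quota> <period>", or "max <period>" when unlimited.
    try:
        quota, period = Path(path).read_text().split()
    except (OSError, ValueError):
        return None  # file missing, unreadable, or not two fields
    if quota == "max":
        return None
    try:
        quota_us, period_us = int(quota), int(period)
    except ValueError:
        return None
    if quota_us <= 0 or period_us <= 0:
        return None
    # Assumption: round a fractional limit up, never below one CPU.
    return max(1, math.ceil(quota_us / period_us))


def effective_cpu_count(cpu_max_path: str = "/sys/fs/cgroup/cpu.max") -> int:
    candidates = []
    limit = read_cgroup_v2_cpu_limit(cpu_max_path)
    if limit is not None:
        candidates.append(limit)
    if hasattr(os, "sched_getaffinity"):  # not available on every platform
        candidates.append(len(os.sched_getaffinity(0)))
    count = os.cpu_count()
    if count:
        candidates.append(count)
    # Take the minimum of whichever tiers produced a value.
    return min(candidates) if candidates else 1
```

Taking the minimum matters for pods with a 1-CPU limit: `os.cpu_count()` reports the node's cores, so without the cgroup tier a worker pool would be sized for the whole node.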
Kevin Turcios
9102d14a00 continue 2026-04-16 02:00:33 -05:00
Kevin Turcios
e65b8a3564 Add security audit report and infrastructure cost analysis
Standalone security report (security_report.py) covering 6 supply chain
and build pipeline findings from the performance engagement. Add infra
cost section to exec view showing $10K → $1.1K/mo projection based on
D48s_v5 node packing at 4 GB vs 32 GB per pod.
2026-04-15 18:22:07 -05:00
Kevin Turcios
f8281a24a0 Update engagement_report.py 2026-04-15 13:27:43 -05:00
Kevin Turcios
49a7d586d4 Update engagement_report.py 2026-04-15 13:26:04 -05:00
Kevin Turcios
87a906e704
Update Unstructured engagement report (#25)
* Update engagement report: add logos, grid theme, scope to core-product

- Add Codeflash x Unstructured logo lockup in hero and footer
- Apply roadmap grid pattern (48px, 5% opacity) and zinc-900 background
- Update cards to rounded-2xl with semi-transparent zinc-900/50 bg
- Remove all platform-libs, CI/CD, and security audit sections
- Remove stacked optimizations PR #1500 from open PRs
- Update data to latest FastAPI endpoint measurements
- Filter PR tables to core-product only

* Add methodology section to team view, fix DataTable type safety

Add benchmark environment, measurement protocol, and production
context cards to the top of the Engineering Team view. Split
TABLE_STYLE into individually typed constants (TABLE_HEADER,
TABLE_CELL, TABLE_DATA, TABLE_DATA_CONDITIONAL, TABLE_WRAP) so
DataTable kwargs pass ty and mypy strict checks.

* Add engagement report screenshot assets

* Add PRs from unstructured, unstructured-inference, unstructured-od-models

Expand report scope beyond core-product: 14 new merged PRs and 2 new
open PRs across 3 additional repos. Update PR counts (24 merged, 5 in
progress), add Repo column to detail view tables, update subtitle and
meta description.

* Make PR numbers clickable links in detail view tables

Use DataTable markdown columns with link_target=_blank so PR numbers
link to their GitHub PRs. Add REPO_BASES mapping for per-repo URL
resolution. Override default purple link color with blue (#60a5fa)
to stay readable on the dark background.

* main

* Add Future Engagements section with notes panels to exec view

Prominent banner heading, four numbered cards (CI/CD, Security, Runtime,
Product Integration) each with a right-hand Notes panel for discussion
points. Refactored _next_card helper to accept optional notes parameter.
2026-04-15 13:11:28 -05:00
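The clickable PR numbers described in the commit above rely on DataTable markdown columns. A minimal sketch of how the pieces could fit together is shown here; the base URLs, the PR number, and the helper name `pr_markdown_link` are hypothetical (the log names `REPO_BASES` but does not show its values), and the block only builds the keyword arguments so that it runs without Dash installed.

```python
# Hypothetical base URLs; the real REPO_BASES values are not shown in the log.
REPO_BASES = {
    "core-product": "https://github.com/example-org/core-product",
    "unstructured": "https://github.com/example-org/unstructured",
}


def pr_markdown_link(repo: str, number: int) -> str:
    # Markdown link text that a markdown-presentation column renders as <a>.
    return f"[#{number}]({REPO_BASES[repo]}/pull/{number})"


# Illustrative row; 123 is a placeholder PR number.
rows = [{"pr": pr_markdown_link("core-product", 123), "repo": "core-product"}]

table_kwargs = {
    "columns": [
        {"name": "PR", "id": "pr", "presentation": "markdown"},
        {"name": "Repo", "id": "repo"},
    ],
    "data": rows,
    # Open links in a new tab.
    "markdown_options": {"link_target": "_blank"},
}
# In a Dash app these would be passed as dash_table.DataTable(**table_kwargs).
```

Keeping the per-repo base URL in one mapping means the table rows only need a repo name and a PR number, which fits the added Repo column.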
Kevin Turcios
33faedf427
Add Unstructured report, rewrite statusline, format evals/scripts (#20)
* Add Unstructured engagement report as uv workspace member

Three-tier Plotly Dash app (Executive Brief, Engineering Team, Full
Detail) with data in JSON, theme constants in theme.py, and Dash
production improvements (Google Fonts, clientside callbacks, meta tags).

Also: add .playwright-mcp/ to .gitignore, add reports/* ruff overrides,
remove tracked .codeflash/observability/read-tracker.

* Rewrite statusline to derive context from git state

Detects active area from changed files (reports, packages, plugin,
.codeflash, case-studies, evals), falls back to branch name convention
(perf/*, feat/*, fix/*), shows dirty indicator. Uses whoami for
cross-platform user detection.

* Add pre-push lint rule to commit guidelines

* Exclude .codeflash/ from ruff linting

Benchmark and profiling scripts in .codeflash/ are scratch work, not
package source. Excluding them prevents CI failures from ad-hoc scripts.

* Run ruff format across packages, scripts, evals, and plugin refs

* Fix github-app async test failures in CI

Add asyncio_mode = "auto" to root pytest config so async tests
are detected when running from the repo root via uv run pytest packages/.
2026-04-15 03:06:16 -05:00
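The statusline logic described in the commit above (area from changed files, falling back to the branch name convention, plus a dirty indicator) can be sketched in Python as below. The real statusline may be written differently; the function names, the first-match policy, and the output format are assumptions, while the area names and branch prefixes come from the commit message.

```python
from __future__ import annotations

AREAS = ("reports", "packages", "plugin", ".codeflash", "case-studies", "evals")
BRANCH_PREFIXES = ("perf/", "feat/", "fix/")


def detect_area(changed_files: list[str], branch: str) -> str | None:
    # First tier: the top-level directory of the first changed file
    # that belongs to a known area.
    for path in changed_files:
        top = path.split("/", 1)[0]
        if top in AREAS:
            return top
    # Fallback: the branch name convention (perf/*, feat/*, fix/*).
    for prefix in BRANCH_PREFIXES:
        if branch.startswith(prefix):
            return prefix.rstrip("/")
    return None


def statusline(user: str, branch: str, changed_files: list[str]) -> str:
    area = detect_area(changed_files, branch)
    dirty = "*" if changed_files else ""  # dirty indicator
    parts = [user, f"{branch}{dirty}"]
    if area:
        parts.append(area)
    return " | ".join(parts)
```

In practice the inputs would come from git itself, for example the paths reported by `git status --porcelain` and the current branch name, with the user taken from `whoami` as the commit describes.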