* Fix Dependabot security updates and bump GitPython to 3.1.47+
Dependabot's uv ecosystem resolver was inferring Python 3.9 from the
workspace root's requires-python, then failing because sub-packages
require >=3.12. Adding a .python-version file pinned to 3.12 tells the
resolver to use a compatible Python. Also bumps gitpython to >=3.1.47 to
resolve the two open security advisories (GHSA unsafe option check,
command injection).
* Bump codeflash-core and codeflash-python versions for release
* feat(blackbox): add package with models, CLI, and HTMX dashboard
* test(blackbox): add comprehensive test coverage for dashboard
* feat(blackbox): cache session scanning via watcher invalidation
* docs(blackbox): add README and use fastapi[standard] for dev server
* refactor(blackbox): extract presentation logic into formatter classes
* refactor(blackbox): extract classify_error helpers
* feat(blackbox): wire analytics into session detail view
Show token usage, tool breakdowns, and session stats in a
collapsible panel when viewing a session.
* feat(blackbox): add codeflash plugin detection
Detect codeflash agent names, skills, and commands in transcripts.
Surface language, optimization domain, and capability badges in
the analytics panel.
* refactor(blackbox): remove underscore prefixes from internal functions
* chore: add ty python-version to root pyproject.toml
* chore(blackbox): fix lint errors in test files
* style(blackbox): apply ruff formatting to analytics
* feat(blackbox): add Playwright E2E tests for dashboard
Refactor app.py to expose a create_app() factory accepting a projects_dir
override, enabling tests to run against fixture data instead of the real
~/.claude/projects/ directory. Routes now read projects_dir from
app.state instead of the module-level constant.
Add 26 Playwright tests across 5 files covering dashboard loading,
session list, session detail with filters and analytics, sidebar
collapse/localStorage persistence, and SSE log streaming. All tests
pass on chromium, firefox, and webkit (78 total).
CI gets a new e2e-blackbox job with a browser matrix strategy running
all three engines in parallel, conditional on blackbox path changes,
with trace upload on failure.
* fix(ci): sync only blackbox package in e2e job
* fix(ci): exclude e2e tests from unit test job
The test job doesn't install Playwright browsers, so e2e tests error
when pytest collects them. Ignore tests/e2e/ directories in the test
job; those are handled by the dedicated e2e-blackbox job.
The 12 DB integration tests in codeflash-api need testcontainers to spin
up a real PostgreSQL instance via Docker. The dependency was already
declared in the package's own dev deps but was missing from the root
workspace.
The package was a workspace member but not listed in the root dev
group, so its tests couldn't import codeflash_api when running
from the monorepo root.
12 tests covering all Queries methods against a real PostgreSQL
instance via testcontainers. Automatically skipped when Docker is
unavailable. Tests: api key lookup, last_used update, organization
fetch, subscription CRUD, usage increment, cumulative increments.
FastAPI app factory with lifespan, CORS, optional Sentry. Pydantic-settings
config for all env vars. Full directory structure for all 15 endpoints per
the architecture doc. Workspace integration: ruff src paths, isort, pytest
testpaths, per-file ignores. aiohttp for production, httpx for test client.
Point attrs dependency at local fork (KRRT7/attrs perf/defer-inspect-import)
which defers the ~12ms inspect import until first class build. Temporary
override until upstream merges python-attrs/attrs#1547.
Also adds attrs optimization case study data (VM infra, status).
* Update engagement report: add logos, grid theme, scope to core-product
- Add Codeflash x Unstructured logo lockup in hero and footer
- Apply roadmap grid pattern (48px, 5% opacity) and zinc-900 background
- Update cards to rounded-2xl with semi-transparent zinc-900/50 bg
- Remove all platform-libs, CI/CD, and security audit sections
- Remove stacked optimizations PR #1500 from open PRs
- Update data to latest FastAPI endpoint measurements
- Filter PR tables to core-product only
* Add methodology section to team view, fix DataTable type safety
Add benchmark environment, measurement protocol, and production
context cards to the top of the Engineering Team view. Split
TABLE_STYLE into individually typed constants (TABLE_HEADER,
TABLE_CELL, TABLE_DATA, TABLE_DATA_CONDITIONAL, TABLE_WRAP) so
DataTable kwargs pass ty and mypy strict checks.
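
A minimal sketch of the split, for illustration only. The constant names come
from this commit; the style values and the keyword each constant feeds are
assumptions, not the report's actual theme.

```python
from typing import Final

# Hypothetical style values. One precisely typed constant per DataTable
# keyword lets a strict type checker verify each argument on its own,
# instead of inferring a single loose type for one combined dict.
TABLE_HEADER: Final[dict[str, str]] = {"backgroundColor": "#18181b", "fontWeight": "600"}
TABLE_CELL: Final[dict[str, str]] = {"padding": "8px 12px", "textAlign": "left"}
TABLE_DATA: Final[dict[str, str]] = {"backgroundColor": "transparent"}
TABLE_DATA_CONDITIONAL: Final[list[dict[str, object]]] = [
    {"if": {"row_index": "odd"}, "backgroundColor": "#27272a"},
]
TABLE_WRAP: Final[dict[str, str]] = {"overflowX": "auto"}

# Assumed call site (not executed here):
#   DataTable(
#       style_header=TABLE_HEADER,
#       style_cell=TABLE_CELL,
#       style_data=TABLE_DATA,
#       style_data_conditional=TABLE_DATA_CONDITIONAL,
#       style_table=TABLE_WRAP,
#   )
```

Passing each constant by keyword, rather than unpacking one mixed dict, is
what keeps the value types visible to ty and mypy.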
* Add engagement report screenshot assets
* Add PRs from unstructured, unstructured-inference, unstructured-od-models
Expand report scope beyond core-product: 14 new merged PRs and 2 new
open PRs across 3 additional repos. Update PR counts (24 merged, 5 in
progress), add Repo column to detail view tables, update subtitle and
meta description.
* Make PR numbers clickable links in detail view tables
Use DataTable markdown columns with link_target=_blank so PR numbers
link to their GitHub PRs. Add REPO_BASES mapping for per-repo URL
resolution. Override default purple link color with blue (#60a5fa)
to stay readable on the dark background.
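
A rough sketch of the link resolution, assuming one base URL per repo. The
REPO_BASES name comes from this commit; the example-org URLs, the pr_link
helper, and the "pr" column id are hypothetical.

```python
# Hypothetical base URLs; the real mapping lives in the report's data module.
REPO_BASES: dict[str, str] = {
    "unstructured": "https://github.com/example-org/unstructured",
    "unstructured-inference": "https://github.com/example-org/unstructured-inference",
}


def pr_link(repo: str, number: int) -> str:
    """Return a markdown link for a PR number in the given repo."""
    base = REPO_BASES[repo]  # raises KeyError for an unmapped repo
    return f"[#{number}]({base}/pull/{number})"


# DataTable settings assumed for the PR column: render the cell as markdown
# and open links in a new tab.
PR_COLUMN = {"name": "PR", "id": "pr", "presentation": "markdown"}
MARKDOWN_OPTIONS = {"link_target": "_blank"}
```

Each row's "pr" cell would then hold the string from pr_link(), with
MARKDOWN_OPTIONS passed as the table's markdown_options.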
* Add Future Engagements section with notes panels to exec view
Prominent banner heading, four numbered cards (CI/CD, Security, Runtime,
Product Integration) each with a right-hand Notes panel for discussion
points. Refactored _next_card helper to accept optional notes parameter.
* Add Unstructured engagement report as uv workspace member
Three-tier Plotly Dash app (Executive Brief, Engineering Team, Full
Detail) with data in JSON, theme constants in theme.py, and Dash
production improvements (Google Fonts, clientside callbacks, meta tags).
Also: add .playwright-mcp/ to .gitignore, add reports/* ruff overrides,
remove tracked .codeflash/observability/read-tracker.
* Rewrite statusline to derive context from git state
Detects active area from changed files (reports, packages, plugin,
.codeflash, case-studies, evals), falls back to branch name convention
(perf/*, feat/*, fix/*), shows dirty indicator. Uses whoami for
cross-platform user detection.
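
The detection order above can be sketched roughly as follows. The area and
branch prefixes come from this commit; the function names, the top-level
directory match, and the "*" dirty marker are assumptions for illustration.

```python
from __future__ import annotations

AREAS = ("reports", "packages", "plugin", ".codeflash", "case-studies", "evals")
BRANCH_PREFIXES = ("perf/", "feat/", "fix/")


def detect_area(changed_files: list[str], branch: str) -> str | None:
    """Pick an area from changed paths, else from the branch name convention."""
    for path in changed_files:
        top = path.split("/", 1)[0]  # first path component
        if top in AREAS:
            return top
    for prefix in BRANCH_PREFIXES:
        if branch.startswith(prefix):
            return prefix.rstrip("/")
    return None


def statusline(changed_files: list[str], branch: str) -> str:
    """Format the detected area (or the raw branch name) with a dirty marker."""
    area = detect_area(changed_files, branch) or branch
    dirty = "*" if changed_files else ""
    return f"{area}{dirty}"
```

In practice changed_files would come from git (for example the paths listed
by git status --porcelain), which this sketch leaves to the caller.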
* Add pre-push lint rule to commit guidelines
* Exclude .codeflash/ from ruff linting
Benchmark and profiling scripts in .codeflash/ are scratch work, not
package source. Excluding them prevents CI failures from ad-hoc scripts.
* Run ruff format across packages, scripts, evals, and plugin refs
* Fix github-app async test failures in CI
Add asyncio_mode = "auto" to root pytest config so async tests
are detected when running from the repo root via uv run pytest packages/.