codeflash-agent/packages/blackbox
Kevin Turcios 41edcf06e1
perf(transcript): cache fromisoformat, single-pass parsing (#45)
* perf(transcript): cache fromisoformat, local json.loads, single-pass parsing

Move datetime import to module level with cached fromisoformat,
use bytes split for JSONL, inline tool_result iteration in parse_user_entry,
promote decode_project_name filter set to module-level frozenset.

* fix: use splitlines for JSONL parsing

split("\n") leaves \r on lines from Windows-originated JSONL files,
which can cause json.loads failures. splitlines() handles all line
ending variants.
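
The difference can be reproduced with a small sketch, assuming the transcript is read as bytes and blank lines are skipped with a truthiness check. parse_jsonl is a hypothetical helper, not the package's function.

```python
import json

# A JSONL payload with Windows (CRLF) line endings, including a blank line.
raw = b'{"type": "user"}\r\n\r\n{"type": "assistant"}\r\n'

# split(b"\n") keeps the trailing b"\r" on every line, so the blank line
# becomes b"\r": truthy, but not a valid JSON document on its own.
naive_lines = raw.split(b"\n")

# bytes.splitlines() treats \n, \r\n, and \r as line boundaries and
# strips them, so blank lines really are empty.
clean_lines = raw.splitlines()


def parse_jsonl(data: bytes) -> list[dict]:
    """Parse JSONL bytes, skipping empty lines (hypothetical helper)."""
    return [json.loads(line) for line in data.splitlines() if line]


entries = parse_jsonl(raw)
```

With split(b"\n"), the blank line survives an "if line" check as b"\r" and json.loads rejects it; with splitlines() it is empty and gets skipped.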

* fix: add noqa C901 for inlined parse_user_entry

The tool_result iteration was inlined for single-pass performance,
which pushes complexity above the C901 threshold.

* Add blackbox benchmark VM infra

D2s_v5 (non-burstable, 2 vCPU, 8 GB) with cloud-init provisioning,
CPU-pinned benchmarks, and A/B comparison scripts.

---------

Co-authored-by: codeflash[bot] <codeflash[bot]@users.noreply.github.com>
2026-04-29 03:22:44 -05:00
src/blackbox perf(transcript): cache fromisoformat, single-pass parsing (#45) 2026-04-29 03:22:44 -05:00
tests Add blackbox package: session flight recorder with HTMX dashboard (#39) 2026-04-28 19:58:43 -05:00
pyproject.toml Add blackbox package: session flight recorder with HTMX dashboard (#39) 2026-04-28 19:58:43 -05:00
README.md Add blackbox package: session flight recorder with HTMX dashboard (#39) 2026-04-28 19:58:43 -05:00

blackbox

A flight data recorder for AI coding agent sessions.

Why "blackbox"?

Aircraft carry black boxes (flight data recorders) that silently capture everything during a flight, then become invaluable when you need to understand what happened. This package does the same for AI coding agent sessions: it watches, records, and lets you replay what the agent did, how it spent tokens, where it got stuck, and whether the session achieved its goal.

Currently supports Claude Code. Codex and Gemini support is planned.

What it does

Dashboard -- a local HTMX web UI for browsing session transcripts in real time.

  • Sidebar with all sessions from ~/.claude/projects/, sorted by recency
  • Live session detection via filesystem watching (green dot indicator)
  • Streaming log view with filter presets (all, compact, important, errors)
  • Tool call previews, error highlighting, user message formatting
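
The sidebar ordering can be pictured with a minimal sketch. It assumes that each session is a .jsonl file somewhere below ~/.claude/projects/ and that "recency" means file modification time; list_sessions is a hypothetical name, and the scanner in transcript.py may work differently.

```python
from pathlib import Path


def list_sessions(root: Path) -> list[Path]:
    """Return every *.jsonl file under root, most recently modified first.

    Hypothetical helper: the directory layout and the use of mtime as the
    recency signal are assumptions, not taken from the package.
    """
    if not root.is_dir():
        return []
    transcripts = [p for p in root.rglob("*.jsonl") if p.is_file()]
    return sorted(transcripts, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling list_sessions(Path.home() / ".claude" / "projects") would then give the paths in the order a recency-sorted sidebar might show them.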

Analytics models -- structured data types for session-level metrics, weekly trends, project breakdowns, and recommendations. These feed an analysis pipeline, still in progress, that will produce session digests and surface patterns across sessions.

Usage

blackbox serve              # open dashboard at http://localhost:7100
blackbox serve --port 8080  # custom port
blackbox serve --no-open    # don't auto-open browser

Package structure

src/blackbox/
  cli.py              # CLI entry point (serve command)
  models.py           # All domain models (attrs frozen classes)
  dashboard/
    app.py            # FastAPI instance + lifespan
    routes.py         # API endpoints + SSE log streaming
    rendering.py      # HTML rendering, filtering, formatting
    transcript.py     # JSONL transcript parser + session scanner
    watcher.py        # Watchdog-based live session detection + cache
    templates/        # Jinja2 templates (Tailwind + HTMX)

Development

uv sync
uv run fastapi dev src/blackbox/dashboard/app.py  # hot reload on :8000
uv run pytest tests/ -v
uv run ruff check src/ tests/