codeflash-agent/plugin/references/shared/team-structure.md
Kevin Turcios cc29a27289
Migrate .codeflash/ to {teammember}/{org}/{project}/ format (#15)
Add team member dimension to case study paths so multiple contributors
can track optimization data independently. Derives member from
git config user.name in session-start hooks.

- Move all case studies under .codeflash/krrt7/
- Rename pypa/pip → python/pip (org grouping)
- Update session-start hooks, docs, scripts, and references
2026-04-14 23:04:34 -05:00


Team Structure Rules

This document defines optimal team configurations for different optimization and research tasks. Use these guidelines to minimize token costs while maximizing parallel productivity.

Core Principles

  • Token cost scales linearly: Each teammate adds ~15k baseline tokens (MCP servers, CLAUDE.md, skill descriptions) + work tokens
  • Coordination overhead compounds: More teammates = more SendMessage traffic, more task dependencies, more handoff complexity
  • Optimal range: 3-5 teammates. Beyond 5, coordination overhead exceeds productivity gains
  • Separation of duties: Lead coordinates, teammates execute. Lead should NOT implement while teammates work
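As a rough model of the first principle (using the ~15k baseline figure above; the function name and work-token split are illustrative, not a real API):

```python
def team_cost(n_teammates: int, work_tokens: int, baseline: int = 15_000) -> int:
    """Estimate per-session token cost: a fixed baseline per teammate
    (MCP servers, CLAUDE.md, skill descriptions) plus the actual work."""
    return n_teammates * baseline + work_tokens

# 3-person optimization chain with ~80k of distributed work:
print(team_cost(3, 80_000))  # 125000
```

Because the baseline term scales linearly while coordination overhead compounds, each added teammate must contribute enough parallel work to beat their fixed cost.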

Team Configurations by Task Type

Single Optimization (Profile → Implement → Benchmark)

Recommended: 3-person team

Team composition:
- Optimizer (profile + implement)
- Benchmarker (VM benchmarking + variance validation)
- Reviewer (code quality, test coverage, upstream readiness)

Baseline cost: ~45k tokens/session (3 × 15k), plus work tokens

Workflow:
1. Lead: TaskCreate("Profile", owner: "optimizer")
2. Optimizer: Profiles, finds hotspots, updates MEMORY.md
3. Lead: Reviews, approves optimization approach
4. Optimizer: Implements on perf/* branch
5. Benchmarker: Validates on VM, records results
6. Reviewer: Final code review before upstream PR

Why this works:

  • Sequential dependency chain (profile → implement → benchmark → review)
  • Each teammate owns one phase end-to-end
  • Lead coordinates but doesn't implement (saves context)
  • Minimal parallelism overhead (tasks depend on previous completion)
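The dependency chain above can be modeled as a small task table (a sketch; `TaskCreate` semantics are paraphrased and the `ready` helper is illustrative):

```python
# Each task lists the tasks that must complete before it can start.
tasks = {
    "profile":   {"owner": "optimizer",   "blocked_by": []},
    "implement": {"owner": "optimizer",   "blocked_by": ["profile"]},
    "benchmark": {"owner": "benchmarker", "blocked_by": ["implement"]},
    "review":    {"owner": "reviewer",    "blocked_by": ["benchmark"]},
}

def ready(done: set) -> list:
    """Tasks whose blockers have all completed."""
    return [t for t, spec in tasks.items()
            if t not in done and all(b in done for b in spec["blocked_by"])]

print(ready(set()))        # ['profile']
print(ready({"profile"}))  # ['implement']
```

Note that at every step exactly one task is ready: this chain has no parallelism, which is why the lead coordinates and approves rather than adding more workers.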

Multiple Independent Optimizations (3+ Projects)

Recommended: 4-5 person team

Team composition:
- Lead (coordinator, decision-maker, synthesizer)
- Optimizer-A (typeagent optimizations)
- Optimizer-B (core-product optimizations)
- Optimizer-C (metaflow optimizations)
[Optional 5th]: Benchmarker-shared (validates all 3)

Baseline cost: ~60-75k tokens/session, plus work tokens

Workflow:
1. Lead: TaskCreate 3 independent profiling tasks, one per optimizer
2. Optimizer-A/B/C: Profile in parallel (no blocking)
3. Lead: Reviews all 3, decides which to pursue
4. Lead: Assign implementation tasks (can overlap)
5. Benchmarker: Validates all implementations sequentially or in batches

Why this works:

  • True parallelism (3 optimizers work on different projects simultaneously)
  • Lead context stays clean (only reviews, doesn't implement)
  • The benchmarker becomes a shared bottleneck, but that is acceptable (benchmark runs are sequential anyway)
  • Token savings: parallel work saves ~135k tokens vs. the lead doing all 3 sequentially (see the cost calculation below)

Cost calculation:

  • Lead-only approach: ~200k tokens (profile A + implement A + benchmark A + ... repeat 3x)
  • Team approach: ~65k tokens (3 × 15k baseline + ~20k of coordination and review work in the lead's context)
  • Savings: 135k tokens (~67%)
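A quick sanity check of the arithmetic above (integer math to avoid float rounding):

```python
lead_only = 200_000  # profile + implement + benchmark, repeated for 3 projects
team = 65_000        # team approach, work distributed across teammates
savings = lead_only - team

print(savings)                     # 135000
print(savings * 100 // lead_only)  # 67  (percent saved)
```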

New Feature Across Layers (Frontend/Backend/Tests)

Recommended: 4-person team

Team composition:
- Lead (architect, synthesizer)
- Frontend specialist
- Backend specialist
- Test/integration specialist

Baseline cost: ~60k tokens/session, plus work tokens

Workflow:
1. Lead: Define architecture, create 3 independent tasks
2. Frontend/Backend/Test: Work in parallel on their layers
3. Lead: Synthesizes, checks for integration issues
4. Lead: Creates follow-up tasks if coordination needed

Why this works:

  • Each teammate owns one layer (no conflicts)
  • Truly parallel (no blocking between front/back/tests)
  • Lead reviews integration points, doesn't implement
  • Clear boundaries (frontend owns UI, backend owns logic, test owns coverage)

Research/Architecture Exploration

Recommended: 3-5 teammates with competing hypotheses

Team composition (example):
- Lead (synthesizer)
- Performance specialist (explore caching angle)
- Architecture specialist (explore restructuring angle)
- Implementation specialist (explore code generation angle)
[Optional 5th]: Skeptic (devil's advocate, challenges assumptions)

Baseline cost: ~60-75k tokens/session, plus work tokens

Workflow:
1. Lead: Create 3 competing hypothesis tasks
2. Specialists: Explore independently, update MEMORY.md with findings
3. Lead: Reads all MEMORY.md files, synthesizes pros/cons
4. Lead: Decides which approach to pursue

Why this works:

  • Parallel exploration of competing ideas (fast decision-making)
  • Each teammate's MEMORY.md captures detailed reasoning
  • Lead sees full decision space without doing work themselves
  • Skeptic ensures assumptions are tested

Anti-Patterns (What NOT to Do)

Anti-Pattern 1: Lead + Teammates Both Implementing

Wrong:

Lead: Starts implementing feature-A on main branch
Teammate-1: Assigned to optimize feature-B
Teammate-2: Assigned to review changes
Result: Lead context bloated, misses teammate updates, tasks pile up

Fix: Lead coordinates, teammates execute. If the optimization is too simple for a team, use a single session + subagents instead.

Anti-Pattern 2: Sequential Tasks to Same Teammate

Wrong:

TaskCreate("Profile", owner: "optimizer")
TaskCreate("Implement", owner: "optimizer", blockedBy: "profile")
TaskCreate("Benchmark", owner: "optimizer", blockedBy: "implement")
Result: 3 tasks done sequentially, no parallelism benefit, wasted team overhead

Fix: Use a single session instead. A team adds ~45k tokens of baseline overhead for no parallelism benefit.

Anti-Pattern 3: Overly Large Team (8+ Teammates)

Wrong:

10 teammates all working independently
Result: Lead drowning in SendMessage traffic, task list unmanageable, coordination overhead kills productivity

Fix: Cap at 5. If the task is larger, break it into phases and run multiple teams sequentially.

Anti-Pattern 4: No Lead Approval Gate

Wrong:

Optimizer autonomously implements batch-size optimization
Result: Wrong approach, wastes benchmarking time/VM cost, leads to dead-end branch

Fix: Add TaskUpdate approval gate before implementation.

Token Budget by Configuration

| Team Size | Baseline (tokens) | Per-Session Work | Total | When to Use |
|---|---|---|---|---|
| 1 lead | 15k | 150-200k | 165-215k | Single optimization, sequential tasks |
| 1 lead + 1 teammate | 30k | 100-150k | 130-180k | Parallel exploration, 2 independent tasks |
| 3 teammates | 45k | 50-100k | 95-145k | Single optimization chain (prof→impl→bench) |
| 4 teammates | 60k | 60-120k | 120-180k | Multi-project or multi-layer work |
| 5 teammates | 75k | 80-150k | 155-225k | Large exploration, competing hypotheses |

Cost tracking rule: if the team approach costs more tokens than lead-only, switch back to a single session. Track per-session totals in .codeflash/{teammember}/{org}/{project}/metrics.tsv.
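The tracking step could be scripted along these lines (a sketch: the path pattern follows the .codeflash/{teammember}/{org}/{project}/ convention, but the column layout and function name are illustrative assumptions, not a fixed schema):

```python
import csv
from pathlib import Path

def record_session(member: str, org: str, project: str,
                   session_id: str, team_size: int, total_tokens: int) -> Path:
    """Append one session's cost to the per-project metrics.tsv.

    Column layout (illustrative): session_id, team_size, total_tokens.
    """
    path = Path(".codeflash") / member / org / project / "metrics.tsv"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", newline="") as f:
        csv.writer(f, delimiter="\t").writerow(
            [session_id, team_size, total_tokens])
    return path

record_session("krrt7", "python", "pip", "s001", 3, 125_000)
```

Comparing the logged totals across sessions makes the "team vs. lead-only" decision data-driven rather than a guess.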

Decision Tree

Use this tree to pick team size:

Do you have 1 independent task?
  YES → Use 1 lead (or lead + 1-2 subagents for isolation)
  NO → Go to next question

Do you have 2-3 independent tasks (can work in parallel)?
  YES → Use 2-3 teammates
  NO → Go to next question

Do you have 3+ independent tasks OR need competing hypotheses?
  YES → Use 4-5 teammates
  NO → Use single lead session

Is total team token budget < lead-only approach?
  NO → Reconsider. Use subagents instead, or reduce team size
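The tree above can be condensed into a small helper (a sketch; thresholds mirror the questions above, and the overlap at exactly three tasks is resolved toward the larger team):

```python
def pick_team_size(independent_tasks: int, competing_hypotheses: bool = False) -> str:
    """Map the decision tree to a recommended team configuration."""
    if independent_tasks >= 3 or competing_hypotheses:
        return "4-5 teammates"
    if independent_tasks == 2:
        return "2-3 teammates"
    return "1 lead (or lead + 1-2 subagents for isolation)"

print(pick_team_size(1))  # 1 lead (or lead + 1-2 subagents for isolation)
print(pick_team_size(2))  # 2-3 teammates
print(pick_team_size(4))  # 4-5 teammates
```

Whatever this returns, the final budget question still applies: if the team's total token cost exceeds the lead-only estimate, fall back to a single session.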

Implementation Checklist

When forming a team:

  • Confirm tasks are truly independent (use dependency tree)
  • Confirm team size ≤ 5 (unless major exception)
  • Define DELIVERABLES for each teammate (see failure-modes.md section 6)
  • Create .claude/agents/{teammate}.md for each with specialization template
  • Set up MEMORY.md scope in teammate YAML (memory: project)
  • Configure SessionStart hook to inject status.md + project MEMORY.md
  • Configure Stop hook to prevent lead from working while teammates work (see failure-modes.md section 6)
  • Plan cost tracking in metrics.tsv after session completes

Session Boundary: When to Restart Team

Restart (create new team) when:

  • Current team phase complete (all tasks in "completed" state)
  • Lead has consolidated MEMORY.md findings
  • Ready to move to next set of independent tasks

DON'T restart when:

  • Mid-task (partial work handed off)
  • Tasks still in "in_progress" state
  • Lead hasn't reviewed team outputs yet

See also: failure-modes.md (what breaks), agent-teams.md (Claude Code agent team docs)