## Summary

- **Fix CI build failure**: Auth0Client crashes during Next.js prerendering when env vars aren't set. Returns a no-op stub (`getSession → null`) when domain is missing — semantically correct for static generation
- **Lazy-load markdown libs (~260kb)**: ReactMarkdown, remarkGfm, and react-syntax-highlighter were eagerly imported in monaco-diff-viewer but only rendered when the user expands "Generated Tests". Extracted into a dynamic component
- **Parallelize repo detail query**: `getRepositoryById` ran the activity count sequentially after the repo lookup. Since `repoId` is already available, all three queries now run in parallel

## Test plan

- [ ] CI `build` check passes (was failing since #2598)
- [ ] Trace page still renders generated tests correctly when expanded
- [ ] Repository detail page loads correctly with activity status
## Cross-Session Learnings

### Personal Account Queries Use Bitmap OR Merge
Dashboard statistics and PR data queries for personal accounts (users without an organization) originally used a 3-way OR condition: `WHERE userId = $1 OR orgMember.userId = $1 OR orgAdmin.userId = $1`. PostgreSQL planned this as a bitmap OR merge across multiple indexes, which is less efficient than individual index-backed scans.
Solution: Rewrite as UNION queries where each branch uses its own index-backed scan:
```sql
WITH filtered AS (
  -- Branch 1: personal repos
  SELECT id FROM repositories WHERE userId = $1
  UNION
  -- Branch 2: org member repos
  SELECT r.id FROM repositories r JOIN org_members om ON ... WHERE om.userId = $1
  UNION
  -- Branch 3: org admin repos
  SELECT r.id FROM repositories r JOIN org_admins oa ON ... WHERE oa.userId = $1
)
SELECT * FROM repositories WHERE id IN (SELECT id FROM filtered)
```
Each UNION branch hits a specific index cleanly instead of merging bitmaps.
### `findFirst` with Composite Index Lookup

Many Prisma queries used `findFirst` with a composite unique key (e.g., `{ organizationId, userId }`) that could be replaced with `findUnique` for a guaranteed single-row index seek.

Evidence: `members/action.ts`, `repositories/action.ts`, `members/data.ts`, and `privacy-actions.ts` all had patterns like:
```ts
const member = await prisma.organization_members.findFirst({
  where: { organizationId, userId },
})
```
When the schema has a unique constraint `@@unique([organizationId, userId])`, use:
```ts
const member = await prisma.organization_members.findUnique({
  where: { organizationId_userId: { organizationId, userId } },
})
```
This guarantees Prisma uses the unique index for a single-row seek instead of a scan with `LIMIT 1`.
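The `organizationId_userId` compound-key name is derived by Prisma from the `@@unique` definition. A minimal schema sketch (field types and model shape assumed, not from the codebase):

```prisma
model organization_members {
  organizationId String
  userId         String

  // Generates the `organizationId_userId` compound key accepted by findUnique
  @@unique([organizationId, userId])
}
```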
### Permission Checks Load All Members

Several functions loaded all organization members into arrays, then used `Array.some()` or `Array.find()` for permission checks:
```ts
const org = await prisma.organizations.findFirst({ ..., include: { members: true } })
return (org?.members ?? []).some((m) => m.userId === userId)
```
This fetches all N members (O(N) DB transfer), then scans the array (O(N) CPU).
Solution: Use indexed Prisma query that exits early:
```ts
const member = await prisma.organization_members.findUnique({
  where: { organizationId_userId: { organizationId, userId } },
})
return member !== null
```
This is a single O(1) indexed DB query with early exit. For multiple permission checks, use `Promise.all` over individual indexed queries instead of loading all members once.
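The parallel-checks pattern can be sketched as a self-contained TypeScript snippet, with an in-memory Map standing in for the `organization_members` unique index (names like `isMember` and `checkMany` are illustrative, not from the codebase):

```typescript
// In-memory stand-in for the organization_members unique index.
const membersIndex = new Map<string, { role: string }>([
  ["org1:alice", { role: "admin" }],
  ["org1:bob", { role: "member" }],
]);

// Stand-in for prisma.organization_members.findUnique({ where: { organizationId_userId: ... } })
async function findUniqueMember(organizationId: string, userId: string) {
  return membersIndex.get(`${organizationId}:${userId}`) ?? null;
}

// One indexed lookup per check; no member list is ever materialized.
async function isMember(organizationId: string, userId: string): Promise<boolean> {
  return (await findUniqueMember(organizationId, userId)) !== null;
}

// Independent checks run concurrently with Promise.all.
async function checkMany(organizationId: string, userIds: string[]): Promise<boolean[]> {
  return Promise.all(userIds.map((u) => isMember(organizationId, u)));
}
```

Each check resolves independently, so adding a user to the batch adds one indexed lookup rather than widening a single all-members fetch.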
### Select Narrowing Skips Large Columns

Many Prisma queries fetched all columns when only a few were consumed in the UI or API response. This is especially wasteful for:

- Large JSON columns: `messages`, `parsed_response`, `context`, `experiment_metadata`, `optimizations_raw`
- Text columns: `stack_trace`
- Unused metadata: `github_org_id`, `auto_add_github_members`, `retry_count`, `python_version`, `is_async`, etc.
Solution: Add an explicit `select` clause listing only the consumed fields:

```ts
const call = await prisma.llm_calls.findUnique({
  where: { id },
  select: {
    id: true, model: true, status: true, // ... only fields used in the page
    // Omit: messages, parsed_response, context (large JSON)
  },
})
```
Evidence: `llm-call/[id]/page.tsx` reduced from fetching all 30 `llm_calls` columns to 22 (skipping 3 large JSON blobs plus unused metadata). `get-trace-data.ts` reduced `optimization_features` from 30+ columns to the 12 consumed fields.
### CTE Phase 1: LIMIT Before JOIN
When paginating a query that joins large tables, restructure the CTE to identify the page of IDs first (with LIMIT), then JOIN only those IDs in phase 2.
Before (inefficient):
```sql
WITH data AS (
  SELECT e.id, e.created_at, f.*, r.*
  FROM optimization_events e
  LEFT JOIN optimization_features f ON ...
  LEFT JOIN repositories r ON ...
  WHERE <filters>
  ORDER BY e.created_at DESC
  LIMIT 10
)
SELECT * FROM data
```
This creates a large intermediate JOIN set before applying LIMIT.
After (efficient):
```sql
WITH page_ids AS (
  SELECT e.id
  FROM optimization_events e
  WHERE EXISTS (SELECT 1 FROM optimization_features f WHERE f.optimization_event_id = e.id)
    AND <filters>
  ORDER BY e.created_at DESC
  LIMIT 10
),
data AS (
  SELECT e.id, e.created_at, f.*, r.*
  FROM optimization_events e
  JOIN page_ids p ON e.id = p.id
  LEFT JOIN optimization_features f ON ...
  LEFT JOIN repositories r ON ...
)
SELECT * FROM data
```
Phase 1 uses EXISTS (index-only check, no full JOIN) to identify ~10 event IDs. Phase 2 joins only those 10 IDs with the large tables.
Evidence: `getOptimizationPRs` in `dashboard/action.ts` — both the org and personal account paths now use this two-phase CTE structure.
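To make the two phases concrete, here is a self-contained TypeScript sketch that mimics the plan with in-memory arrays (table shapes and the page size of 10 follow the query above; the data is made up):

```typescript
type OptEvent = { id: number; created_at: number };
type OptFeature = { optimization_event_id: number; pull_request: string | null };

const events: OptEvent[] = [
  { id: 1, created_at: 100 },
  { id: 2, created_at: 200 }, // no feature row -> filtered out by the EXISTS check
  { id: 3, created_at: 300 },
];
const features: OptFeature[] = [
  { optimization_event_id: 1, pull_request: "#10" },
  { optimization_event_id: 3, pull_request: "#12" },
];

// Phase 1: narrow, index-style work — EXISTS check, ORDER BY, LIMIT — yields only the page of IDs.
const hasFeature = new Set(features.map((f) => f.optimization_event_id));
const pageIds = events
  .filter((e) => hasFeature.has(e.id))          // WHERE EXISTS (...)
  .sort((a, b) => b.created_at - a.created_at)  // ORDER BY e.created_at DESC
  .slice(0, 10)                                 // LIMIT 10
  .map((e) => e.id);

// Phase 2: join the wide tables against only those IDs.
const page = pageIds.map((id) => ({
  event: events.find((e) => e.id === id)!,
  feature: features.find((f) => f.optimization_event_id === id) ?? null,
}));
```

The wide join in phase 2 touches at most 10 rows regardless of how many events match the filters.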
### EXISTS Subqueries vs LEFT JOIN for Filtering
When filtering rows on the existence of related data, a LEFT JOIN with a boolean check is often faster than an EXISTS subquery, especially when the EXISTS sits under an OR: the planner then cannot convert it to a semi-join and must evaluate the subquery row by row across many candidate rows.
Before (slow):
```sql
SELECT id FROM candidates c
WHERE c.field IS NOT NULL
   OR EXISTS (
     SELECT 1 FROM related_table r
     WHERE r.key = c.key AND r.field IS NOT NULL
   )
```
This evaluates the EXISTS subquery once per row in candidates. If there are 10,000 candidates, that's 10,000 subquery executions.
After (fast):
```sql
SELECT c.id, r.field IS NOT NULL AS has_related_field
FROM candidates c
LEFT JOIN related_table r ON c.key = r.key
WHERE c.field IS NOT NULL OR r.field IS NOT NULL
```
The LEFT JOIN is evaluated once with a hash join or index seek, then the filter is applied. Much more efficient for large candidate sets.
Evidence: `getOptimizationPRs` in `dashboard/action.ts` — replaced the EXISTS checks on `optimization_features.pull_request` with a LEFT JOIN in both the count and data queries, for both org and personal account paths. Expected 921ms + 1435ms → <800ms combined.
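The plan-shape difference can be illustrated with a self-contained TypeScript sketch: the correlated EXISTS rescans `related_table` per candidate, while the join builds one hash lookup and probes it (data is made up; both paths must return the same rows):

```typescript
type Row = { key: string; field: string | null };

const candidates: Row[] = [
  { key: "a", field: "x" },  // kept: its own field is set
  { key: "b", field: null }, // kept: related row has the field
  { key: "c", field: null }, // dropped: no match anywhere
];
const related: Row[] = [{ key: "b", field: "y" }];

// Correlated EXISTS: related is scanned once per candidate row — O(N × M).
const viaExists = candidates.filter(
  (c) => c.field !== null || related.some((r) => r.key === c.key && r.field !== null)
);

// Hash join: build the lookup once — O(M) — then probe per candidate — O(N).
const keysWithField = new Set(
  related.filter((r) => r.field !== null).map((r) => r.key)
);
const viaJoin = candidates.filter(
  (c) => c.field !== null || keysWithField.has(c.key)
);
```

Both filters keep rows `a` and `b`; only the cost model differs.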
### Pre-existing Failures Masked by Test Runner
Found 3 test failures and 5 type errors that were pre-existing but not caught in previous sessions:
- Missing `_count` field in the `getRepositoryById` test mock (the test runner didn't fail until it was accessed)
- Missing `id` and `created_at` in the `optimization_errors` select clause (TypeScript TS2339 errors when accessed in the UI)
Lesson: Always run the full test suite AND a type check (`tsc --noEmit`) after each optimization session, even if individual experiments passed their guard checks.