codeflash-internal/.codeflash/learnings.md

163 lines
6.3 KiB
Markdown
Raw Permalink Normal View History

# Cross-Session Learnings
## Personal Account Queries Use Bitmap OR Merge
Dashboard statistics and PR data queries for personal accounts (users without an organization) originally used a 3-way OR condition: `WHERE userId = $1 OR orgMember.userId = $1 OR orgAdmin.userId = $1`. PostgreSQL optimized this with a bitmap OR merge scan across multiple indexes, which is less efficient than individual index-backed scans.
**Solution:** Rewrite as UNION queries where each branch uses its own index-backed scan:
```sql
WITH filtered AS (
-- Branch 1: personal repos
SELECT id FROM repositories WHERE userId = $1
UNION
-- Branch 2: org member repos
SELECT r.id FROM repositories r JOIN org_members om ON ... WHERE om.userId = $1
UNION
-- Branch 3: org admin repos
SELECT r.id FROM repositories r JOIN org_admins oa ON ... WHERE oa.userId = $1
)
SELECT * FROM repositories WHERE id IN (SELECT id FROM filtered)
```
Each UNION branch hits a specific index cleanly instead of merging bitmaps.
## findFirst with Composite Index Lookup
Many Prisma queries used `findFirst` with a composite unique key (e.g., `{organizationId, userId}`) that could be replaced with `findUnique` for guaranteed single-row index seek.
**Evidence:** `members/action.ts`, `repositories/action.ts`, `members/data.ts`, `privacy-actions.ts` all had patterns like:
```ts
const member = await prisma.organization_members.findFirst({
where: { organizationId, userId }
})
```
When the schema has a unique constraint `@@unique([organizationId, userId])`, use:
```ts
const member = await prisma.organization_members.findUnique({
where: { organizationId_userId: { organizationId, userId } }
})
```
This guarantees Prisma uses the unique index for a single-row seek instead of a table scan with LIMIT 1.
## Permission Checks Load All Members
Several functions loaded all organization members into arrays, then used `Array.some()` or `Array.find()` for permission checks:
```ts
const members = await prisma.organizations.findFirst({...}).members
return members.some(m => m.userId === userId)
```
This fetches all N members (O(N) DB transfer), then scans the array (O(N) CPU).
**Solution:** Use indexed Prisma query that exits early:
```ts
const member = await prisma.organization_members.findUnique({
where: { organizationId_userId: { organizationId, userId } }
})
return member !== null
```
This is O(1) DB query with early exit. For multiple permission checks in parallel, use `Promise.all` with individual indexed queries instead of loading all members once.
## Select Narrowing Skips Large Columns
Many Prisma queries fetched all columns when only a few were consumed in the UI or API response. This is especially wasteful for:
- Large JSON columns: `messages`, `parsed_response`, `context`, `experiment_metadata`, `optimizations_raw`
- Text columns: `stack_trace`
- Unused metadata: `github_org_id`, `auto_add_github_members`, `retry_count`, `python_version`, `is_async`, etc.
**Solution:** Add explicit `select` clause listing only consumed fields:
```ts
const call = await prisma.llm_calls.findUnique({
where: { id },
select: {
id: true, model: true, status: true, // ... only fields used in page
// Omit: messages, parsed_response, context (large JSON)
}
})
```
**Evidence:** `llm-call/[id]/page.tsx` reduced from fetching all 30 llm_calls columns to 22 (skipped 3 large JSON blobs + metadata). `get-trace-data.ts` reduced optimization_features from 30+ columns to 12 consumed fields.
## CTE Phase 1: LIMIT Before JOIN
When paginating a query that joins large tables, restructure the CTE to identify the page of IDs first (with LIMIT), then JOIN only those IDs in phase 2.
**Before (inefficient):**
```sql
WITH data AS (
SELECT e.id, e.created_at, f.*, r.*
FROM optimization_events e
LEFT JOIN optimization_features f ON ...
LEFT JOIN repositories r ON ...
WHERE <filters>
ORDER BY e.created_at DESC
LIMIT 10
)
SELECT * FROM data
```
This creates a large intermediate JOIN set before applying LIMIT.
**After (efficient):**
```sql
WITH page_ids AS (
SELECT e.id
FROM optimization_events e
WHERE EXISTS (SELECT 1 FROM optimization_features f WHERE f.optimization_event_id = e.id)
AND <filters>
ORDER BY e.created_at DESC
LIMIT 10
),
data AS (
SELECT e.id, e.created_at, f.*, r.*
FROM optimization_events e
JOIN page_ids p ON e.id = p.id
LEFT JOIN optimization_features f ON ...
LEFT JOIN repositories r ON ...
)
SELECT * FROM data
```
Phase 1 uses EXISTS (index-only check, no full JOIN) to identify ~10 event IDs. Phase 2 joins only those 10 IDs with the large tables.
**Evidence:** `getOptimizationPRs` in `dashboard/action.ts` — both org and personal account paths now use this two-phase CTE structure.
## EXISTS Subqueries vs LEFT JOIN for Filtering
When filtering rows based on the existence of related data, using `LEFT JOIN` with a boolean check is often faster than `EXISTS` subqueries, especially when the subquery would be evaluated row-by-row for many candidate rows.
**Before (slow):**
```sql
SELECT id FROM candidates c
WHERE c.field IS NOT NULL
OR EXISTS (
SELECT 1 FROM related_table r
WHERE r.key = c.key AND r.field IS NOT NULL
)
```
This evaluates the EXISTS subquery once per row in candidates. If there are 10,000 candidates, that's 10,000 subquery executions.
**After (fast):**
```sql
SELECT c.id, r.field IS NOT NULL AS has_related_field
FROM candidates c
LEFT JOIN related_table r ON c.key = r.key
WHERE c.field IS NOT NULL OR r.field IS NOT NULL
```
The LEFT JOIN is evaluated once with a hash join or index seek, then the filter is applied. Much more efficient for large candidate sets.
**Evidence:** `getOptimizationPRs` in `dashboard/action.ts` — replaced EXISTS checks for `optimization_features.pull_request` with LEFT JOIN in both count and data queries, for both org and personal account paths. Expected 921ms + 1435ms → <800ms combined.
## Pre-existing Failures Masked by Test Runner
Found 3 test failures and 5 type errors that were pre-existing but not caught in previous sessions:
- Missing `_count` field in `getRepositoryById` test mock (test runner didn't fail until accessed)
- Missing `id` and `created_at` in optimization_errors select clause (TypeScript TS2339 errors when accessed in UI)
**Lesson:** Always run full test suite AND type check (`tsc --noEmit`) after each optimization session, even if individual experiments passed their guard checks.