# Cross-Session Learnings
Non-obvious technical discoveries about this codebase. Read at session start to avoid repeating dead ends.
## How to use this file
- **Domain agents**: Add entries after discovering something non-obvious, whether the resulting change was kept or discarded.
- **Router agent**: Read this file at every session start and include it in the domain agent's launch prompt.
- **Entries should be** specific, technical, and evidence-based, not opinions or preferences.
- **Remove entries** when they become outdated (e.g., a library version changes and the workaround no longer applies).
## Template
```markdown
## <Short descriptive title>
**Domain:** memory | cpu | async | structure
**Discovered:** <date>
<1-3 sentences with the specific technical finding. Include evidence: profiler output, version numbers, error messages.>
**Implication:** <What this means for future optimization attempts. What to do or avoid.>
```
## Example entries
```markdown
## pytest-memray measures per-test peak only
**Domain:** memory
**Discovered:** 2026-03-17
pytest-memray's `@pytest.mark.limit_memory` and `--memray` flag measure memory allocated during the test function body only. Import-time allocations (module globals, C extension init) are NOT counted. Verified: 40 MiB english_words list invisible in pytest-memray but visible in `memray run`.
**Implication:** Import-time memory optimizations will show zero improvement in pytest-memray benchmarks. Use `memray run` on the full process to capture import-time.
## Paddle inference engine allocates in 500 MiB arena chunks
**Domain:** memory
**Discovered:** 2026-03-19
PaddleOCR's C++ inference engine allocates memory in 500 MiB arena chunks via `auto_growth` strategy. These are native memory pools, not proportional to data size. `config.memory_pool_init_size_mb()` is read-only (100 MiB default, but pool grows to 500 MiB). `enable_ort_optimization()` requires Paddle compiled with ONNX Runtime support. `rec_batch_num` controls the number of arena chunks allocated during recognition (6 -> 4 chunks, 1 -> 1 chunk).
**Implication:** Cannot cap Paddle arena size directly. Only lever is `rec_batch_num` to reduce number of chunks. Don't waste time on arena configuration APIs.
```
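The pytest-memray entry above can be illustrated with a self-contained sketch using the stdlib `tracemalloc` as a stand-in (the variable names are made up for illustration). An allocation that happens before tracing starts, like a module global created at import time, never appears in the traced peak:

```python
import tracemalloc

# Simulates an import-time allocation (e.g. a module-level word list):
# it exists before tracing starts, so it is never counted.
import_time_blob = [0] * 1_000_000  # roughly 8 MiB of list storage

def test_body():
    # Simulates allocations made inside a test function body.
    local = [0] * 100_000  # roughly 0.8 MiB
    return len(local)

tracemalloc.start()  # tracing begins AFTER the big allocation, like pytest-memray
test_body()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# peak reflects only test_body()'s allocations; the 8 MiB blob is invisible,
# mirroring how pytest-memray misses import-time memory.
print(peak < 4_000_000)
```

To capture import-time memory for real, run memray on the whole process (e.g. `memray run -m pytest ...`) rather than relying on per-test measurement, as the entry recommends.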
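For the Paddle entry, a minimal sketch of the one available lever. This assumes `paddleocr` is installed and that its constructor accepts `rec_batch_num` as described above; the import is deferred so the snippet stays inert without the package:

```python
def build_low_memory_ocr():
    # Deferred import: paddleocr is only needed once the engine is built.
    from paddleocr import PaddleOCR
    # rec_batch_num controls how many arena chunks recognition allocates
    # (per the entry above: 6 -> 4 chunks, 1 -> 1 chunk). It is the only
    # lever; the arena-size config APIs are effectively read-only.
    return PaddleOCR(rec_batch_num=1)
```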