K5.4.2 Task 5.4

The Scratchpad: Writing Findings Down Before the Context Forgets Them

Context degrades. That is established. The question is what to do about it. The answer is disarmingly simple: write things down.

A scratchpad file — a plain file the agent reads and writes during a session — externalizes findings from the conversation context. Once a fact is in a file, it is immune to context degradation. The agent can re-read the file at any point and recover the exact specifics it recorded, regardless of how much context has accumulated since.

Why It Works: Context Is Volatile, Files Are Not

The conversation context is a fixed-capacity buffer. As new information enters, older information gets compressed or displaced. A scratchpad file sits outside that buffer entirely. It does not compete for context space, does not get summarized, and does not lose precision over time.

Measured impact on a research exploration agent:

MetricWithout scratchpadWith scratchpad
Specificity at 30 min75%88%
Specificity at 60 min28%79%
Specificity drop47 points9 points
Final report accuracy64%91%

The scratchpad agent spends about 8% of each turn on file I/O. That is a trivial cost for a 27-point accuracy improvement on the final output.

What to Record: Specifics, Not Impressions

The scratchpad’s value comes entirely from what goes into it. A vague entry reproduces the exact problem the scratchpad should prevent.

Bad entry:

Found relevant paper on AI healthcare adoption — good data

Good entry:

Chen et al. (2025) "AI Adoption in Rural Healthcare" — 47% reduction in
diagnostic delay across 12 clinics, n=3,400, published in JAMA Digital Health

The same principle applies to code exploration. auth module looks standard is worthless. src/auth/jwt.ts:12 → verifyJWT(token: string): Promise<Claims> is exactly what the agent needs 45 minutes later when the context has degraded that discovery into “the authentication middleware handles this.”

The rule is straightforward: if the entry could apply to any project, it is too vague. Scratchpad entries should contain file paths with line numbers, function signatures, exact values, specific identifiers — the details that context degradation erodes first.

Structure Matters: Organize for Retrieval

A flat list of findings becomes hard to navigate as the exploration progresses. Two structural patterns work well:

Functional-area sections for codebase exploration:

## Authentication
- src/auth/jwt.ts:12 → verifyJWT() validates token signature + expiry
- src/auth/middleware.ts:45 → authGuard() wraps route handlers

## Database
- src/db/pool.ts:8 → connection pool max=20, timeout=5000ms

Per-issue rows for multi-issue tracking (customer support, bug triage):

| Issue | Order ID | Amount | Status     | Resolution       |
|-------|----------|--------|------------|------------------|
| 1     | ORD-4421 | $89.99 | Refunded   | Policy exception |
| 2     | ORD-4455 | $34.50 | In dispute | Awaiting manager |

The structured format prevents a specific failure mode: cross-issue confusion. When a support agent handles 3 orders and 2 policy exceptions, a narrative summary blends details across issues. A tabular format keeps each issue’s data in its own row, so the agent never applies the wrong refund amount to the wrong order.

Update Continuously, Not Once

A scratchpad written once at the start of a session and never updated becomes stale. This is measurable:

One team deployed a write-once scratchpad for their support agent. Overall error rate dropped from 18% to 7% — a significant improvement. But 85% of the remaining errors occurred on facts that changed mid-conversation: a customer correcting an order number, adding a new issue, updating a shipping preference.

The fix is obvious: update the scratchpad whenever key facts change. Not just at creation, not just at regular intervals — whenever the agent encounters information that modifies or adds to what is already recorded.

For long-running sessions (multi-hour codebase audits, extended research), the update cycle is:

  1. Record findings as they are discovered
  2. Update entries when facts change
  3. Re-read the scratchpad when referencing earlier work

This is not optional. A stale scratchpad is worse than no scratchpad because it gives the agent false confidence in outdated data.

The Scratchpad + Compact Cycle

For sessions that fill the context window repeatedly (a 500-file audit that hits 90% capacity every 30-45 minutes), the scratchpad combines with /compact in a repeating cycle:

  1. Write findings to the scratchpad continuously during exploration
  2. Compact when context approaches capacity — this frees space but is lossy
  3. Re-read the scratchpad after compaction to restore specific details that the lossy compression may have removed
  4. Continue exploring with restored specifics and freed context space

This cycle can repeat throughout a multi-hour session. The key insight: compaction without a scratchpad is risky because /compact may summarize away the exact file paths and line numbers the agent needs. The scratchpad ensures those details survive compaction intact.

The anti-pattern is writing to the scratchpad only at the end, just before generating the final report. By that point, the agent’s findings have already degraded. It can only record the vague references that remain after hours of context degradation — the scratchpad captures what the agent remembers, and by then, it does not remember much.

Cross-Domain Application: Test-Driven Iteration

A developer using test-driven iteration (write tests, implement, share failures, fix, repeat) generates substantial context per round. After 15 rounds, design constraints established in rounds 1-5 get compressed as test output and fix code accumulate.

The result: fixes in later rounds conflict with early constraints. Not because the model’s reasoning degrades, but because the early constraints are no longer in the context with enough specificity to guide decisions.

A scratchpad file preserving key constraints (“max 3 retries per endpoint — agreed in round 2”, “no synchronous DB calls in request path — round 4”) survives the context accumulation and keeps all iterations consistent. This is where D3 session management (the test-driven workflow) meets D5 context preservation (the scratchpad) — each technique is less effective alone.

What the Scratchpad Is Not

It is not a transcript copy. Dumping the full conversation to a file is pointless — the agent already has the conversation in context. The value of a scratchpad is extraction and organization, not duplication.

It is not a format question. JSON vs Markdown vs plain text does not matter if the content is vague. { "note": "good data" } is no better than good data in a text file.

It is not a substitute for managing context. The scratchpad complements compaction and sub-agent isolation; it does not replace them. An agent that writes to a scratchpad but never compacts will still fill its context window.


One-liner: Write specific findings to a file as you discover them, update the file when facts change, and re-read it after compaction — files do not forget, context does.