K3.5.3 Task 3.5

Three Rounds of Rework Because Nobody Asked About Deployment Topology

“Add rate limiting to our API.” Claude immediately implements a global token bucket. Round 1: developer needed per-user limits. Round 2: needed tiered endpoint limits. Round 3: needed distributed rate limiting for multi-instance deployment. Each round required significant code restructuring.

All three requirements would have surfaced from one question: “What’s the scope — global or per-user? Any tiering? Single-instance or distributed?”
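As a rough illustration of how the answers to that one question shape the design, here is a minimal sketch of a token-bucket limiter whose parameters map onto scope and tiering. The class names, the tier names, and the injected clock are assumptions made for this example, not part of the original scenario.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    updated_at: float

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


@dataclass
class RateLimiter:
    # Answer to "any tiering?": (capacity, refill_per_sec) per endpoint tier.
    tier_limits: Dict[str, Tuple[float, float]]
    # Answer to "global or per-user?".
    per_user: bool = True
    # Injected clock so the behavior can be tested without sleeping.
    clock: Callable[[], float] = time.monotonic
    # Answer to "single-instance or distributed?": this in-process dict only
    # works for a single instance; a multi-instance deployment would need a
    # shared store instead.
    _buckets: Dict[Tuple[str, str], TokenBucket] = field(default_factory=dict)

    def allow(self, user_id: str, tier: str) -> bool:
        capacity, refill = self.tier_limits[tier]
        key = (user_id if self.per_user else "*", tier)
        now = self.clock()
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(capacity, refill, capacity, now)
            self._buckets[key] = bucket
        return bucket.allow(now)
```

With per_user=True each user gets an independent bucket per tier; with per_user=False all callers share one bucket per tier. The distributed case is deliberately left as a comment, because it changes where state lives rather than how a bucket refills.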

Interview mode means Claude asks clarifying questions before implementing. It is not a universal practice — it is a tool for ambiguous requests where the design space is large and wrong assumptions are expensive.

When to Interview

Request is brief and ambiguous. “Add caching to our API endpoints” does not specify invalidation strategy, TTL values, or which endpoints to cache. Each missing detail could lead to a fundamentally different implementation.

Multiple valid approaches with different implications. In a pipeline that processes papers, for example, PDF parsing, NLP extraction, database storage, and deduplication each have multiple valid approaches. The right choice depends on paper volume, required metadata fields, and output format, which only the developer knows.

Financial or safety-critical systems. “When a payment fails, retry it automatically.” Without asking about retry limits, backoff strategy, and idempotency handling, Claude might implement unlimited immediate retries on a payment system — causing duplicate charges, triggering fraud detection, and exhausting rate limits.
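To make the three questions concrete, the sketch below shows one possible shape for a retry helper once retry limits, backoff, and idempotency have been answered. The exception classes, the charge callable, and the default values are hypothetical; a real payment processor defines its own error types and idempotency mechanism.

```python
import time
import uuid
from typing import Callable, Optional


class RetryableError(Exception):
    """Hypothetical: a failure the business has classified as safe to retry."""


class PermanentError(Exception):
    """Hypothetical: a failure that must not be retried (e.g. card declined)."""


def charge_with_retry(
    charge: Callable[[str], str],
    max_attempts: int = 3,          # assumed retry limit
    base_delay_sec: float = 1.0,    # assumed backoff base
    sleep: Callable[[float], None] = time.sleep,
    idempotency_key: Optional[str] = None,
) -> str:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # The same key is sent on every attempt so the processor can deduplicate.
    key = idempotency_key or str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            return charge(key)
        except RetryableError:
            if attempt == max_attempts:
                raise
            # Exponential backoff: base, 2x base, 4x base, ...
            sleep(base_delay_sec * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A PermanentError is not caught, so it propagates on the first attempt. Which failures count as retryable is a business decision the helper cannot infer, which is why it is modeled as two separate exception types here.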

Complex multi-component tasks. A data pipeline with 4 components and multiple valid approaches per component. Wrong architectural choices would require complete rewrites.

When Not to Interview

Task is clear and well-specified. A bug fix with a stack trace pointing to line 42. A well-specified function with clear algorithm and single target file. No questions needed.

Request comes with complete specifications. The developer has already specified invalidation strategy, TTL, and endpoints. Asking questions they have already answered wastes their time.

A rigid “always ask 5 questions” policy penalizes clear requests with unnecessary friction. The trigger for interview mode is ambiguity and design-space size, not task category.

The Data

A team compared direct implementation versus interview-first across 30 features over one month:

Metric                  Direct implementation   Interview-first
Avg revision rounds     3.2                     1.4
Avg time per feature    4.5 hours               2.8 hours
Architectural rework    40% of features         8% of features

Interview-first reduced average revision rounds by 56%, average time per feature by 38%, and the share of features needing architectural rework by 80% (from 40% to 8%). The upfront investment in questions more than paid for itself.
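The percentages are relative reductions computed from the figures in the comparison. A short check, using only those numbers (the function and variable names are chosen for this example):

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# (direct implementation, interview-first) for each metric in the comparison.
metrics = {
    "revision rounds": (3.2, 1.4),
    "hours per feature": (4.5, 2.8),
    "features with architectural rework (%)": (40.0, 8.0),
}

for name, (before, after) in metrics.items():
    print(f"{name}: {relative_reduction(before, after):.0f}% lower")
# revision rounds: 56% lower
# hours per feature: 38% lower
# features with architectural rework (%): 80% lower
```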

Ask the Right Questions

The wrong questions: “What language?” “What framework?” “What testing library?” These are answerable by reading the codebase. Claude should read the code context first.

The right questions surface what the developer has not considered:

  • Failure modes — What happens when the cache is unavailable? What if the payment processor returns an ambiguous error?
  • Scale constraints — What volume of notifications? Single-instance or distributed deployment?
  • Edge cases — What if a customer message contains zero order IDs? What about near-miss patterns like “ORD-XXXX” with 4 characters?
  • Business logic — Which endpoints need caching? What constitutes a retryable payment failure versus a permanent one?
  • Concurrency — Will multiple processes access the rate limiter simultaneously?

The strategy: read the codebase for technical context, then ask about considerations the developer may not have anticipated. Good interview questions reveal requirements the developer did not know they had.
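The order-ID edge cases can be sketched as follows. The valid format used here ("ORD-" followed by exactly 6 digits) is an assumption for illustration only; the real format is exactly the kind of detail an interview question should pin down.

```python
import re
from typing import List, NamedTuple

# Anything that looks like an order ID, valid or not.
CANDIDATE = re.compile(r"\bORD-([A-Za-z0-9]+)\b")
# Hypothetical rule: a valid suffix is exactly 6 digits.
VALID_SUFFIX = re.compile(r"\d{6}")


class Extraction(NamedTuple):
    valid: List[str]
    near_misses: List[str]


def extract_order_ids(message: str) -> Extraction:
    valid: List[str] = []
    near_misses: List[str] = []
    for match in CANDIDATE.finditer(message):
        if VALID_SUFFIX.fullmatch(match.group(1)):
            valid.append(match.group(0))
        else:
            # Near-miss: right prefix, wrong suffix (e.g. "ORD-1234").
            near_misses.append(match.group(0))
    return Extraction(valid, near_misses)


result = extract_order_ids("Where is ORD-123456? Also ORD-1234 never arrived.")
print(result.valid)        # ['ORD-123456']
print(result.near_misses)  # ['ORD-1234']
```

Returning near-misses separately, instead of silently dropping them, leaves the zero-ID and malformed-ID cases visible to whatever logic the developer decides on.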

Anti-Pattern: Implement With Assumed Defaults

“Build a notification system.” Claude immediately generates email delivery, 3 retries, 2 priority levels, no user preferences.

Every assumption might be wrong. The team might need SMS and push. The retry policy might need different intervals per channel. Priority levels might need to match the existing ticket system. User preference handling might be legally required.

TODO comments (“// configure retry policy”) shift critical decisions to code review time. For production systems, these decisions should be made before code exists, not discovered as review comments.
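One way to keep assumed defaults from slipping in is to record each decision as explicitly unanswered until the developer supplies it. The sketch below is a hypothetical illustration; the field names mirror the assumptions listed above and are not taken from any real notification library.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class NotificationRequirements:
    # None means "not yet decided", never "use a default".
    channels: Optional[Tuple[str, ...]] = None
    max_retries: Optional[int] = None
    priority_levels: Optional[Tuple[str, ...]] = None
    honor_user_preferences: Optional[bool] = None


def open_questions(req: NotificationRequirements) -> List[str]:
    """Names of decisions that still need an answer before implementation."""
    return [f.name for f in fields(req) if getattr(req, f.name) is None]


req = NotificationRequirements(channels=("email", "sms"))
print(open_questions(req))
# ['max_retries', 'priority_levels', 'honor_user_preferences']
```

An empty list from open_questions would be the signal that implementation can start; anything else is a question to ask rather than a TODO to leave behind.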

Interview Is Not Iteration

Interview mode gathers requirements before the first line of code. Iteration refines the implementation after code exists. They serve different phases:

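  • Interview prevents wrong architectural direction (in the comparison above, rework fell from 40% to 8% of features)
  • Iteration refines behavior within the correct architecture (with the architecture settled first, revision rounds averaged 1.4 instead of 3.2)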

Skipping interview and going straight to iteration means iterating on the wrong foundation — each revision discovers a requirement that restructures everything.


One-liner: When the request is ambiguous and the design space is large, ask questions before writing code. In the team comparison, interview-first averaged 2.8 hours per feature against 4.5 hours for implement-and-rework.