Always Test on a Sample Before Full Batch: 18% Failure vs 3%

The Sample Testing Pattern

Before submitting thousands of documents to the Batch API, test your prompt on a diverse sample of 20-50 documents. Iterate until the success rate meets your target (e.g., 95% or higher). Then submit the full batch.

Approach	Failure rate	Total cost
No sample testing	18%	$740
Sample test first	3%	$519

Testing on a sample first cuts total cost by about 30% ($740 down to $519). The $8/month spent on sample testing saved $300/month in reprocessing over 6 months, roughly a 37x return.
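The arithmetic behind those two figures can be checked in a few lines. This is only a sketch that reuses the numbers quoted above; the variable names are illustrative and not part of any tool.

```python
# Figures taken from the table and the sentence above.
cost_without_sample = 740.0           # total cost with no sample testing ($)
cost_with_sample = 519.0              # total cost when sample testing first ($)
sample_cost_per_month = 8.0           # monthly spend on sample testing ($)
reprocessing_saved_per_month = 300.0  # monthly reprocessing cost avoided ($)

# Relative saving in total cost.
savings_fraction = (cost_without_sample - cost_with_sample) / cost_without_sample

# Return multiple: reprocessing avoided per dollar spent on the sample.
roi_multiple = reprocessing_saved_per_month / sample_cost_per_month

print(f"Cost savings: {savings_fraction:.1%}")  # 29.9%, quoted as 30%
print(f"Return multiple: {roi_multiple:.1f}x")  # 37.5x, quoted as 37x
```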

Sample Selection Matters

Select diverse samples covering edge cases: long documents, short documents, documents with missing fields, and documents in different formats. A sample made up only of simple, short documents misses the edge cases that cause batch failures.
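One way to get that diversity is to bucket documents by the traits listed above and draw from every bucket in turn. The sketch below is a minimal illustration under stated assumptions: the function name `select_diverse_sample`, the document keys (`text`, `format`, `missing_fields`), and the 500 and 5,000 character thresholds are all hypothetical and should be adapted to your own data.

```python
import random
from collections import defaultdict

def select_diverse_sample(docs, sample_size=30, seed=0):
    """Pick a sample that draws from every (length, format, missing-fields) bucket.

    Each doc is a dict with hypothetical keys: "text", "format", "missing_fields".
    """
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for doc in docs:
        n = len(doc["text"])
        # Assumed thresholds: under 500 chars is short, over 5000 is long.
        length = "short" if n < 500 else "long" if n > 5000 else "medium"
        buckets[(length, doc["format"], bool(doc["missing_fields"]))].append(doc)

    # Shuffle within each bucket, then take one document per bucket in
    # round-robin order so small edge-case buckets are never crowded out.
    pools = [rng.sample(group, len(group)) for group in buckets.values()]
    sample = []
    while len(sample) < sample_size and any(pools):
        for pool in pools:
            if pool and len(sample) < sample_size:
                sample.append(pool.pop())
    return sample
```

Plain random sampling would mostly return the common case; the round-robin draw trades statistical representativeness for coverage, which is the goal when hunting for failure modes.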

Test Every Batch, Even With Proven Prompts

New document collections may contain format variations not seen before. A prompt that works perfectly on invoices from Vendor A may fail on invoices from Vendor B. Sample test every batch, not just new prompts.

The Workflow

1. Select 20-50 diverse documents from the batch
2. Test with claude -p (sync, immediate feedback)
3. Iterate on failures: fix the prompt, retry the sample
4. Submit the full batch only when the sample success rate meets the target
5. Monitor batch results and feed new failure patterns back into the prompt
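The test-and-gate part of this workflow could be scripted roughly as follows. This is a sketch, not a definitive implementation: it assumes the `claude` CLI is installed and that the document can be piped to `claude -p` on stdin, and the helper names (`run_with_claude`, `evaluate_sample`, `ready_for_full_batch`) and the `is_valid` callback are hypothetical.

```python
import subprocess

def run_with_claude(prompt, document_text):
    """Run one document through `claude -p`, piping the document on stdin."""
    result = subprocess.run(
        ["claude", "-p", prompt],
        input=document_text,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "claude -p failed")
    return result.stdout

def evaluate_sample(prompt, sample_docs, is_valid, run=run_with_claude):
    """Return (success_rate, failures) for a prompt over the sample."""
    if not sample_docs:
        raise ValueError("sample_docs must not be empty")
    failures = []
    for doc in sample_docs:
        try:
            output = run(prompt, doc)
            ok = is_valid(output)
        except Exception as exc:  # a crashed run counts as a failure
            output, ok = repr(exc), False
        if not ok:
            failures.append((doc, output))
    success_rate = 1 - len(failures) / len(sample_docs)
    return success_rate, failures

def ready_for_full_batch(success_rate, target=0.95):
    """Gate: submit the full batch only when the sample meets the target."""
    return success_rate >= target
```

The `failures` list is what drives iteration: inspect each failing document and output, adjust the prompt, and call `evaluate_sample` again until `ready_for_full_batch` returns True. The `run` parameter is injectable so the loop can be exercised with a stub before spending anything on real calls.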

One-liner: Sample test every batch on 20-50 diverse documents before full submission; the $8/month investment prevents $300/month in reprocessing, a 37x return.