K4.4.4 Task 4.4

Schema Says Valid. Line Items Don't Sum to Total. Both Are True.

tool_use guarantees structural correctness: valid JSON, required fields present, correct types. It does NOT guarantee semantic correctness. A schema-valid extraction can have line items that do not sum to the total, an email without @, or a start date after the end date.
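
To make the distinction concrete, here is a minimal sketch of an extraction that passes a structural check and is still wrong. The field names (vendor_email, line_items, total, and so on) and the is_structurally_valid helper are hypothetical. The helper is a stdlib-only stand-in for schema enforcement, not the mechanism tool_use itself uses.

```python
# Hypothetical extraction result; all field names are illustrative assumptions.
extraction = {
    "vendor_email": "billing.example.com",   # a string, but has no "@"
    "start_date": "2024-03-01",
    "end_date": "2024-02-01",                # earlier than start_date
    "line_items": [
        {"description": "Widget", "amount": 40.0},
        {"description": "Gadget", "amount": 35.0},
    ],
    "total": 100.0,                          # line items sum to 75.0
}

# Stand-in for a schema: each required field and its expected type.
REQUIRED_TYPES = {
    "vendor_email": str,
    "start_date": str,
    "end_date": str,
    "line_items": list,
    "total": float,
}


def is_structurally_valid(record: dict) -> bool:
    """Check only that required fields exist and have the expected types."""
    return all(
        key in record and isinstance(record[key], expected)
        for key, expected in REQUIRED_TYPES.items()
    )


structurally_ok = is_structurally_valid(extraction)
items_sum = sum(item["amount"] for item in extraction["line_items"])

print(structurally_ok)                    # structure passes
print(items_sum == extraction["total"])   # the sum check does not
```

Every field is present and correctly typed, so the structural check has nothing to object to. The three problems are all about what the values mean.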

The Gap

In one pipeline, tool_use surfaced 0 errors at generation time, because it prevents structural problems by design rather than reporting them. A separate semantic validation layer found logical errors in 20% of extractions:

  • 12% cross-field inconsistency (sums wrong, dates contradictory)
  • 5% values in wrong fields (author name in version field)
  • 3% fabricated data

Without semantic validation, that entire 20% would have reached downstream systems undetected.

Two-Layer Validation

Layer                        What it catches                                      Mechanism
Structural (tool_use)        Missing fields, wrong types, malformed JSON          JSON Schema at generation time
Semantic (application code)  Sum mismatches, date ordering, value-in-wrong-field  Business logic after generation

JSON Schema cannot express “line items must sum to total” or “start_date must precede end_date.” These checks require application-layer validation code.
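
A minimal sketch of what such application-layer checks might look like, run after generation on output that already passed the schema. The semantic_errors function and the record layout (line_items with amount, total, start_date, end_date as ISO date strings) are assumptions for illustration, not part of any library.

```python
from datetime import date
from decimal import Decimal


def semantic_errors(record: dict) -> list[str]:
    """Return a list of cross-field problems; an empty list means none found."""
    errors = []

    # Sum check: compare via Decimal so float rounding does not cause false alarms.
    items_sum = sum(
        (Decimal(str(item["amount"])) for item in record["line_items"]),
        Decimal("0"),
    )
    total = Decimal(str(record["total"]))
    if items_sum != total:
        errors.append(f"line items sum to {items_sum}, but total is {total}")

    # Date ordering: assumes ISO 8601 date strings (YYYY-MM-DD).
    # Whether equal dates should also be rejected is a business decision.
    start = date.fromisoformat(record["start_date"])
    end = date.fromisoformat(record["end_date"])
    if start > end:
        errors.append(f"start_date {start} is after end_date {end}")

    return errors


# Hypothetical records: one inconsistent, one consistent.
bad_record = {
    "line_items": [{"amount": 40.0}, {"amount": 35.0}],
    "total": 100.0,
    "start_date": "2024-03-01",
    "end_date": "2024-02-01",
}
good_record = {
    "line_items": [{"amount": 40.0}, {"amount": 60.0}],
    "total": 100.0,
    "start_date": "2024-02-01",
    "end_date": "2024-03-01",
}

bad_errors = semantic_errors(bad_record)
good_errors = semantic_errors(good_record)
print(len(bad_errors), len(good_errors))
```

Returning a list of messages rather than raising on the first failure is one possible design: it lets the caller log every problem at once, or feed the messages back to the model in a retry.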

The False Sense of Security

“If the schema passes, the data is good” is wrong. Schema-valid output is structurally correct, but it may still be semantically nonsensical. Both layers are necessary because they address fundamentally different error classes.


One-liner: tool_use guarantees structure, not meaning. Add semantic validation (sum checks, date ordering, cross-field logic): in one pipeline, 20% of schema-valid extractions contained logical errors that the schema alone could not detect.