S1.5.1 Task 1.5

PostToolUse Normalization: 12% → 0.3% Date Errors, 90% Token Reduction

When different tools return dates as Unix timestamps, currencies as cents, and status codes as HTTP numbers, the agent miscompares, miscalculates, and misinterprets. PostToolUse hooks fix this at the system level — deterministically, invisibly, before the agent sees anything.

Three normalization domains

Timestamps

Tool A returns 1704067200 (Unix epoch). Tool B returns "December 15, 2023" (string). The agent compares them lexicographically: the string "1704067200" sorts before "December 15, 2023", because digits precede letters in ASCII. Conclusion: a 2024 paper was published before a 2023 paper. Wrong.

PostToolUse hook converts all timestamps to ISO 8601 before the agent sees them. Both become comparable strings.
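A minimal sketch of that conversion, assuming the hook has already parsed the tool result and knows which fields hold timestamps (the function name and the date format list are illustrative, not a fixed API):

```python
from datetime import datetime, timezone

def to_iso8601(value):
    """Best-effort conversion of a timestamp value to an ISO 8601 string."""
    if isinstance(value, (int, float)):  # Unix epoch seconds, e.g. 1704067200
        return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
    if isinstance(value, str):
        try:  # human-readable strings like "December 15, 2023"
            return datetime.strptime(value, "%B %d, %Y").date().isoformat()
        except ValueError:
            return value  # unknown format: pass through unchanged
    return value
```

After normalization, both values are ISO strings, and string comparison agrees with chronological order: `to_iso8601("December 15, 2023")` sorts before `to_iso8601(1704067200)`.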

Currencies

Tool A returns 12750 (cents). Tool B returns "$127.50" (string with symbol). Tool C returns {"amount": 127.50, "currency": "USD"} (object). The agent adds cents to dollars: 12750 + 127.50 = 12877.50. Absurd total.

PostToolUse hook normalizes all monetary values to {"amount": <dollars>, "currency": <string>}.
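One way to sketch that normalization for the three shapes above; the assumption that bare integers mean cents and that `USD` is the default currency is specific to this example, not a general rule:

```python
import re

def normalize_money(value, default_currency="USD"):
    """Normalize cents, "$X.XX" strings, and amount objects to one schema."""
    if isinstance(value, dict) and "amount" in value:  # already structured
        return {"amount": float(value["amount"]),
                "currency": value.get("currency", default_currency)}
    if isinstance(value, int) and not isinstance(value, bool):  # integer cents
        return {"amount": value / 100, "currency": default_currency}
    if isinstance(value, str):  # "$127.50"
        m = re.fullmatch(r"\$([\d,]+(?:\.\d+)?)", value.strip())
        if m:
            return {"amount": float(m.group(1).replace(",", "")),
                    "currency": default_currency}
    return value  # unknown shape: pass through unchanged
```

All three tool outputs now collapse to the same `{"amount": 127.5, "currency": "USD"}` object, so the agent can sum amounts directly.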

Status codes

Tool A returns 200 (HTTP). Tool B returns "OK" (string). Tool C returns true (boolean). The agent may misinterpret HTTP 201 as an error, treat “OK” differently from true, or fail to recognize tool-specific conventions.

PostToolUse hook normalizes to a consistent status schema.
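A possible status schema, sketched here as `{"ok": bool, "detail": original}`; the exact shape is an assumption, and real hooks would extend the string mapping per tool:

```python
def normalize_status(value):
    """Map HTTP codes, status strings, and booleans to one schema."""
    if isinstance(value, bool):  # Tool C: true / false
        return {"ok": value, "detail": None}
    if isinstance(value, int):   # Tool A: HTTP codes; any 2xx is success
        return {"ok": 200 <= value < 300, "detail": value}
    if isinstance(value, str):   # Tool B: "OK", "SUCCESS", "ERROR", ...
        return {"ok": value.strip().lower() in {"ok", "success"}, "detail": value}
    return {"ok": False, "detail": value}
```

Under this schema, HTTP 201, "OK", and `true` all normalize to `ok=True`, so the agent no longer has to know each tool's convention.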

The normalization results

After deploying PostToolUse hooks for dates and currencies:

| Error type | Before | After |
| --- | --- | --- |
| Date comparison errors | 12% | 0.3% |
| Currency calculation errors | 8% | 0.1% |
| Status code misinterpretation | 6% | 6% (not yet covered) |
| Overall accuracy | 78% | 91% |

Status code errors persist because the hook doesn’t normalize them yet. The pattern works — extend it to cover remaining format types.

Sensitive data redaction

Tool results often contain data the agent doesn’t need and shouldn’t see: full SSN, credit card numbers, internal employee IDs. Two risks: (1) security exposure in the agent’s context, (2) the agent occasionally echoes sensitive data back to customers.

PostToolUse hook redacts: SSN → "***-**-1234", card → "****1234". The agent never sees full sensitive data, making it impossible to echo. This addresses both security and customer safety deterministically — not via prompt instruction “never repeat SSN” (which fails occasionally).
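A minimal redaction sketch covering exactly the two formats above; real PII detection needs broader patterns (spaceless card numbers, employee-ID formats, tokens):

```python
import re

# SSN like 123-45-6789; keep last four digits for reference.
SSN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")
# 16-digit card in 4-digit groups; keep last group.
CARD = re.compile(r"\b(?:\d{4}[ -]?){3}(\d{4})\b")

def redact(text):
    """Replace full SSNs and card numbers with masked forms."""
    text = SSN.sub(lambda m: "***-**-" + m.group(1), text)
    text = CARD.sub(lambda m: "****" + m.group(1), text)
    return text
```

Because the hook runs before the result reaches the context, the masked form is the only form the agent can ever echo.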

Token reduction: 90% savings

Tool results averaging 5,000 tokens contain ~500 tokens of fields the agent actually uses. 8 calls/session × 5,000 = 40,000 tokens of tool results, accounting for 60% of API input cost.

PostToolUse hook extracts only relevant fields: 5,000 → 500 tokens per result. Session total: 40,000 → 4,000 tokens. 90% reduction in tool result tokens, cutting overall API input costs by ~54%.
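The extraction step can be sketched as a per-tool allowlist derived from usage analysis; the tool and field names below are made up for illustration:

```python
# Allowlist built from usage analysis: fields the agent demonstrably reads.
RELEVANT_FIELDS = {
    "search_papers": {"title", "published", "url"},  # hypothetical tool
}

def extract_fields(tool_name, result):
    """Keep only verified-relevant fields; pass through unaudited tools."""
    keep = RELEVANT_FIELDS.get(tool_name)
    if keep is None:
        return result  # no allowlist yet: don't strip anything
    return {k: v for k, v in result.items() if k in keep}
```

Passing through tools without an allowlist is deliberate: stripping unaudited fields is exactly the aggressive compression the layering discussion below warns against.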

Layered transformation design

A PostToolUse hook that normalizes, redacts, and extracts needs careful layering:

  1. Normalize formats (lossless) — same data, standard format. ISO 8601 dates, dollar amounts, consistent status codes. No information loss.
  2. Redact secrets (lossy but security-required) — SSN, credit cards, tokens. The agent never needed this data; removing it is a security requirement.
  3. Extract relevant fields (lossy, verified only) — remove fields the agent demonstrably doesn’t use, verified through usage analysis. Don’t aggressively strip — remove only what’s confirmed unnecessary.

Each layer has clear justification. Aggressive blanket compression risks removing fields the agent occasionally needs, causing intermittent failures.
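The three layers can be chained in one hook, sketched here with toy per-layer logic (the field names "ts", "note", and "debug" are invented for the example):

```python
import re
from datetime import datetime, timezone

def normalize(r):  # layer 1: lossless — epoch seconds to ISO 8601
    if isinstance(r.get("ts"), int):
        r = {**r, "ts": datetime.fromtimestamp(r["ts"], tz=timezone.utc).isoformat()}
    return r

def redact(r):  # layer 2: lossy but security-required — mask SSNs
    if isinstance(r.get("note"), str):
        r = {**r, "note": re.sub(r"\b\d{3}-\d{2}-(\d{4})\b", r"***-**-\1", r["note"])}
    return r

def extract(r):  # layer 3: lossy, verified-only — drop confirmed-unused fields
    keep = {"ts", "note"}
    return {k: v for k, v in r.items() if k in keep}

def post_tool_use(result):
    """Apply layers in order: lossless first, verified stripping last."""
    for layer in (normalize, redact, extract):
        result = layer(result)
    return result
```

Ordering matters: normalization runs first so later layers see one canonical format, and extraction runs last so redaction is applied before any field survives into the context.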

Why not prompt-based format handling?

PostToolUse hooks are:

  • Deterministic — never fail to convert
  • Centralized — one place to maintain
  • Invisible to the agent — reduces cognitive load

Prompt-based format documentation is:

  • Probabilistic — agent may forget rules under load
  • Distributed — spread across the prompt for 8 tools
  • Overhead — adds reasoning burden to every tool interaction

The data confirms: hook-based normalization achieves 0.3% error rates where prompt-based handling produces 12%.

Audit first, then normalize

For a system with 8 MCP tools and occasional misinterpretations:

  1. Audit all 8 tools’ output formats to identify inconsistencies (which fields, which formats)
  2. Implement PostToolUse hooks targeting the identified inconsistencies
  3. Verify error rates drop for normalized fields
  4. Extend to remaining format types as needed
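Step 1 can be partly automated. A sketch, assuming you have logged `(tool_name, result)` pairs from real sessions; a field that appears with more than one type is a normalization candidate:

```python
from collections import defaultdict

def audit(logged_results):
    """Report which value types each (tool, field) pair appears with."""
    seen = defaultdict(set)
    for tool, result in logged_results:
        for field, value in result.items():
            seen[(tool, field)].add(type(value).__name__)
    return {k: sorted(v) for k, v in seen.items()}

# Made-up sample: the same field arrives as epoch int and as a string.
sample = [
    ("tool_a", {"published": 1704067200}),
    ("tool_a", {"published": "December 15, 2023"}),
]
```

Here `audit(sample)` reports `("tool_a", "published")` with both `int` and `str`, flagging exactly the kind of inconsistency the hooks should target.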

One-liner: PostToolUse hooks normalize timestamps (12%→0.3% errors), currencies, and status codes deterministically; redact sensitive data so agents can’t echo it; and extract relevant fields for 90% token reduction — layered as lossless normalization → required redaction → verified extraction.