Reading the Response: stop_reason Is Your Control Signal | Foundations

When the Claude API responds, three fields matter most: content (what the model produced), stop_reason (why it stopped), and usage (how many tokens were consumed).

stop_reason drives agentic loops

The stop_reason field is the single control signal for agentic loop behavior:

end_turn — The model finished naturally. It said everything it wanted to say. Your loop should terminate.
tool_use — The model wants to call a tool. Execute it, send back the result, and loop again.
max_tokens — The response was cut short because it hit the token limit. The output may be incomplete.

The critical distinction: end_turn means the model chose to stop. max_tokens means it was forced to stop. A truncated response might have incomplete JSON, unfinished sentences, or missing conclusions. Your code should handle these differently.

content is an array, not a string

The response content is always an array of typed blocks. A tool-requesting response might look like:

{
  "content": [
    {"type": "text", "text": "Let me look that up for you."},
    {"type": "tool_use", "id": "toolu_abc", "name": "search", "input": {"query": "..."}}
  ],
  "stop_reason": "tool_use"
}

Text and tool_use coexist. Your agentic loop checks stop_reason, not the presence of text.

usage tracks cost

The usage object contains input_tokens and output_tokens. Input tokens include everything you sent — messages, system prompt, and tool definitions. Output tokens are what the model generated. Both count toward billing. In multi-turn conversations, input tokens grow with each turn because you resend the full history.

One-liner: stop_reason is the loop control signal — tool_use means continue, end_turn means stop, and max_tokens means the response was cut short.