When the Claude API responds, three fields matter most: content (what the model produced), stop_reason (why it stopped), and usage (how many tokens were consumed).
stop_reason drives agentic loops
The stop_reason field is the single control signal for agentic loop behavior:
end_turn— The model finished naturally. It said everything it wanted to say. Your loop should terminate.tool_use— The model wants to call a tool. Execute it, send back the result, and loop again.max_tokens— The response was cut short because it hit the token limit. The output may be incomplete.
The critical distinction: end_turn means the model chose to stop. max_tokens means it was forced to stop. A truncated response might have incomplete JSON, unfinished sentences, or missing conclusions. Your code should handle these differently.
content is an array, not a string
The response content is always an array of typed blocks. A tool-requesting response might look like:
{
"content": [
{"type": "text", "text": "Let me look that up for you."},
{"type": "tool_use", "id": "toolu_abc", "name": "search", "input": {"query": "..."}}
],
"stop_reason": "tool_use"
}
Text and tool_use coexist. Your agentic loop checks stop_reason, not the presence of text.
usage tracks cost
The usage object contains input_tokens and output_tokens. Input tokens include everything you sent — messages, system prompt, and tool definitions. Output tokens are what the model generated. Both count toward billing. In multi-turn conversations, input tokens grow with each turn because you resend the full history.
One-liner: stop_reason is the loop control signal — tool_use means continue, end_turn means stop, and max_tokens means the response was cut short.