K5.1.4 Task 5.1

The API Does Not Remember Your Conversation — You Must Send It Every Time

The Claude Messages API is stateless. Every request must include the complete conversation history — all previous user messages, assistant responses, and tool interactions. If you send only the latest message, Claude has zero access to prior context. The conversation starts fresh.

There is no session ID, no conversation ID, no server-side state, no automatic context retention.
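Because the client owns all state, history management reduces to maintaining one growing list. A minimal sketch (the model name and helper names are illustrative, not from the source):

```python
# Minimal sketch of client-side history management for a stateless chat API.
# The `history` list is the ONLY memory: every request must carry all of it.

def build_request(history, user_text, model="claude-sonnet-4-5", max_tokens=1024):
    """Return a request payload containing the FULL history plus the new turn."""
    messages = history + [{"role": "user", "content": user_text}]
    return {"model": model, "max_tokens": max_tokens, "messages": messages}

def record_turn(history, user_text, assistant_text):
    """Append both sides of a completed turn so the next request includes them."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})

history = []
record_turn(history, "Hi, I'm Sam.", "Hello Sam! How can I help?")
req = build_request(history, "What's my name?")
assert len(req["messages"]) == 3  # both prior messages + the new user message
```

Sending `req["messages"]` in full is what lets the model answer "What's my name?"; sending only the last element reproduces the fresh-start failure described above.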

The Common Failure

A chatbot works perfectly for turns 1 and 2. On turn 3, Claude responds as if no conversation happened: “I don’t have any search results to analyze” — despite a successful tool call in turn 2 that returned valid data.

The developer sent only the new user message in turn 3’s request. The tool_result from turn 2 was not included. Claude literally does not know the search happened.

What Must Be in Every Request

The complete message sequence:

  1. All user messages
  2. All assistant responses (including tool_use blocks)
  3. All tool_result messages
  4. The current new user message

Missing any piece — especially tool_result messages — breaks the conversation. The assistant’s tool_use request and its corresponding tool_result must both be present for Claude to know what the tool returned.

The Data

| History management | Customer satisfaction | Context-loss complaints |
| --- | --- | --- |
| Complete history every request | 95% | 2% |
| Intermittently truncated history | 52% | 41% |

A 43-point satisfaction gap from a serialization bug that intermittently dropped messages from the array. Same model, same prompt, same topics — the only variable was whether the full history was included.

Cross-Session Continuity

Customer leaves a chat, returns hours later expecting context preserved. The API remembers nothing between requests — let alone between sessions.

The solution: store the complete message array in a database, keyed by customer/conversation ID. When the customer returns, load the full history and include it in the next request. Claude sees a seamless conversation.
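One way to sketch that persistence layer, with an in-memory dict standing in for the real database and all names illustrative:

```python
# Sketch of cross-session persistence: the full message array is serialized
# per conversation ID, then reloaded when the customer returns hours later.
import json

store = {}  # conversation_id -> serialized message array (stand-in for a DB)

def save_history(conversation_id, messages):
    store[conversation_id] = json.dumps(messages)

def load_history(conversation_id):
    raw = store.get(conversation_id)
    return json.loads(raw) if raw else []

save_history("cust-42", [
    {"role": "user", "content": "My order is late."},
    {"role": "assistant", "content": "Let me check on that."},
])

# ...hours later, possibly in a different process, the customer returns:
resumed = load_history("cust-42")
resumed.append({"role": "user", "content": "Any update?"})
assert len(resumed) == 3  # full prior context plus the new message
```

The next request sends `resumed` in its entirety; from Claude's perspective the hours-long gap never happened.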

Managing Very Long Conversations

For 50+ turn conversations, sending the full history every request becomes expensive. The adaptive strategy:

  • Normal conversations (≤20 turns): Send complete history every request
  • Long conversations (50+ turns): Use a Case Facts block (preserving critical values) plus recent messages. The Case Facts persist essential context while trimming old narrative turns.

This is where K5.1.1 (structured case facts) connects: the persistence mechanism that preserves precision during summarization also serves as the compact representation for long conversations in the stateless API.
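The adaptive strategy can be sketched as a request builder that switches modes on conversation length. The thresholds and the Case Facts formatting here are illustrative assumptions, not values from the source:

```python
# Sketch of the adaptive strategy: short conversations send everything;
# long ones send a Case Facts block plus only the most recent messages.

RECENT_MESSAGES = 10  # how many trailing messages to keep verbatim (assumed)

def messages_for_request(history, case_facts, turn_count):
    if turn_count <= 20:
        return list(history)  # normal conversation: complete history
    # Long conversation: compact Case Facts block preserves critical values
    # while old narrative turns are trimmed.
    facts_block = {
        "role": "user",
        "content": "Case Facts (verified):\n" + "\n".join(
            f"- {key}: {value}" for key, value in case_facts.items()),
    }
    return [facts_block] + history[-RECENT_MESSAGES:]

history = [{"role": "user", "content": f"turn {i}"} for i in range(60)]
facts = {"order_id": "A-1009", "refund_amount": "$41.50"}
out = messages_for_request(history, facts, turn_count=60)
assert len(out) == RECENT_MESSAGES + 1
assert "A-1009" in out[0]["content"]  # critical value survives the trim
```

The point of the Case Facts block is precision: exact order IDs and amounts survive even after the turns that introduced them are trimmed away.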

Things That Do Not Exist

  • Conversation ID parameter — No such feature. The API has no reference mechanism for prior requests.
  • Server-side session state — No conversation is stored server-side. Each request is independent.
  • Automatic context retention — Not for 24 hours, not for any duration. Zero server-side persistence.
  • Cache reference tokens — The API cannot retrieve state from external caches.
  • Turn limits on server state — No server state exists to have limits on.

Debugging Intermittent Context Loss

If some turns maintain context while others lose it, the issue is in the client-side history management. The first diagnostic step: log the complete messages array being sent with each request. Compare working turns against broken turns. The broken ones will have incomplete or truncated history — likely a race condition or serialization bug.
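A minimal version of that diagnostic log (all names illustrative): record the message count and role sequence for every outgoing request, so a broken turn's truncated array stands out next to the working turns around it.

```python
# Diagnostic sketch: log the exact messages array sent with each request so
# broken turns can be compared against working ones.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("chat-debug")

def log_outgoing(conversation_id, messages):
    """Record message count and role sequence; a sudden drop in count
    between consecutive requests is the signature of truncated history."""
    roles = [m["role"] for m in messages]
    log.info("conv=%s count=%d roles=%s", conversation_id, len(messages), roles)
    return len(messages)

# In a healthy conversation the count grows every turn; a drop (5 -> 2 here)
# flags the request where history was lost.
sent_counts = [log_outgoing("cust-42", [{"role": "user", "content": "hi"}] * n)
               for n in (1, 3, 5, 2)]
assert sent_counts == [1, 3, 5, 2]
```

Comparing these logs turn by turn localizes the bug to whatever code assembled the short array, rather than anything on the API side.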

System prompt instructions cannot compensate for missing history. If previous messages are not in the request, no instruction can make Claude reference information it does not have.


One-liner: The Claude API is stateless — every request needs the complete message history (all turns, all tool results) because nothing is remembered server-side, and a serialization bug that drops history causes a 43-point satisfaction collapse.