The difference between an agent and a chatbot isn’t about model quality, multimodal support, or conversation length. It’s about one thing: does it take autonomous actions using tools?
The defining line
Chatbot — receives a message, generates a text response, waits for the next message. It answers questions, summarizes text, translates languages. It talks about things. It doesn’t do things to external systems.
Agent — receives a task, plans an approach, uses tools to interact with external systems, and executes multi-step actions to complete the task. It reads files, searches databases, processes refunds, creates issues. It does things.
A customer FAQ bot that answers “What’s your return policy?” from a knowledge base is a chatbot. A resolution system that looks up your order, processes a refund, and updates your account is an agent.
Four distinguishing capabilities
- Autonomous task completion — agents drive the process. They don’t wait for instructions between steps.
- Multi-step planning — agents break complex goals into sub-steps and execute them.
- Tool use — agents interact with external systems (databases, APIs, files) through tools.
- Proactive execution — agents take action rather than passively responding.
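A minimal sketch of how these four capabilities might fit together in one loop. The tool functions, the `TOOLS` registry, and `plan` are all hypothetical; in particular, `plan` returns a fixed list of steps where a real agent would typically ask a model to produce them.

```python
from typing import Callable


# Hypothetical tools; real ones would call databases, APIs, or the file system.
def search_issues(query: str) -> list[str]:
    return [f"issue about {query}"]


def create_issue(title: str) -> str:
    return f"created: {title}"


TOOLS: dict[str, Callable[[str], object]] = {
    "search_issues": search_issues,
    "create_issue": create_issue,
}


def plan(goal: str) -> list[tuple[str, str]]:
    """Stand-in planner: breaks the goal into (tool name, argument) steps."""
    return [("search_issues", goal), ("create_issue", goal)]


def run_agent(goal: str) -> list[object]:
    """Runs every planned step without asking the user between steps."""
    results = []
    for tool_name, argument in plan(goal):  # multi-step planning
        tool = TOOLS[tool_name]             # tool use
        results.append(tool(argument))      # proactive execution
    return results                          # autonomous: one call completes the task


print(run_agent("login bug"))
```

The caller hands over a goal once and gets back the results of every step, which is the sense in which the agent, not the user, drives the process.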
What DOESN’T define the difference
- Model quality — agents and chatbots can use the same model. The distinction is behavioral, not about model tier.
- Multi-turn capability — chatbots can maintain conversation context across turns too. Multi-turn ≠ agent.
- Question complexity — a chatbot can answer complex questions from its training data. Answering hard questions doesn’t make a system an agent.
- Multimodal input — processing images or audio doesn’t make a system an agent. A text-only system with tools is closer to an agent than a multimodal chatbot without tools.
- System prompts — both agents and chatbots use system prompts for behavioral configuration.
The tool use test
If you’re unsure whether something is an agent: does it call tools to interact with external systems? Does it chain multiple actions to complete a task? If both answers are yes, it’s an agent. If it only generates text responses, it’s a chatbot — regardless of how sophisticated those responses are.
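The test can be written down as a small predicate. This is a sketch under assumptions: `Trace` is a hypothetical record of what a system did while handling one request, and "chains multiple actions" is approximated as "made more than one tool call".

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical record of one handled request."""
    tool_calls: list[str] = field(default_factory=list)  # names of external tools invoked
    text_reply: str = ""


def passes_tool_use_test(trace: Trace) -> bool:
    """True only if both questions in the test are answered yes."""
    calls_tools = len(trace.tool_calls) > 0     # does it call tools?
    chains_actions = len(trace.tool_calls) > 1  # does it chain multiple actions?
    return calls_tools and chains_actions


refund_flow = Trace(tool_calls=["lookup_order", "process_refund", "update_account"])
essay_reply = Trace(text_reply="A long, sophisticated answer.")

print(passes_tool_use_test(refund_flow))
print(passes_tool_use_test(essay_reply))
```

Note that the length or quality of `text_reply` never enters the check, which mirrors the point that sophistication of responses is irrelevant to the classification.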
One-liner: Agents autonomously execute multi-step tasks using tools; chatbots generate text responses — the distinction is tool-mediated action vs passive response, not model quality or conversation turns.