AI, decoded

How does an AI agent decide which tool to use?

The agent is given a set of tools, each with a name and a description of what it does and when to use it. At each step the model reads the task and those descriptions, picks a tool, and generates the arguments for the call: a search query, an API payload, a database lookup. The agent runtime executes the call, the model reads the result, and it decides the next move. The quality of that choice depends heavily on the tool descriptions: vague descriptions produce wrong tool calls, which is one of the most common ways agents fail.

Chain of Thought

AI Agents · AI Evaluation & Reliability

Tools are described, not hard-coded

An agent doesn’t “know” your APIs. It’s handed a list of tools, each with a name and a plain-language description of what it does and when it’s the right choice. At each step the model reads the current task against those descriptions and decides which tool fits — or that no tool is needed and it can answer directly.
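To make that concrete, here is a minimal sketch of what "handed a list of tools" can look like as data. The tool names (`search_orders`, `search_docs`), the `Tool` class, and `render_tool_catalog` are hypothetical, invented for illustration; real agent frameworks each have their own format, though many use a name, a description, and a JSON-Schema-style argument spec along these lines.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Tool:
    """A tool as the model sees it: name, description, argument schema."""
    name: str
    description: str  # what it does and when it is the right choice
    parameters: dict[str, Any] = field(default_factory=dict)  # JSON-Schema-style


# Hypothetical tools, made up for this example.
TOOLS = [
    Tool(
        name="search_orders",
        description=(
            "Look up a customer's existing orders by email address. "
            "Use for questions about order status or delivery dates."
        ),
        parameters={
            "type": "object",
            "properties": {"email": {"type": "string"}},
            "required": ["email"],
        },
    ),
    Tool(
        name="search_docs",
        description=(
            "Full-text search over public help articles. "
            "Use for how-to and policy questions, not for account data."
        ),
        parameters={
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    ),
]


def render_tool_catalog(tools: list[Tool]) -> str:
    """Format the tool list as text that could be placed in the model's context."""
    lines = []
    for tool in tools:
        args = ", ".join(tool.parameters.get("properties", {}))
        lines.append(f"- {tool.name}({args}): {tool.description}")
    return "\n".join(lines)


catalog = render_tool_catalog(TOOLS)
print(catalog)
```

Nothing in the catalog is executable. The model only ever sees the names, descriptions, and argument shapes, which is why the wording of each description carries so much weight.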

Selecting and calling

Once it picks a tool, the model generates the call: the search string, the JSON payload, the function arguments. The agent runtime executes that call and hands the result back to the model, which uses it to decide the next action, and the loop repeats. Protocols like the Model Context Protocol (MCP) standardize how those tools are exposed, so one agent can reach many systems through a consistent interface instead of a bespoke integration each time.
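The select, call, read, loop cycle can be sketched in a few lines. This is an illustration under stated assumptions, not any framework's actual implementation: `scripted_model` is a hard-coded stand-in for a real model call, and the tool functions, `TOOL_REGISTRY`, and the action dictionary shape are all hypothetical. It does not implement MCP.

```python
import json
from typing import Any, Callable


# Hypothetical tool implementations; real ones would hit an API or a database.
def search_docs(query: str) -> str:
    return f"Top article for '{query}': Returns are accepted within 30 days."


def get_weather(city: str) -> str:
    return f"Forecast for {city}: sunny"


TOOL_REGISTRY: dict[str, Callable[..., str]] = {
    "search_docs": search_docs,
    "get_weather": get_weather,
}


def scripted_model(task: str, history: list[dict[str, Any]]) -> dict[str, Any]:
    """Stand-in for the model. A real agent would send the task, the tool
    descriptions, and the history to an LLM and parse its reply."""
    if not history:
        # First step: pick a tool and generate its arguments as JSON.
        return {
            "type": "tool_call",
            "name": "search_docs",
            "arguments": json.dumps({"query": task}),
        }
    # A tool result is available, so answer directly from it.
    return {"type": "final", "content": history[-1]["result"]}


def run_agent(
    task: str,
    model: Callable[[str, list[dict[str, Any]]], dict[str, Any]] = scripted_model,
    max_steps: int = 5,
) -> str:
    history: list[dict[str, Any]] = []
    for _ in range(max_steps):
        action = model(task, history)
        if action["type"] == "final":
            return action["content"]
        # The runtime, not the model, looks up and executes the tool.
        tool = TOOL_REGISTRY.get(action["name"])
        if tool is None:
            result = f"error: unknown tool {action['name']!r}"
        else:
            try:
                result = tool(**json.loads(action["arguments"]))
            except (TypeError, ValueError) as exc:
                result = f"error: bad arguments ({exc})"
        # Feed the result back so the model can decide the next action.
        history.append({"name": action["name"], "result": result})
    return "stopped: step limit reached"


print(run_agent("return policy"))
```

Errors are returned to the model as ordinary results instead of raising, so a wrong tool name or malformed arguments become something the model can read and recover from on the next step.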

Why tool selection is where agents break

Tool choice is one of the most common failure points. If two tools have overlapping or fuzzy descriptions, the model may pick the wrong one, or call the right one with bad arguments, and the whole task derails from there. This is why “tool selection quality” is a metric worth measuring on its own, and why writing precise tool descriptions often does more for reliability than most prompt tuning.
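One simple way to measure tool selection on its own is to label a set of tasks with the tool each should trigger, record what the agent actually chose, and count matches and mix-ups. The sketch below assumes that setup. The function name, the test cases, and the recorded choices are illustrative placeholders, not real evaluation results, and `None` stands for "no tool needed".

```python
from collections import Counter
from typing import Callable, Optional

Case = tuple[str, Optional[str]]  # (task, expected tool name or None)


def tool_selection_report(
    cases: list[Case],
    select_tool: Callable[[str], Optional[str]],
) -> tuple[float, Counter]:
    """Return accuracy and a count of (expected, chosen) mix-ups."""
    confusions: Counter = Counter()
    correct = 0
    for task, expected in cases:
        chosen = select_tool(task)
        if chosen == expected:
            correct += 1
        else:
            confusions[(expected, chosen)] += 1
    accuracy = correct / len(cases) if cases else 0.0
    return accuracy, confusions


# Illustrative placeholder data: labeled tasks and made-up recorded choices.
CASES: list[Case] = [
    ("Where is my order?", "search_orders"),
    ("How do I reset my password?", "search_docs"),
    ("What is your refund policy?", "search_docs"),
    ("Thanks, that's all.", None),
]
RECORDED_CHOICES = {
    "Where is my order?": "search_orders",
    "How do I reset my password?": "search_docs",
    "What is your refund policy?": "search_orders",  # wrong tool
    "Thanks, that's all.": None,
}

accuracy, confusions = tool_selection_report(CASES, RECORDED_CHOICES.get)
print(f"accuracy: {accuracy:.2f}")
for (expected, chosen), count in confusions.items():
    print(f"expected {expected}, got {chosen}: {count}")
```

The confusion counts are often more useful than the headline accuracy: a pair of tools that keep getting swapped points directly at the two descriptions that need to be rewritten.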

From the conversation

This explainer is drawn from these episodes — each carries its full transcript.