Agent Patterns

Tool Use

The mechanism by which LLMs call external functions — defining tools as JSON schemas, receiving structured call requests from the model, executing them, and returning results.

  - Structured: JSON schema input
  - Parallel: multi-tool calls
  - Provider-native: API support

SECTION 01

Tool use vs prompt engineering

Before tool use existed, people would put instructions like "respond with JSON in format {action: ..., args: ...}" in the prompt and then parse the output with regex. It worked sometimes and broke spectacularly other times.

Native tool use (also called function calling) is different: you define tools as structured JSON schemas, and the model returns a structured call object — not free text it asks you to parse, but a first-class API type designed for reliable machine parsing. The model knows the exact set of available tools and their input requirements, reducing hallucinated tool calls dramatically.
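For illustration, here is a minimal sketch of the old approach (the model output string is invented for the example). It happens to work here, but breaks as soon as the model wraps the JSON in markdown fences or adds trailing commentary:

```python
import json
import re

# Pre-tool-use approach: beg for JSON in the prompt, then scrape it out of free text.
raw_model_output = 'Sure! Here you go: {"action": "get_weather", "args": {"location": "Paris"}}'

match = re.search(r"\{.*\}", raw_model_output, re.DOTALL)
call = json.loads(match.group(0)) if match else None
print(call)  # {'action': 'get_weather', 'args': {'location': 'Paris'}}
```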

SECTION 02

How tool calling works at the API level

The flow:

  1. You send a message with a tools array defining available functions.
  2. The model responds with a tool_use content block containing the tool name and input arguments.
  3. You execute the tool in your code and get the result.
  4. You send a follow-up message with the tool result in a tool_result block.
  5. The model continues (possibly calling more tools, or generating a final response).

Your code is the runtime. The model is the planner. The API is the communication channel between them.

SECTION 03

Defining tools with Anthropic

import anthropic

client = anthropic.Anthropic()

# Define tools as JSON schema objects
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location. Returns temperature in Celsius and conditions.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. 'London' or 'New York, NY'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit. Defaults to celsius."
                }
            },
            "required": ["location"]
        }
    },
    {
        "name": "get_stock_price",
        "description": "Get the current stock price for a ticker symbol.",
        "input_schema": {
            "type": "object",
            "properties": {
                "ticker": {"type": "string", "description": "Stock ticker, e.g. AAPL, MSFT"}
            },
            "required": ["ticker"]
        }
    }
]

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather like in Paris and London?"}]
)

print(response.stop_reason)   # "tool_use"
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}, Input: {block.input}")
# Tool: get_weather, Input: {'location': 'Paris', 'unit': 'celsius'}
# Tool: get_weather, Input: {'location': 'London', 'unit': 'celsius'}

SECTION 04

Handling tool results

import anthropic

client = anthropic.Anthropic()

def get_weather(location: str, unit: str = "celsius") -> dict:
    '''Mock weather API — replace with real implementation.'''
    data = {"Paris": {"temp": 18, "conditions": "Partly cloudy"},
            "London": {"temp": 14, "conditions": "Overcast"}}
    d = data.get(location, {"temp": 20, "conditions": "Unknown"})
    return {"location": location, "temperature": d["temp"], "unit": unit, "conditions": d["conditions"]}

def run_with_tools(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            tools=tools,    # defined in previous section
            messages=messages
        )

        if response.stop_reason == "end_turn":
            # No more tool calls — extract text response
            return next(b.text for b in response.content if hasattr(b, "text"))

        if response.stop_reason == "tool_use":
            # Add assistant's response (including tool_use blocks) to history
            messages.append({"role": "assistant", "content": response.content})

            # Execute all tool calls and collect results
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    if block.name == "get_weather":
                        result = get_weather(**block.input)
                    elif block.name == "get_stock_price":
                        result = {"ticker": block.input["ticker"], "price": 189.42}
                    else:
                        result = {"error": f"Unknown tool: {block.name}"}

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })

            # Return tool results to the model
            messages.append({"role": "user", "content": tool_results})
        else:
            # Any other stop_reason (e.g. "max_tokens") would otherwise loop forever
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")

result = run_with_tools("Compare the weather in Paris and London.")
print(result)

SECTION 05

Parallel tool calls

Claude can request multiple tools in a single turn. This is efficient — all tool calls in one response can be executed concurrently:

import asyncio, anthropic

client = anthropic.Anthropic()

async def get_weather_async(location: str) -> dict:
    await asyncio.sleep(0.1)   # simulate API latency
    return {"location": location, "temp": 18}

async def execute_tool_calls_parallel(tool_use_blocks):
    '''Execute all tool calls concurrently.'''
    tasks = []
    for block in tool_use_blocks:
        if block.name == "get_weather":
            tasks.append((block.id, get_weather_async(block.input["location"])))

    results = await asyncio.gather(*[t[1] for t in tasks])
    return [
        {"type": "tool_result", "tool_use_id": tasks[i][0], "content": str(r)}
        for i, r in enumerate(results)
    ]

# If Claude calls get_weather for Paris AND London in one response,
# we execute both API calls simultaneously — 2× faster than sequential

Always execute parallel tool calls concurrently. Sequential execution multiplies your latency by the number of calls — Claude grouped them in one response precisely because they're independent.
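A self-contained demo (mocked tool, no API calls; the SimpleNamespace blocks just mimic the shape of the SDK's tool_use objects) shows the payoff: two 0.1 s calls gathered together finish in roughly 0.1 s total, not 0.2 s:

```python
import asyncio
import time
from types import SimpleNamespace

async def get_weather_async(location: str) -> dict:
    await asyncio.sleep(0.1)   # simulate API latency
    return {"location": location, "temp": 18}

async def main() -> float:
    # Stand-ins for the tool_use blocks from response.content
    blocks = [
        SimpleNamespace(id="toolu_01", name="get_weather", input={"location": "Paris"}),
        SimpleNamespace(id="toolu_02", name="get_weather", input={"location": "London"}),
    ]
    start = time.perf_counter()
    results = await asyncio.gather(*(get_weather_async(b.input["location"]) for b in blocks))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} tool calls in {elapsed:.2f}s")  # ~0.10s, not ~0.20s
    return elapsed

elapsed = asyncio.run(main())
```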

SECTION 06

Tool use with OpenAI

from openai import OpenAI
import json

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit":     {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    tools=tools,
    messages=[{"role": "user", "content": "Weather in Tokyo?"}]
)

# Parse tool call
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)

# Execute and return the result (get_weather is your own function,
# e.g. the mock implementation from the earlier Anthropic example)
weather_result = get_weather(**args)

messages = [
    {"role": "user", "content": "Weather in Tokyo?"},
    response.choices[0].message,   # include assistant's tool_call message
    {"role": "tool", "tool_call_id": tool_call.id, "content": str(weather_result)}
]
final = client.chat.completions.create(model="gpt-4o", messages=messages)
print(final.choices[0].message.content)

SECTION 07

Gotchas

Always handle stop_reason == "tool_use" explicitly. If you check for "end_turn" only and return early, you silently discard tool calls and the user gets a truncated response with no explanation.
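One way to keep this explicit is a single dispatch point that names every stop reason you expect. A sketch, where the returned strings are just labels for your own control flow, not API values:

```python
def next_action(stop_reason: str) -> str:
    """Decide what the calling loop should do for a given stop_reason."""
    if stop_reason == "tool_use":
        return "execute_tools"      # run the tools, send back tool_result blocks
    if stop_reason == "end_turn":
        return "return_text"        # done: extract the final text response
    if stop_reason == "max_tokens":
        return "handle_truncation"  # response was cut off mid-generation
    raise ValueError(f"Unhandled stop_reason: {stop_reason!r}")
```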

Tool results must reference the correct tool_use_id. If the model makes 3 parallel tool calls and you return only 2 results (or mismatch the IDs), the API returns a 400 error. Always return exactly one tool_result per tool_use block, each matched to its tool_use_id.
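A helper that iterates over the tool_use blocks themselves, rather than over whatever results happened to succeed, guarantees that invariant. A sketch, with executor standing in for your own tool dispatch:

```python
def build_tool_results(tool_use_blocks, executor) -> list:
    """Produce exactly one tool_result per tool_use block, matched by id."""
    results = []
    for block in tool_use_blocks:
        try:
            content = str(executor(block.name, block.input))
            entry = {"type": "tool_result", "tool_use_id": block.id, "content": content}
        except Exception as exc:
            # Report failures as results too; never drop a tool_use_id
            entry = {"type": "tool_result", "tool_use_id": block.id,
                     "content": f"Tool failed: {exc}", "is_error": True}
        results.append(entry)
    return results

# Demo with stand-in blocks (the real ones come from response.content)
from types import SimpleNamespace
blocks = [
    SimpleNamespace(id="toolu_01", name="get_weather", input={"location": "Paris"}),
    SimpleNamespace(id="toolu_02", name="nonexistent", input={}),
]
def executor(name, tool_input):
    if name == "get_weather":
        return {"temp": 18}
    raise ValueError(f"Unknown tool: {name}")

demo = build_tool_results(blocks, executor)
```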

Keep descriptions accurate and specific. The model decides which tool to call based on descriptions alone. Vague descriptions ("process input") lead to wrong selections. Specific descriptions ("Search the web for current news and factual information; use for questions about recent events") lead to correct ones.

Tool schemas are part of your prompt budget. Complex schemas with many fields and long descriptions consume significant tokens on every call. Audit your schema for verbosity — a 2,000-token tool definition repeated across 1,000 requests is 2M tokens of overhead per day.
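You can get a rough sense of that overhead with a character-count heuristic (about 4 characters per token for English-leaning JSON; this is an approximation, not a real tokenizer):

```python
import json

def approx_tokens(obj) -> int:
    """Very rough token estimate: ~4 chars per token for English-ish JSON."""
    return len(json.dumps(obj)) // 4

schema = {
    "name": "get_weather",
    "description": "Get the current weather for a location. Returns temperature and conditions.",
    "input_schema": {
        "type": "object",
        "properties": {"location": {"type": "string", "description": "City name"}},
        "required": ["location"],
    },
}
per_call = approx_tokens(schema)
print(f"~{per_call} tokens per request, ~{per_call * 1000:,} per 1,000 requests")
```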

SECTION 08

Tool use patterns and best practices

Tool use unlocks a qualitative shift in what LLMs can do — from pattern matching on training data to actively querying live systems, running code, and manipulating state. The key design decisions are: how many tools to expose (fewer is better — too many choices degrade tool selection), how to describe them (natural-language descriptions matter more than parameter names), and whether to allow parallel tool calls (yes, for independent operations).

Common failure modes: the model calls the wrong tool due to ambiguous descriptions; the model hallucinates tool arguments for tools it doesn't fully understand; the model gets stuck in tool-call loops when a tool returns an error. Mitigations: use distinct tool names, include examples in descriptions, set a max-turns limit, and handle tool errors gracefully by returning structured error messages the model can reason about.
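The max-turns limit can live in a generic loop. A sketch, where call_model and execute_tools are stand-ins for your API call and tool runtime, and the dict shapes are simplified rather than the real SDK types:

```python
def agent_loop(call_model, execute_tools, user_message: str, max_turns: int = 10) -> str:
    """Run an agentic tool-use loop with a hard turn limit."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = call_model(messages)
        if response["stop_reason"] != "tool_use":
            return response["text"]
        # Record the tool calls, then feed the results back in
        messages.append({"role": "assistant", "content": response["content"]})
        messages.append({"role": "user", "content": execute_tools(response["content"])})
    raise RuntimeError(f"Agent did not finish within {max_turns} turns")

# Fake model for demonstration: calls a tool twice, then answers
_turns = iter([
    {"stop_reason": "tool_use", "content": ["call-1"]},
    {"stop_reason": "tool_use", "content": ["call-2"]},
    {"stop_reason": "end_turn", "text": "done"},
])
answer = agent_loop(lambda msgs: next(_turns), lambda content: ["result"], "go")
print(answer)  # done
```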

Pattern             | Description                                       | When to use                    | Risk
--------------------|---------------------------------------------------|--------------------------------|---------------------------------------
Single tool call    | Model calls one tool, gets result, responds       | Simple lookups, calculations   | Low — easy to audit
Sequential chaining | Output of tool A feeds into tool B                | Multi-step workflows           | Medium — error propagation
Parallel calls      | Multiple independent tools called simultaneously  | Fetching from multiple sources | Medium — harder to debug
Agentic loops       | Model iterates tool calls until task complete     | Complex open-ended tasks       | High — needs loop limit, human gate
Human-in-loop       | Pause for human approval before high-stakes tools | Write/delete/send operations   | Low — safest for irreversible actions
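The human-in-loop pattern can be as simple as a gate in front of tool execution. A sketch, with a hypothetical set of high-stakes tool names and an injectable confirm function (defaulting to input) so the gate is testable:

```python
HIGH_STAKES_TOOLS = {"send_email", "delete_file"}  # hypothetical tool names

def gated_execute(tool_name: str, tool_input: dict, execute, confirm=input) -> dict:
    """Require human approval before running high-stakes tools."""
    if tool_name in HIGH_STAKES_TOOLS:
        answer = confirm(f"Allow {tool_name} with {tool_input}? [y/N] ")
        if answer.strip().lower() != "y":
            # Returned to the model as a tool result it can reason about
            return {"error": f"{tool_name} denied by human reviewer"}
    return execute(tool_name, tool_input)

# Example: an auto-denying confirm stands in for a real prompt
denied = gated_execute("delete_file", {"path": "/tmp/x"},
                       execute=lambda name, inp: {"ok": True},
                       confirm=lambda prompt: "n")
allowed = gated_execute("get_weather", {"location": "Paris"},
                        execute=lambda name, inp: {"ok": True},
                        confirm=lambda prompt: "n")
print(denied)   # {'error': 'delete_file denied by human reviewer'}
print(allowed)  # {'ok': True} — read-only tools skip the gate entirely
```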