Why Function Calling Matters
An LLM without tools is limited to what it knows from training. Function calling changes that fundamentally — it gives the model a way to request actions in the real world and receive structured results back.
This is what enables agents to:
- Look up real-time information
- Run calculations
- Read and write files
- Query databases
- Control browsers
- Call APIs
Function calling is the technical plumbing behind all of that.
How It Works: The Three-Step Flow
Function calling follows a predictable three-step loop:
Step 1: You Define the Tools
Before the conversation starts, you tell the model what tools are available by providing a schema for each one:
```json
[
  {
    "name": "get_weather",
    "description": "Get the current weather for a city. Use this when the user asks about weather conditions.",
    "input_schema": {
      "type": "object",
      "properties": {
        "city": {
          "type": "string",
          "description": "The city name, e.g. 'London' or 'Tokyo'"
        },
        "unit": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"],
          "description": "Temperature unit. Default to celsius."
        }
      },
      "required": ["city"]
    }
  }
]
```
Step 2: The Model Decides to Call a Tool
When relevant, the model responds not with text but with a tool call request — a structured message specifying the tool name and arguments:
```json
{
  "type": "tool_use",
  "name": "get_weather",
  "input": {
    "city": "London",
    "unit": "celsius"
  }
}
```
Your code detects this, runs the actual get_weather("London", "celsius") function, and returns the result.
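In code, that dispatch step can be as simple as a dictionary mapping tool names to functions. A minimal sketch, where `get_weather` is a hypothetical stand-in (a real implementation would call a weather API):

```python
# Hypothetical tool implementation — a real one would call a weather service.
def get_weather(city, unit="celsius"):
    return f"Current weather in {city}: 12°{'C' if unit == 'celsius' else 'F'}, overcast."

# Map each tool name from your schema to the function that implements it.
TOOL_REGISTRY = {
    "get_weather": get_weather,
}

def dispatch_tool_call(tool_call):
    """Look up the function named in a tool_use block and run it with the model's arguments."""
    func = TOOL_REGISTRY[tool_call["name"]]
    return func(**tool_call["input"])

result = dispatch_tool_call({
    "type": "tool_use",
    "name": "get_weather",
    "input": {"city": "London", "unit": "celsius"},
})
```

Keeping a single registry means adding a new tool is one schema entry plus one dictionary entry, with no change to the dispatch logic.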
Step 3: The Model Receives the Result and Continues
You send the tool result back to the model:
```json
{
  "type": "tool_result",
  "content": "Current weather in London: 12°C, overcast with light rain. Wind: 15 km/h from the southwest."
}
```
The model now incorporates this into its response, continuing the conversation or making another tool call if needed.
Writing Good Tool Descriptions
The description field is the most important part of any tool definition. It's what the model reads to decide when to call the tool.
Bad description:
```json
{
  "name": "search",
  "description": "Search for information"
}
```
The model has no idea when to use this versus just answering from memory.
Good description:
```json
{
  "name": "search_web",
  "description": "Search the internet for current, real-time information. Use this when: (1) the question requires up-to-date data like prices, news, or recent events, (2) you are not confident your training data is accurate or recent enough, or (3) the user asks about something that changes frequently. Do NOT use this for general knowledge questions you can answer reliably."
}
```
This tells the model exactly when to reach for this tool — and when not to.
Rule: Write your description as if you're telling a capable but literal junior employee when to use this resource.
Parallel Tool Calls
Modern models can call multiple tools simultaneously in a single turn when the tasks are independent. This dramatically speeds up complex workflows.
Example: A research agent building a competitive analysis might simultaneously:
- Search for Company A's recent news
- Search for Company B's recent news
- Fetch Company A's latest financial data
- Fetch Company B's latest financial data
Instead of 4 sequential turns, the agent does all 4 in one turn and waits for all results.
Enable parallel tool calls in your API configuration — most providers support this by default on capable models.
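On the client side, when one response contains several independent tool-call requests, you can execute them concurrently before sending all the results back. A sketch using a thread pool; `search_news` and `fetch_financials` are placeholders for real I/O-bound tools:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder tools standing in for real network-bound lookups.
def search_news(company):
    return f"Top headlines for {company}"

def fetch_financials(company):
    return f"Latest financials for {company}"

TOOLS = {"search_news": search_news, "fetch_financials": fetch_financials}

def run_tool_calls_in_parallel(tool_calls):
    """Execute independent tool calls concurrently, preserving the request order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(TOOLS[c["name"]], **c["input"]) for c in tool_calls]
        return [f.result() for f in futures]

calls = [
    {"name": "search_news", "input": {"company": "Company A"}},
    {"name": "search_news", "input": {"company": "Company B"}},
    {"name": "fetch_financials", "input": {"company": "Company A"}},
    {"name": "fetch_financials", "input": {"company": "Company B"}},
]
results = run_tool_calls_in_parallel(calls)
```

Threads work well here because tool calls are usually network-bound; for CPU-heavy tools, a process pool is the better fit.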
Handling Tool Results Well
How you structure your tool results affects agent reliability significantly.
Include context, not just data
Poor result:
"12°C"
Better result:
"Current weather in London (as of 14:32 UTC, Feb 26 2026): 12°C, overcast with light rain. Forecast: rain continuing through the evening."
The richer result gives the model more to work with and reduces follow-up tool calls.
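One way to enforce this consistently is a small formatting layer that stamps every result with a retrieval time and source before it goes back to the model. A sketch; the function and field names are illustrative:

```python
from datetime import datetime, timezone

def format_tool_result(source, data, extra=None):
    """Wrap raw tool output with a timestamp and source label so the model has context."""
    timestamp = datetime.now(timezone.utc).strftime("%H:%M UTC, %b %d %Y")
    result = f"{source} (as of {timestamp}): {data}"
    if extra:
        result += f" {extra}"
    return result

message = format_tool_result(
    "Current weather in London",
    "12°C, overcast with light rain.",
    extra="Forecast: rain continuing through the evening.",
)
```

Centralizing this means every tool gets timestamps for free, rather than relying on each tool author to remember them.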
Surface errors clearly
If a tool fails, return a structured error:
```json
{
  "error": true,
  "message": "Could not retrieve weather data for 'Londn' — city not found. Did you mean 'London'?"
}
```
This gives the agent a chance to self-correct rather than silently failing.
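A thin wrapper around tool execution can convert exceptions into this kind of structured error instead of crashing the loop. A minimal sketch, with a toy `get_weather` standing in for a real tool:

```python
def safe_tool_call(func, **kwargs):
    """Run a tool and return either its result or a structured error the model can read."""
    try:
        return {"error": False, "content": func(**kwargs)}
    except Exception as exc:
        # Return the failure as data so the model can retry or ask the user.
        return {"error": True, "message": f"{type(exc).__name__}: {exc}"}

def get_weather(city):
    known = {"London": "12°C, overcast"}
    if city not in known:
        raise ValueError(f"Could not retrieve weather data for '{city}': city not found.")
    return known[city]

ok = safe_tool_call(get_weather, city="London")
failed = safe_tool_call(get_weather, city="Londn")
```

Wrapping at the dispatch layer, rather than inside each tool, guarantees no single misbehaving tool can take down the whole agent loop.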
Keep results focused
Don't flood the model with unnecessary data. If a database query returns 500 rows, summarize or paginate. Too much data can overwhelm the context window and obscure the relevant information.
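For large query results, a simple guard that truncates the output and reports what was cut keeps the context window manageable. A sketch, assuming rows arrive as strings:

```python
def cap_rows(rows, limit=20):
    """Return at most `limit` rows, plus a note about how many were omitted."""
    if len(rows) <= limit:
        return "\n".join(rows)
    shown = "\n".join(rows[:limit])
    return f"{shown}\n... ({len(rows) - limit} more rows omitted; refine the query to see them)"

# A 500-row result gets capped to 20 rows and an omission note.
summary = cap_rows([f"row {i}" for i in range(500)], limit=20)
```

Telling the model *that* data was omitted, and how to get more, is what lets it decide whether to refine the query instead of assuming it saw everything.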
A Complete Example: Claude with Tool Use
Here's how function calling looks with Claude's API (simplified):
```python
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "search_web",
        "description": "Search the internet for current information.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"]
        }
    }
]

# First turn — model decides to call a tool
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the latest news about fusion energy?"}]
)

# Check if the model wants to use a tool
if response.stop_reason == "tool_use":
    tool_call = next(b for b in response.content if b.type == "tool_use")

    # Run the actual function
    search_result = search_web(tool_call.input["query"])

    # Send the result back
    final_response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "What's the latest news about fusion energy?"},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": tool_call.id, "content": search_result}
            ]}
        ]
    )
```
Common Pitfalls
Pitfall 1: Too many tools
Giving the model 30 tools at once leads to confusion and wrong selections. Keep your active tool set small and relevant. If you need many tools, use a routing layer to present only the relevant subset for each task type.
Pitfall 2: Vague tool names
process_data tells the model nothing. extract_entities_from_text is immediately clear.
Pitfall 3: Not handling tool failures
If your tool throws an exception and crashes the loop, the agent stops entirely. Always wrap tool execution in error handling and return structured error messages the model can reason about.
Pitfall 4: Ignoring the result
Always send tool results back to the model in the next message. A common mistake is running the tool but not adding the result to the conversation, leaving the model to hallucinate what the result might have been.
Key Takeaways
- Function calling is a three-step loop: define tools → model requests a call → you run it and return the result
- Tool descriptions are prompts — write them clearly to guide the model's selection
- Good results include context, timestamps, and clear error messages
- Parallel tool calls speed up multi-step tasks dramatically
- Keep tool sets small and focused; use routing if you need many tools
- Next lesson: ReAct prompting — the reasoning pattern that makes function-calling agents reliable