An AI agent isn't a chatbot with extra steps. A chatbot responds to a message. An agent reasons about a goal, decides what actions to take, executes those actions using tools, observes the results, and repeats until the job is done. The difference is autonomy over a sequence of steps — not just generating text.
Claude is particularly well-suited for agent work. It follows instructions reliably, handles complex tool schemas without hallucinating calls, and its long context window means it can maintain state across many steps without losing track of what's happened.
This tutorial builds a real agent from scratch: one that can search for information, process the results, and return a structured answer. You'll have working code by the end.
What you need
- A Claude API key (get one at console.anthropic.com)
- Python 3.9+ with the anthropic package installed (pip install anthropic)
- Basic Python comfort — you don't need to understand transformers
How Claude tool use works
Before writing code, understand the loop. When you give Claude tools, every conversation follows this pattern:
- You send a message with a list of available tools (name, description, input schema)
- Claude responds — either with a text answer, or with a tool_use block requesting a specific tool call
- You execute the tool and send back the result in a tool_result block
- Claude continues — it reads the result and either calls another tool or gives a final answer
- Repeat until Claude returns a final text response
This loop is what makes agents different from single-shot completions. Claude is directing its own workflow, not just responding to yours.
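Concretely, the two content-block shapes the loop exchanges look like this (illustrative values; the id is whatever the API generates):

```python
tool_use_block = {          # Claude's request, inside an assistant message
    "type": "tool_use",
    "id": "toolu_01A",      # illustrative id, generated by the API
    "name": "web_search",
    "input": {"query": "latest Claude model"},
}

tool_result_block = {       # your reply, inside the next user message
    "type": "tool_result",
    "tool_use_id": "toolu_01A",  # must match the tool_use block's id
    "content": "Claude Sonnet 4.6 was released in February 2026 ...",
}
```

The pairing via tool_use_id is what lets Claude match each result back to the call that produced it when several tools run in one turn.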
Step 1: Define your tools
Tools are just JSON schemas. You describe what a function does and what parameters it takes, and Claude decides when to call it.
tools = [
    {
        "name": "web_search",
        "description": "Search the web for current information. Use this when you need facts, recent events, or specific data you don't know.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query. Be specific — use keywords, not natural language questions."
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "calculate",
        "description": "Evaluate a mathematical expression and return the result. Use for arithmetic, unit conversions, or any calculation.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "A valid Python math expression, e.g. '150 * 0.07' or '(42 + 18) / 3'"
                }
            },
            "required": ["expression"]
        }
    }
]
Tool description quality matters a lot. Claude decides which tool to call (and whether to call one at all) based entirely on the description. Be specific about when to use each tool, not just what it does. A vague description produces unreliable tool selection.
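For example, compare a vague description with a specific one (both hypothetical):

```python
# Vague: gives Claude no signal about when to reach for the tool.
bad = {"name": "web_search", "description": "Searches the web."}

# Specific: states when to use it and when not to.
good = {
    "name": "web_search",
    "description": (
        "Search the web for current information. Use when you need facts, "
        "recent events, or data you don't know. Do not use for math or for "
        "anything already stated in the conversation."
    ),
}
```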
Step 2: Implement the tool functions
These are regular Python functions. Claude doesn't execute them — you do. Claude just tells you which one to call and with what arguments.
import anthropic
import json

def web_search(query: str) -> str:
    # In a real agent, wire this to SerpAPI, Brave Search, Tavily, etc.
    # For this tutorial, we mock it.
    mock_results = {
        "latest Claude model": "Claude Sonnet 4.6 was released in February 2026 with a 1M token context window.",
        "Python version": "Python 3.13 is the latest stable version as of early 2026.",
    }
    for key, value in mock_results.items():
        if key.lower() in query.lower():
            return value
    return f"Search results for '{query}': No specific results found in mock database."

def calculate(expression: str) -> str:
    try:
        # eval is only acceptable here because builtins are stripped and
        # this is a tutorial — don't eval untrusted input in production.
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error evaluating expression: {e}"

def execute_tool(tool_name: str, tool_input: dict) -> str:
    if tool_name == "web_search":
        return web_search(tool_input["query"])
    elif tool_name == "calculate":
        return calculate(tool_input["expression"])
    else:
        return f"Unknown tool: {tool_name}"
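Even with builtins stripped, calling eval on model-generated strings is uncomfortable. A safer sketch walks the expression's AST and allows only numeric literals and arithmetic operators (the function name and error messages here are ours, not part of the tutorial's API):

```python
import ast
import operator

# Whitelist of allowed operators; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_calculate(expression: str) -> str:
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("disallowed syntax in expression")
    try:
        return str(_eval(ast.parse(expression, mode="eval")))
    except Exception as e:
        return f"Error evaluating expression: {e}"
```

This drops into execute_tool as a replacement for calculate without changing the tool schema.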
Step 3: Build the agent loop
This is the core of the agent. It handles the back-and-forth between Claude and your tools until Claude signals it's done.
from typing import Optional

def run_agent(user_message: str, system_prompt: Optional[str] = None) -> str:
    client = anthropic.Anthropic()
    messages = [{"role": "user", "content": user_message}]
    if not system_prompt:
        system_prompt = """You are a helpful research assistant.
You have access to web search and calculation tools.
Always use tools to verify facts rather than relying on your training data for current information.
When you have enough information to answer the question, stop calling tools and give a clear, direct answer."""
    print(f"\nUser: {user_message}\n")
    # Agent loop — runs until Claude stops requesting tools
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=4096,
            system=system_prompt,
            tools=tools,
            messages=messages,
        )
        # Add Claude's response to message history
        messages.append({"role": "assistant", "content": response.content})
        # Check if Claude is done (no tool calls)
        if response.stop_reason == "end_turn":
            # Extract and return the final text response
            return "".join(
                block.text for block in response.content if hasattr(block, "text")
            )
        # Process tool calls
        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    print(f"  → Calling tool: {block.name}({block.input})")
                    result = execute_tool(block.name, block.input)
                    print(f"  ← Result: {result}\n")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
            # Add tool results to message history and continue the loop
            messages.append({"role": "user", "content": tool_results})
            continue
        # Anything else (e.g. max_tokens): stop rather than loop forever
        return f"Agent stopped: unexpected stop_reason {response.stop_reason!r}"
Step 4: Run it
if __name__ == "__main__":
    answer = run_agent(
        "What's the latest Claude model, and if it costs $3 per million input tokens, "
        "how much would 50 million tokens cost?"
    )
    print(f"\nFinal answer:\n{answer}")
Output:
User: What's the latest Claude model, and if it costs $3 per million input tokens, how much would 50 million tokens cost?
→ Calling tool: web_search({'query': 'latest Claude model 2026'})
← Result: Claude Sonnet 4.6 was released in February 2026 with a 1M token context window.
→ Calling tool: calculate({'expression': '50 * 3'})
← Result: 150
Final answer:
The latest Claude model is Claude Sonnet 4.6, released in February 2026. At $3 per million input tokens, 50 million tokens would cost $150.
Claude searched for the model, ran the calculation, and combined the results — all without you orchestrating which tools to call or in what order.
Making it production-ready
The tutorial above works. Here's what you'd add for anything beyond a prototype:
Error handling in the loop
# Add a max_iterations guard to prevent infinite loops
max_iterations = 10
iteration = 0
while iteration < max_iterations:
    iteration += 1
    # ... rest of the loop body from Step 3 (it returns from inside
    # the loop when Claude produces a final answer)
# Falling out of the loop means Claude never finished
return "Agent reached maximum iterations without completing the task."
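Transient API failures (rate limits, network blips) deserve retries too. Here is a generic sketch; the exception tuple is a placeholder you would swap for your SDK's transient error types, and the backoff schedule is an assumption:

```python
import time

def with_retries(fn, retries=3, transient=(ConnectionError,), base_delay=1.0):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except transient:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))
```

In the agent loop you would wrap the API call: response = with_retries(lambda: client.messages.create(...)).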
Persistent memory
For a real agent, you need memory that persists across conversations. Options:
- Simple: store message history in a database, load it at the start of each conversation
- Advanced: use a vector database to retrieve relevant past context (this is agentic RAG)
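The simple option can be sketched in a few lines with SQLite (table and function names here are ours, for illustration):

```python
import json
import sqlite3

def init_db(path=":memory:"):
    # One row per conversation; the message list is stored as JSON.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conversations (id TEXT PRIMARY KEY, messages TEXT)"
    )
    return conn

def save_messages(conn, conv_id, messages):
    conn.execute(
        "INSERT OR REPLACE INTO conversations VALUES (?, ?)",
        (conv_id, json.dumps(messages)),
    )
    conn.commit()

def load_messages(conn, conv_id):
    row = conn.execute(
        "SELECT messages FROM conversations WHERE id = ?", (conv_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Load at the start of run_agent, save after the loop returns, and the agent picks up where it left off.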
Real tool implementations
Replace the mock web_search with an actual search API. Tavily and Brave Search both have Python SDKs and are commonly used in agent setups. Tavily is particularly popular because it returns clean, structured results that Claude can reason about easily.
Logging and observability
Log every tool call, every response, and every tool result. When agents fail in production, the logs are your only way to understand what happened. Tools like LangSmith and Braintrust are designed specifically for this.
System prompt hardening
Add explicit failure handling to your system prompt:
- If a tool returns an error, try a different approach — don't just report the error.
- If you've tried three times and still can't complete the task, explain what you were unable to do and why.
- Never make up information to fill gaps — use tools or say you don't know.
What to build next
Once you have the basic loop working, the interesting problems are:
- Multiple agents — one agent decomposes a task, others execute subtasks in parallel. The multi-agent systems lesson covers the patterns.
- Structured output — make Claude return JSON instead of text so your application can parse the result programmatically.
- MCP tools — instead of defining tool schemas manually, connect Claude to MCP servers that already expose tools for Notion, GitHub, Postgres, etc. See the MCP protocol guide.
- Evaluation — before deploying, build a test suite of known inputs and expected outputs. The evaluating agents lesson covers how.
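One common pattern for structured output (a sketch, not the only approach): define an "answer" tool whose input schema is your output schema, then force Claude to call it by passing tool_choice={"type": "tool", "name": "record_answer"} to messages.create. The tool and field names below are hypothetical:

```python
answer_tool = {
    "name": "record_answer",  # hypothetical tool name
    "description": "Record the final answer in structured form.",
    "input_schema": {
        "type": "object",
        "properties": {
            "model_name": {"type": "string"},
            "total_cost_usd": {"type": "number"},
        },
        "required": ["model_name", "total_cost_usd"],
    },
}

# The resulting tool_use block's `input` arrives as already-parsed
# JSON matching the schema, e.g.:
example_input = {"model_name": "Claude Sonnet 4.6", "total_cost_usd": 150.0}
```

Because the API validates tool input against the schema, this is more reliable than asking for JSON in prose and parsing the text reply.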
The agent loop itself is simple. Everything interesting happens in the quality of your tools, the robustness of your error handling, and the precision of your system prompt. Start small, log everything, and iterate.
The full code for this tutorial is in the coding section of the prompt library — with a copy button.