What is prompt engineering?

Prompt engineering is the practice of crafting inputs to AI language models to produce accurate, useful, and reliable outputs. It involves choosing the right words, structure, context, and format to guide the AI toward the response you actually need — rather than a generic or off-target one.

Which AI models benefit most from better prompting?

All major large language models, including ChatGPT (GPT-4o), Claude, and Gemini, are sensitive to prompt quality. The same task can produce dramatically different results depending on how you structure your request, so better prompting pays off whichever model you use.

Do I need technical skills to do prompt engineering?

No. Prompt engineering is done in natural language — you write text instructions, not code. Basic prompting needs no technical background at all. Advanced techniques like prompt chaining or agentic workflows can benefit from light scripting knowledge, but the core skill is clear written communication.

Where can I learn more about prompt engineering?

MasterPrompting.net offers a structured curriculum from beginner to advanced, covering every major technique from basic clarity and context to chain-of-thought, meta-prompting, and agentic workflows. Start with the Beginner track to build a solid foundation.

Claude Sonnet 4.6 — The Complete Guide

Claude Sonnet 4.6 is the model I reach for by default. Not because it's the most powerful thing Anthropic has shipped — Opus 4 exists for that — but because it hits a capability-to-cost-to-speed ratio that's hard to beat for production workloads. If you're building something real and want to understand exactly what you're working with, this is the guide.

What makes Sonnet 4.6 the sweet spot

There's a useful mental model for Anthropic's model lineup: Haiku for fast cheap tasks, Sonnet for most things, Opus for genuinely hard problems. That's been true for a while, but Sonnet 4.6 has closed the gap with Opus on most day-to-day tasks.

Concretely: Sonnet 4.6 handles multi-step coding tasks, complex document analysis, structured extraction at scale, and agentic tool-use loops without the cost overhead or latency of Opus. On coding benchmarks it scores close to Opus 4. On extended reasoning tasks where you enable thinking mode, it punches well above its price point.

The API model ID is claude-sonnet-4-6. Pass it exactly as written in the model field of your API calls, and check Anthropic's model documentation for the current list of IDs and aliases.

Context window: 200k tokens, used correctly

200,000 tokens. That's roughly 150,000 words or a 500-page book. In practice it means you can throw an entire codebase, a legal contract, a research corpus, or a year of Slack logs at a single prompt and get coherent answers back.

But bigger isn't automatically better. A few things to know:

Performance degrades in the middle. Claude, like many transformer-based models, tends to attend most reliably to the beginning and end of long contexts. If you have critical instructions or key facts, put them at the top of your system prompt or just before the user message, not buried in the middle of 50k tokens of context.

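Cache your large context blocks. If you're repeatedly sending the same large document or codebase, use prompt caching with cache_control breakpoints. The first call writes the cache at a 25% premium over the normal input price ($3.75 versus $3.00 per million tokens), and later calls read it at roughly 10% of the normal price. For a 100k-token context repeated across 100 API calls, that works out to close to a 90% reduction in input token costs for that context, provided the calls land while the cache is still live.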

Not everything needs 200k. Sending 150k tokens when your actual payload is 2k costs money and adds latency. Trim your context to what's relevant.
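The placement and caching advice above can be sketched in a few lines. This is a rough illustration rather than Anthropic's reference code: build_cached_request and input_cost are hypothetical helper names, the prices are the Sonnet 4.6 figures quoted in this guide, and the arithmetic assumes every call after the first is a cache hit.

```python
# Prices in USD per million tokens, as quoted in this guide's pricing section.
INPUT_PRICE = 3.00
CACHE_WRITE_PRICE = 3.75
CACHE_READ_PRICE = 0.30


def build_cached_request(instructions: str, document: str, question: str) -> dict:
    """Build keyword arguments for client.messages.create() (hypothetical helper).

    Short instructions go first, the large document follows with a
    cache_control breakpoint, and the question sits at the very end.
    """
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": instructions},
            {
                "type": "text",
                "text": document,
                # The prefix up to and including this block is what gets cached.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [{"role": "user", "content": question}],
    }


def input_cost(context_tokens: int, calls: int, cached: bool) -> float:
    """Input cost in USD for sending the same context on every call."""
    if calls <= 0:
        return 0.0
    millions = context_tokens / 1_000_000
    if not cached:
        return calls * millions * INPUT_PRICE
    # One cache write, then (calls - 1) cache reads.
    return millions * (CACHE_WRITE_PRICE + (calls - 1) * CACHE_READ_PRICE)


uncached = input_cost(100_000, 100, cached=False)
cached = input_cost(100_000, 100, cached=True)
print(f"saving: {1 - cached / uncached:.0%}")
```

With these assumptions the saving comes out at about 89%, slightly under the headline 90% because the first call pays the cache-write premium. The dictionary returned by build_cached_request would be passed as client.messages.create(**request).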

Pricing breakdown

Sonnet 4.6 pricing at time of writing:

Token type	Price per million tokens
Input	$3.00
Output	$15.00
Cache write	$3.75
Cache read	$0.30

Compare that to the lineup:

Model	Input	Output
Haiku 3.5	$0.80	$4.00
Sonnet 4.6	$3.00	$15.00
Opus 4	$15.00	$75.00

Opus is 5x more expensive on both input and output. For a typical agentic loop that makes 20 API calls per task, that multiplier applies to every call, so the difference adds up quickly. Sonnet 4.6 becomes the obvious choice unless you have a specific reason to go up or down the stack.
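To make the per-task difference concrete, here is a small cost calculator using the prices from the tables above. The workload (20 calls, 5,000 input tokens and 500 output tokens per call) is an invented example, and task_cost is a hypothetical helper, so treat the output as an illustration rather than a benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Prices as listed in this guide at time of writing.
PRICES = {
    "Haiku 3.5": Pricing(0.80, 4.00),
    "Sonnet 4.6": Pricing(3.00, 15.00),
    "Opus 4": Pricing(15.00, 75.00),
}


def task_cost(model: str, calls: int, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one task made of `calls` API calls with fixed token counts."""
    price = PRICES[model]
    input_cost = calls * input_tokens / 1_000_000 * price.input_per_mtok
    output_cost = calls * output_tokens / 1_000_000 * price.output_per_mtok
    return input_cost + output_cost


# Assumed workload: 20 calls per task, 5,000 tokens in and 500 tokens out per call.
for model in PRICES:
    print(f"{model}: ${task_cost(model, 20, 5_000, 500):.2f} per task")
```

Under these assumptions a task costs roughly $0.12 on Haiku 3.5, $0.45 on Sonnet 4.6, and $2.25 on Opus 4. Real agentic loops usually grow their input on every turn, so actual numbers will be higher, but the 5x ratio between Sonnet and Opus holds.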

India developers: AICredits lets you access the Claude API with INR billing via UPI — no USD card or international transaction fees needed.

Capabilities by task type

Coding

This is where Sonnet 4.6 earns its reputation. It writes clean, idiomatic code across Python, TypeScript, Go, Rust, and SQL without needing heavy hand-holding. More importantly, it understands diffs — you can hand it a broken pull request and ask what's wrong, and it'll actually find the bug rather than hallucinating one.

It handles multi-file context well. Paste in three related files and ask it to refactor a shared utility and it'll track the dependencies correctly. It's not perfect, but it's the best I've used at this outside of full IDE integration.

Tool use and function calling are strong. If you're building an agent that needs to call APIs, query databases, or chain tool outputs together, Sonnet 4.6 follows tool schemas reliably. Check how to design tools for AI agents for patterns that hold up in production.

Reasoning and analysis

Sonnet 4.6 handles multi-step logical problems, structured argumentation, and policy analysis well. It doesn't drift off track in long chains of reasoning the way some models do. For simpler reasoning tasks, just use the base model. For genuinely hard problems — complex financial modeling, multi-variable tradeoff analysis, legal interpretation — enable extended thinking mode.

Writing

Strong. It matches tone, holds a voice across a long document, and doesn't produce the flat corporate prose that plagues lesser models. The main failure mode is over-caution: it'll sometimes soften a strong claim or hedge where you don't want it to. Be explicit about tone in your system prompt.

Vision

Sonnet 4.6 reads images, charts, screenshots, and diagrams accurately. It's good at extracting structured data from screenshots (tables, forms, pricing pages) and at describing visual layouts precisely enough to be useful. It won't match a dedicated OCR tool for pure text extraction throughput, but for understanding visual context it's solid.

Structured outputs

Tell it to respond in JSON and it does. Tell it to follow a specific schema and it'll follow it. This is especially reliable when you combine explicit format instructions with XML tags in your prompt — more on that below.

Extended thinking mode

Extended thinking lets Sonnet 4.6 reason through a problem step by step before returning an answer. You enable it by passing a thinking block with a budget_tokens value in your API call.

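import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,  # must be larger than budget_tokens
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # upper bound on tokens spent reasoning
    },
    messages=[{"role": "user", "content": "...your hard problem here..."}]
)

# The final answer arrives in the text blocks, after any thinking blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)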

The thinking tokens are billed as output tokens, so enabling it adds cost. Turn it on when:

The task involves multi-step reasoning where intermediate steps matter
You're seeing inconsistent answers and want the model to slow down
You're doing complex math, logic proofs, or structured planning

Don't turn it on for simple classification, short generation tasks, or anything where you're just paying for tokens you don't need. The extended thinking guide covers budget tuning and when it moves the needle.
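One way to keep thinking opt-in is to build the request through a small helper and decide per task. The sketch below is an assumption-laden illustration: build_request is a hypothetical name, the defaults mirror the example above, and the check that the budget stays below max_tokens reflects my understanding of the API rather than anything stated in this guide.

```python
def build_request(prompt: str, use_thinking: bool,
                  max_tokens: int = 16000, budget_tokens: int = 10000) -> dict:
    """Return keyword arguments for client.messages.create() (hypothetical helper)."""
    request = {
        "model": "claude-sonnet-4-6",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    if use_thinking:
        # Assumption: thinking tokens count toward the output limit,
        # so the budget has to fit inside max_tokens.
        if budget_tokens >= max_tokens:
            raise ValueError("budget_tokens must be smaller than max_tokens")
        request["thinking"] = {"type": "enabled", "budget_tokens": budget_tokens}
    return request


# Hard planning problem: pay for thinking. Simple classification: skip it.
hard = build_request("Plan a zero-downtime database migration.", use_thinking=True)
easy = build_request("Is this review positive or negative? 'Great product.'",
                     use_thinking=False)
print("thinking" in hard, "thinking" in easy)  # prints: True False
```

Either dictionary can then be sent with client.messages.create(**request), which keeps the on/off decision in one place instead of scattered across call sites.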

Prompting patterns that work with Sonnet 4.6

Use XML tags for structure

Sonnet 4.6 was trained with XML-tagged prompts. It picks up on <context>, <task>, <format>, <examples> blocks and handles each section with more precision than freeform text. A prompt like this:

<context>
You are reviewing a Python microservice that handles payment processing.
</context>

<task>
Identify all places where exceptions are swallowed silently (bare except clauses or 
except Exception: pass patterns). For each one, explain the risk and suggest a fix.
</task>

<format>
Return a numbered list. For each issue: file path + line number, the problematic code, 
the risk, and the recommended fix.
</format>

...will outperform an equivalent freeform prompt on precision and completeness.
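If you build prompts like this in code, a tiny helper keeps the tags consistent. This is a minimal sketch; xml_prompt is a hypothetical function name, and the tag names are whatever you pass in, not a fixed vocabulary.

```python
def xml_prompt(**sections: str) -> str:
    """Wrap each named section in matching XML-style tags, in the order given."""
    parts = []
    for tag, body in sections.items():
        parts.append(f"<{tag}>\n{body.strip()}\n</{tag}>")
    return "\n\n".join(parts)


prompt = xml_prompt(
    context="You are reviewing a Python microservice that handles payment processing.",
    task="Identify all places where exceptions are swallowed silently.",
    format="Return a numbered list with file path, line number, risk, and fix.",
)
print(prompt)
```

Keyword arguments preserve their order in Python, so the sections come out in the order you write them. The resulting string goes into the user message, or into the system prompt if it holds standing instructions.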

Give explicit output format instructions

Don't assume the model will pick the right format. If you want JSON, say "Respond with a JSON object matching this schema: {...}". If you want a numbered list with specific fields, show the structure. Sonnet 4.6 follows explicit format instructions reliably, which means you can skip a lot of output parsing headaches downstream.
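Even with reliable format-following, a cheap validation step catches the occasional malformed reply. The sketch below pairs an explicit format instruction with a parser. The schema, the field names, and both helper functions are invented for illustration, and sample_reply is a hand-written stand-in for model output.

```python
import json

# Hypothetical schema for a code review finding.
REQUIRED_KEYS = {"file", "line", "risk"}
SCHEMA_HINT = '{"file": "<path>", "line": <integer>, "risk": "<one sentence>"}'


def format_instruction() -> str:
    """Explicit output-format instruction to append to a prompt."""
    return (
        "Respond with a single JSON object matching this schema and nothing else: "
        + SCHEMA_HINT
    )


def parse_reply(reply_text: str) -> dict:
    """Parse a reply and check that the required keys are present."""
    data = json.loads(reply_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Hand-written stand-in for a model reply.
sample_reply = '{"file": "billing.py", "line": 88, "risk": "Exception is swallowed."}'
print(parse_reply(sample_reply)["line"])  # prints: 88
```

On a validation failure, a common pattern is to retry once with the error message included in the prompt. That retry policy is a design choice, not something the API does for you.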

System prompt placement matters

For agentic tasks with tool use, keep your core behavioral instructions in the system prompt. Put task-specific details in the user message. Sonnet 4.6 treats the system prompt as higher-priority context — role definition, output format rules, and safety constraints all belong there.

Few-shot examples work

If you have a task where you know what good output looks like, include 2-3 examples. Sonnet 4.6 learns from them quickly. Even one good example dramatically reduces format drift on structured extraction tasks.
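One common way to supply few-shot examples through the API is as alternating user and assistant turns placed before the real query. The helper below is a sketch under that assumption; few_shot_messages is a hypothetical name, and the invoice examples are made up.

```python
def few_shot_messages(examples: list[tuple[str, str]], query: str) -> list[dict]:
    """Turn (input, ideal output) pairs into a messages list ending with the query."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages


# Made-up extraction examples showing the exact output shape we want.
examples = [
    ("Invoice 1041, due 2024-03-01, total $250.00",
     '{"invoice": 1041, "due": "2024-03-01", "total": 250.0}'),
    ("Invoice 1042, due 2024-03-15, total $99.50",
     '{"invoice": 1042, "due": "2024-03-15", "total": 99.5}'),
]
messages = few_shot_messages(examples, "Invoice 1043, due 2024-04-01, total $12.00")
print(len(messages))  # prints: 5
```

The list is passed as the messages argument of client.messages.create(). An alternative with the same effect is to place the examples inside an examples tag in a single prompt, which keeps the conversation history shorter.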

A complete API example

Here's a working example: a basic API call with a system prompt, a user message, and a tool definition for a code review agent.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Sample input; replace with the code you want reviewed.
code_to_review = 'query = "SELECT * FROM users WHERE id = " + user_id'

# One tool the model can call to report each issue it finds.
tools = [
    {
        "name": "create_issue",
        "description": "Create a code review issue with a severity level and suggested fix.",
        "input_schema": {
            "type": "object",
            "properties": {
                "file": {"type": "string", "description": "File path"},
                "line": {"type": "integer", "description": "Line number"},
                "severity": {
                    "type": "string",
                    "enum": ["critical", "major", "minor", "suggestion"]
                },
                "description": {"type": "string"},
                "suggested_fix": {"type": "string"}
            },
            "required": ["file", "line", "severity", "description"]
        }
    }
]

# Behavioral instructions live in the system prompt; the task goes in the user message.
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    system="""You are a senior engineer conducting a security-focused code review.
For each issue you find, call the create_issue tool. Focus on: SQL injection, 
unvalidated inputs, hardcoded credentials, and insecure deserialization.""",
    tools=tools,
    messages=[
        {
            "role": "user",
            "content": f"Review this code:\n\n```python\n{code_to_review}\n```"
        }
    ]
)
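The call above only sends the request. The reported issues come back as tool_use blocks in response.content, and a sketch of collecting them might look like this. collect_issues is a hypothetical helper, and the fake blocks are stand-ins shaped like the SDK's content blocks so the snippet runs without an API key.

```python
from types import SimpleNamespace


def collect_issues(content_blocks) -> list[dict]:
    """Gather the inputs of every create_issue tool call in a response."""
    issues = []
    for block in content_blocks:
        if getattr(block, "type", None) == "tool_use" and block.name == "create_issue":
            issues.append(dict(block.input))
    return issues


# Stand-in blocks shaped like response.content, so the sketch runs offline.
fake_content = [
    SimpleNamespace(type="text", text="I found one issue."),
    SimpleNamespace(
        type="tool_use",
        id="toolu_example",
        name="create_issue",
        input={
            "file": "app.py",
            "line": 42,
            "severity": "critical",
            "description": "SQL query built by string concatenation.",
        },
    ),
]

for issue in collect_issues(fake_content):
    print(issue["severity"], issue["file"], issue["line"])
```

With a real response you would call collect_issues(response.content). In a full agent loop you would also send a tool_result message back for each tool call so the model can continue; that step is left out here.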

For a deeper comparison of Claude's API against OpenAI's, including auth, rate limits, and SDK differences, see Claude API vs OpenAI API.

Benchmarks: where Sonnet leads, where Opus pulls ahead

Sonnet 4.6 matches or approaches Opus 4 on:

HumanEval (coding)
MMLU (knowledge breadth)
GSM8K (math reasoning)
Most structured extraction tasks

Opus 4 pulls ahead on:

Long-document synthesis requiring deep cross-referencing
Multi-hop reasoning chains with 5+ logical steps
Research tasks requiring nuanced judgment calls

If your task involves generating code, extracting structured data, writing documents, or running agentic workflows with clear tool schemas, Sonnet 4.6 is the right call. If you're doing frontier research synthesis or genuinely hard multi-step reasoning at scale, Opus starts to justify its 5x price.

When to use Haiku instead

Haiku 3.5 is fast and cheap. Use it when:

You need sub-200ms response times for user-facing features
The task is simple classification, routing, or short extraction
You're running high-volume batch jobs where Sonnet's cost adds up
You're building a system that makes 10,000+ API calls per day on simple tasks

The pattern I use: Haiku for the cheap outer loop (routing, classification, filtering), Sonnet for the tasks that require actual reasoning. You can cut API costs by 60-70% this way on high-volume systems without degrading quality where it matters.
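A rough sketch of that routing pattern, with the savings arithmetic alongside it. Everything here is illustrative: pick_model and blended_saving are hypothetical helpers, the Haiku model ID is a placeholder you would replace with the real one, and the saving assumes Haiku and Sonnet calls use similar token counts.

```python
# Task types cheap enough for the outer loop (taken from the pattern above).
SIMPLE_TASKS = {"routing", "classification", "filtering"}

HAIKU_MODEL = "your-haiku-model-id"  # placeholder, not a real model ID
SONNET_MODEL = "claude-sonnet-4-6"


def pick_model(task_type: str) -> str:
    """Send simple task types to Haiku and everything else to Sonnet."""
    return HAIKU_MODEL if task_type in SIMPLE_TASKS else SONNET_MODEL


def blended_saving(haiku_share: float,
                   haiku_price: float = 0.80,
                   sonnet_price: float = 3.00) -> float:
    """Fractional cost saving versus sending every call to Sonnet.

    Assumes each call uses the same number of tokens regardless of model.
    """
    blended = haiku_share * haiku_price + (1 - haiku_share) * sonnet_price
    return 1 - blended / sonnet_price


print(pick_model("classification"), pick_model("refactoring"))
print(f"{blended_saving(0.9):.0%}")  # prints: 66%
```

With the listed prices, routing 90% of calls to Haiku gives about a 66% saving, which lands inside the 60-70% range quoted above. The output-price ratio between the two models is the same as the input-price ratio, so the figure holds for output tokens too.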

When to upgrade to Opus

Upgrade to Opus 4 when you're hitting the ceiling on Sonnet. Signs you need it:

Sonnet is hallucinating on complex multi-document synthesis tasks
Your multi-step reasoning chain is producing inconsistent results even with extended thinking
You're doing legal, medical, or financial analysis where nuance and accuracy outweigh cost
You're building a Claude Projects-style system with extremely complex persistent context

Don't jump to Opus as a default. Start with Sonnet, identify where quality is falling short, then upgrade selectively.

The bottom line

Claude Sonnet 4.6 is a production-grade model. It's what you run when you don't have a specific reason to go cheaper or more expensive. The 200k context window is genuinely useful, not just a spec sheet number. The XML-aware prompting, reliable tool use, and strong coding performance make it the default for anything agentic.

Use the model ID claude-sonnet-4-6, structure your prompts with XML tags, cache your large context blocks, and turn on extended thinking only when you actually need it. That's the short version.

Related articles

Claude Max Plan — What You Get and Whether It's Worth It

Async Python for LLM Apps — Patterns That Actually Work in Production

50 Best AI Prompts for Claude That Actually Work (2026)
