Released February 5, 2026, Claude Opus 4.6 is the largest capability jump in a single Claude generation. Three things changed how you should prompt it: adaptive thinking, the effort parameter, and 1M context going GA. If you're still using the old budget_tokens approach, you're overpaying — and the parameter is deprecated.
## What changed from Claude 4.5 to 4.6 (for prompters)
The surface API looks similar. The behaviour underneath is substantially different.
| Feature | Claude 4.5 | Claude 4.6 |
|---|---|---|
| Thinking mode | budget_tokens required | Adaptive (auto) |
| Effort control | Not available | low / medium / high / max |
| Context window | 200K | 1M (Opus & Sonnet) |
| Max output | 64K | 128K (Opus), 64K (Sonnet) |
| Prefill | Supported | Returns 400 error |
| Legacy 3.5 models | Available | Retired; replaced by Sonnet 4.6 / Opus 4.6 |
The retired models matter: claude-3-5-sonnet-20241022 and claude-haiku-3-5-20241022 are gone as of ~April 2026. If your code still references those strings, you'll get errors.
## Adaptive thinking — how it actually works
The old extended thinking API required you to specify how many tokens Claude was "allowed" to think with. This was a blunt instrument — you either over-allocated (wasting money) or under-allocated (degrading quality on hard problems).
### Old way (deprecated, avoid)

```python
# Don't do this — budget_tokens is deprecated in Claude 4.6
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # ❌ deprecated
    },
    messages=[{"role": "user", "content": "Solve this..."}]
)
```
### New way (adaptive)

```python
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},  # ✅ Claude decides internally
    effort="high",  # controls depth of reasoning
    messages=[{"role": "user", "content": "Solve this..."}]
)
```
With adaptive thinking, Claude decides internally when and how much to think. At effort="high" (the default for Opus 4.6), it almost always engages the thinking module. At effort="low", it skips thinking entirely for simple problems and responds directly. The model is better at calibrating this than you are with a manual token budget.
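Because the model chooses when to think, a response may or may not contain thinking content. A small helper can separate the two kinds of block. This is a sketch under an assumption: that adaptive-thinking responses return content blocks shaped like `{"type": "thinking", "thinking": ...}` alongside `{"type": "text", "text": ...}`, as extended thinking did.

```python
# Sketch: split a response's content blocks into thinking traces and
# visible text. The block shapes ({"type": "thinking", ...} and
# {"type": "text", ...}) are assumed, mirroring the extended-thinking API.
def split_blocks(content: list[dict]) -> tuple[list[str], list[str]]:
    """Return (thinking_texts, visible_texts) from a content-block list."""
    thinking = [b.get("thinking", "") for b in content if b.get("type") == "thinking"]
    text = [b.get("text", "") for b in content if b.get("type") == "text"]
    return thinking, text
```

At `effort="low"` on a simple request, the thinking list may simply come back empty, which is exactly the adaptive behaviour described above.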
## The effort parameter — your main cost-quality lever
The effort parameter is new in 4.6 and is the primary way to control the cost-quality tradeoff.
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4000,
    thinking={"type": "adaptive"},
    effort="medium",  # low | medium | high | max
    messages=[{"role": "user", "content": prompt}]
)
```
| Effort | Use case | Relative cost |
|---|---|---|
| `low` | Classification, routing, extraction, yes/no decisions | Cheapest |
| `medium` | General coding, text generation, summarisation, standard Q&A | ~40–60% cheaper than `high` |
| `high` | Default for Opus 4.6. Complex reasoning, multi-step planning | Standard |
| `max` | Hardest problems only: architecture decisions, novel research. Opus 4.6 only | Most expensive |
Anthropic's own documentation recommends medium as the default for most Sonnet 4.6 use cases. That's meaningful — it means the engineering team is confident that most tasks don't need full-depth reasoning.
### Rule of thumb
Start all Sonnet 4.6 tasks at effort="medium". Drop to low for classification and routing. Reserve max exclusively for Opus 4.6 on problems you'd genuinely spend an hour thinking through yourself.
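The rule of thumb above can be sketched as a small routing helper. The task categories here are illustrative assumptions, not an official taxonomy:

```python
# Minimal effort-routing sketch following the rule of thumb above.
# Task categories are the author's assumption, not an official taxonomy.
CHEAP_TASKS = {"classification", "routing", "extraction"}
HARD_TASKS = {"architecture", "research"}

def pick_effort(task_type: str, model: str = "claude-sonnet-4-6") -> str:
    """Map a task category to an effort level per the rule of thumb."""
    if task_type in CHEAP_TASKS:
        return "low"
    if task_type in HARD_TASKS and model == "claude-opus-4-6":
        return "max"  # max is Opus 4.6 only
    return "high" if model == "claude-opus-4-6" else "medium"
```

Centralising this choice in one function makes it easy to tune effort levels later from observed cost and quality, rather than hard-coding them at every call site.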
## The 1M token context window — what you can actually do now
The 1M context window is now GA for both Opus 4.6 and Sonnet 4.6 — no beta header required. In practice, what fits?
- Entire codebase: ~50,000 lines of code
- Legal document set: ~200 average contracts
- Research corpus: ~1,500 academic papers (abstract + intro sections)
- Meeting transcripts: ~5 years of weekly 1-hour meetings
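Before shipping a corpus of this size, it's worth a rough capacity check. The sketch below uses the common ~4 characters-per-token heuristic, which is an approximation, not Claude's actual tokenizer:

```python
# Rough pre-flight check: will this document set fit in the window?
# The 4-chars-per-token ratio is a heuristic, not the real tokenizer.
def fits_in_context(docs: list[str], window: int = 1_000_000,
                    chars_per_token: float = 4.0) -> bool:
    """Estimate token count for a list of documents against a window size."""
    est_tokens = sum(len(d) for d in docs) / chars_per_token
    return est_tokens <= window
```

For a precise count, tokenize with the provider's counting endpoint before sending; the heuristic is only for a cheap first pass.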
The "lost in the middle" problem is real but improved. Opus 4.6 scores 76% on MRCR v2 (8 needles across 1M tokens), roughly 4x Sonnet 4.5's 18.5% on the same task. Still, put the most important content at the beginning or end of the context, and structure it with clear section headers so Claude can navigate when it can't hold every detail equally.
Cost at 1M tokens via AICredits.in for Indian developers: ~₹277 per 1M input tokens on Sonnet 4.6 ($3/MTok at ₹84/USD plus ~10% platform markup), and ~₹462 per 1M input tokens on Opus 4.6 ($5/MTok). No long-context surcharge — standard per-token pricing applies to every token in the window.
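The pricing arithmetic is simple enough to keep as a helper. The ₹84/USD rate and ~10% markup are this article's working assumptions, not live rates:

```python
# Worked example of the pricing arithmetic above. The FX rate (₹84/USD)
# and ~10% platform markup are the article's assumptions, not live rates.
def inr_per_mtok(usd_per_mtok: float, fx: float = 84.0,
                 markup: float = 0.10) -> float:
    """Convert a $/MTok list price to an approximate ₹/MTok figure."""
    return round(usd_per_mtok * fx * (1 + markup), 1)
```

At $3/MTok this gives ~₹277 (Sonnet 4.6 input) and at $5/MTok ~₹462 (Opus 4.6 input).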
## Interleaved thinking — reasoning between tool calls
When Claude uses tools in an agent loop, 4.5 would plan upfront, then execute. 4.6 with adaptive thinking reasons at each step — think before calling a tool, receive the result, think again before the next call.
This is automatically enabled when you use thinking={"type": "adaptive"} with tool use. You don't need any code changes if you're already using adaptive thinking. The practical effect: much more reliable multi-step agent workflows. Claude doesn't commit to a plan it can't revise; it updates its reasoning as evidence arrives.
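The loop shape looks like a standard Messages tool-use loop; interleaving happens inside the model, not in your code. A minimal sketch, assuming the usual `stop_reason == "tool_use"` protocol — `client`, `tools`, and `run_tool` are placeholders you'd supply:

```python
# Sketch of an agent loop with adaptive thinking. `client` is any object
# exposing messages.create(...); `tools` and `run_tool` are placeholders.
# Interleaved reasoning happens model-side; the loop itself is unchanged.
def agent_loop(client, messages, tools, run_tool, max_steps=10):
    resp = None
    for _ in range(max_steps):
        resp = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=4000,
            thinking={"type": "adaptive"},  # interleaving comes for free
            tools=tools,
            messages=messages,
        )
        if resp.stop_reason != "tool_use":
            return resp  # model produced a final answer
        # Feed tool results back; the model re-thinks before the next call.
        messages.append({"role": "assistant", "content": resp.content})
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": run_tool(b.name, b.input)}
            for b in resp.content if getattr(b, "type", None) == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    return resp  # step budget exhausted; return the last response
```

The `max_steps` cap is the one piece of control logic worth keeping: it bounds cost if the model keeps requesting tools.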
Interleaved thinking is one of the main reasons Opus 4.6 hits 80.8% on SWE-bench Verified — code debugging requires updating your mental model as you read output, not following a predetermined script.
## What to stop doing on 4.6
Stop using `budget_tokens`. The parameter is deprecated. Claude ignores it. Switch to `thinking={"type": "adaptive"}` and control depth via `effort`.
Stop prefilling assistant messages. This was a common trick to guide Claude's output format — pre-populating the start of Claude's response in the messages array. On 4.6, this returns a 400 error. The fix: move the formatting instruction into your user message or system prompt instead.
Stop using old model strings as the "cheap" option. The previous go-to for cost-sensitive tasks was `claude-haiku-3-5-20241022`. That model is retired. Use `claude-sonnet-4-6` with `effort="low"` or `effort="medium"` instead — it's both cheaper and more capable for most tasks.
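One way to make this migration mechanical is a lookup table at the point where your code resolves model names; a minimal sketch (spellings follow this article):

```python
# Migration map from retired model strings (spellings as in this article).
# The haiku replacement pairs best with effort="low" for cost parity.
RETIRED = {
    "claude-3-5-sonnet-20241022": "claude-sonnet-4-6",
    "claude-haiku-3-5-20241022": "claude-sonnet-4-6",
}

def migrate_model(model: str) -> str:
    """Return a supported model string, passing current ones through."""
    return RETIRED.get(model, model)
```

Routing every `create()` call through a helper like this means the next retirement is a one-line change instead of a grep across the codebase.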
## Quick migration from 4.5 to 4.6 API calls
### Pattern 1: Old extended thinking → new adaptive

```python
# Before (4.5)
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=messages
)

# After (4.6)
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    effort="high",
    messages=messages
)
```
### Pattern 2: Old model string → new model string

```python
# Before
model = "claude-3-5-sonnet-20241022"  # retired
model = "claude-haiku-3-5-20241022"   # retired

# After
model = "claude-sonnet-4-6"  # general purpose, cost-effective
model = "claude-opus-4-6"    # highest capability
```
### Pattern 3: Old prefill trick → instruction in user message

```python
# Before — returned structured output by prefilling "{"
messages = [
    {"role": "user", "content": "Extract entities from: " + text},
    {"role": "assistant", "content": "{"}  # ❌ returns 400 on 4.6
]

# After — instruction in user message
messages = [
    {
        "role": "user",
        "content": f"Extract entities from the text below. Respond with valid JSON only, starting with {{.\n\n{text}"
    }
]
```
The migration is mostly mechanical. The biggest functional change is the prefill breakage — if you have any code that builds the assistant turn before Claude responds, that needs fixing before you switch model versions.
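A defensive guard can catch the prefill pattern at runtime before the API rejects it. A sketch (the instruction wording is illustrative, not an official recipe):

```python
# Guard sketch: detect a trailing assistant "prefill" turn and fold its
# content into the preceding user message as an instruction instead.
# The instruction phrasing is illustrative, not an official recipe.
def strip_prefill(messages: list[dict]) -> list[dict]:
    """Rewrite a message list that ends with an assistant prefill turn."""
    if not messages or messages[-1].get("role") != "assistant":
        return messages  # nothing to fix
    *rest, prefill = messages
    fixed = [dict(m) for m in rest]
    fixed[-1]["content"] = (
        f'{fixed[-1]["content"]}\n\nStart your response with: {prefill["content"]}'
    )
    return fixed
```

Running every outgoing message list through a guard like this lets legacy call sites keep working while you migrate them properly.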
## SWE-bench and ARC-AGI-2 — what the benchmarks mean for you
Numbers without context are marketing. Here's what they actually mean:
SWE-bench Verified: real GitHub issues from real codebases. Sonnet 4.6 at 79.6% and Opus 4.6 at 80.8% means roughly 4 in 5 real-world coding tasks are solved correctly from a cold start. That's the bar where autonomous coding agents become useful rather than aspirational.
ARC-AGI-2: tests novel reasoning — tasks the model genuinely cannot have memorised. Sonnet 4.6 jumped from 13.6% to 58.3%, a 4.3x improvement in a single generation. This is the benchmark where throwing more training data doesn't work; you have to actually reason differently.
For your prompts: these numbers mean Claude 4.6 can handle significantly more open-ended problem specifications. You don't need to hand-hold the reasoning as much. Less scaffolding in the prompt, more trust in the model to figure out the path.
## Next steps
- Understand why long context isn't just "bigger" — context engineering
- Build something on top of these capabilities — AI agents track
- Cut your API bill with the effort parameter — effort parameter cost guide
- Migration from older Claude versions — Claude 4 prompting guide
💡 Want to use Claude 4.6 in India? AICredits.in gives you access to Sonnet 4.6 and Opus 4.6 with UPI payment in ₹ — no international card needed.



