Anthropic's documentation for Claude Opus 4.6 includes a notable warning: the model has a "strong predilection for subagents" and may spawn them in situations where a simpler, direct approach would suffice. That observation cuts both ways. It means Opus 4.6 is genuinely good at decomposing complex goals into parallelisable workstreams — and it means you need to be deliberate about when you actually want that behaviour.
The planner/executor/critic triad is the pattern that makes this strength useful without letting it run wild. It gives you parallel execution where the task warrants it, quality gates before results get integrated, and a clean architecture you can debug when something goes wrong.
## Why multi-agent for complex tasks?
A single agent planning a complex task upfront has a fundamental problem: errors at step 3 compound through steps 4, 5, and 6. By the time you see a wrong output, you're unwinding multiple decisions. There's also no parallel execution — everything is sequential.
A multi-agent architecture addresses these problems directly:

- **Parallelisation:** A code review that checks security, performance, and style can run all three checks simultaneously. A research task covering three market segments doesn't need them to be sequential.
- **Isolation:** Each executor gets only the context it needs. This prevents earlier mistakes from polluting later reasoning.
- **Self-correction:** A critic role creates a quality gate before results get integrated. The executor doesn't decide if its output is good — a separate agent does.
- **Context window management:** For tasks that genuinely exceed one context window, parallel subagents each working on bounded chunks is the right architecture.
Use multi-agent when tasks can be parallelised, when multiple independent perspectives add value, or when the task spans more tokens than a single context can reliably handle. For simple, short tasks — don't add this complexity.
## The planner/executor/critic pattern
The flow works like this:
- **Planner** receives the goal, decomposes it into subtasks, and identifies which can run in parallel versus which must be sequential
- **Executors** (one or more, running in parallel where possible) carry out bounded subtasks with narrow context
- **Critic** reviews each executor output and issues a verdict: ACCEPT, or REJECT with specific feedback
- **Planner** (or a **Synthesiser**) integrates accepted outputs into the final result
Rejected outputs loop back to the executor with the critic's feedback. Typically you allow 1-2 retry attempts before escalating.
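Concretely, each subtask the planner emits can be modelled as a small record. The schema below is my own sketch, but it matches the JSON structure the planner prompt requests later in this post:

```python
from typing import TypedDict


class Subtask(TypedDict):
    id: str                  # stable handle, e.g. "security"
    description: str         # bounded instructions for the executor
    expected_output: str     # what the critic checks the output against
    dependencies: list[str]  # ids of subtasks whose outputs this one needs
    can_parallel: bool       # safe to run alongside other subtasks?
```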
### The Planner agent
The planner's job is decomposition and orchestration. It needs the full goal, constraints, and available tools.
Key instruction that prevents Opus 4.6 from over-spawning subagents:
```python
PLANNER_SYSTEM = """You are a planning agent. Your job is to decompose the goal into concrete subtasks and assign them to executor agents.

For each subtask, specify:
- What to do (specific and bounded)
- What the expected output looks like
- Dependencies (which other subtasks this depends on)
- Whether it can run in parallel with other subtasks

Use subagents when:
- Tasks can run in parallel without sharing context
- Tasks require isolated reasoning without contamination from other subtasks
- Tasks are large enough to benefit from dedicated context

Do NOT use subagents when:
- The task is simple and sequential
- Tasks need to share context closely (do them yourself in sequence)
- A single-file edit or minor lookup is involved
- The overhead of coordination exceeds the benefit of parallelism

Return your plan as a structured list of subtasks with their dependencies and parallel/sequential classification."""
```
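For a goal like "research three market segments and recommend one", a plan conforming to this prompt might look like the following. This is a hypothetical illustration, not actual model output:

```python
example_plan = {
    "subtasks": [
        {"id": "segment_a", "description": "Research market segment A ...",
         "expected_output": "500-word brief", "dependencies": [], "can_parallel": True},
        {"id": "segment_b", "description": "Research market segment B ...",
         "expected_output": "500-word brief", "dependencies": [], "can_parallel": True},
        {"id": "recommend", "description": "Compare the briefs and recommend one segment",
         "expected_output": "Ranked recommendation with rationale",
         "dependencies": ["segment_a", "segment_b"], "can_parallel": False},
    ]
}
```

The two research tasks fan out in parallel; the recommendation waits on both.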
### The Executor agent(s)
Executors get narrow scope and focused context. They should not ask clarifying questions — they make reasonable assumptions, note them, and complete the task.
```python
EXECUTOR_SYSTEM = """You are an executor agent. You have been assigned a specific subtask.

Your responsibilities:
1. Complete the assigned task precisely and completely
2. Stay within your assigned scope — do not expand to related work
3. Make reasonable assumptions when information is missing; state your assumptions clearly
4. Produce the specific output format requested
5. Do not ask clarifying questions — use your best judgment and note what you assumed

Your output should be directly usable by a reviewer. Include your assumptions as a brief note at the end."""
```
### The Critic agent
The critic is the most important role to get right. Vague feedback ("could be better") is useless. The critic must be specific.
```python
CRITIC_SYSTEM = """You are a quality reviewer. Your job is to evaluate executor output against the task specification.

Your verdict must be exactly one of:
- ACCEPT — the output meets the requirements and is ready for integration
- REJECT — the output has specific problems that must be fixed

If REJECT: explain precisely what is wrong. Do not say "needs improvement." Say:
- What specific requirement is not met
- What the executor should do differently
- What the corrected output should look like (if it's a small fix)

Be direct. Vague feedback wastes cycles. If the output is 90% correct but has one specific fixable problem, say exactly what that problem is.

Format your response as:
VERDICT: [ACCEPT/REJECT]
FEEDBACK: [Specific feedback if REJECT, brief confirmation of what's good if ACCEPT]"""
```
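A rejection under this format might look like the following. The content is hypothetical; what matters is the shape, which the orchestration code parses later:

```
VERDICT: REJECT
FEEDBACK: The brief is 220 words; the specification requires roughly 500. Expand the
competitive-landscape section with at least two named competitors, keeping the
existing structure otherwise.
```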
## Python implementation
Here's the full orchestration loop. Executors run in parallel via `asyncio.gather`. The planner and critic use Opus 4.6 — they need orchestration intelligence. Executors use Sonnet 4.6 — capable enough for bounded tasks, significantly cheaper.
```python
import asyncio
import json

from anthropic import Anthropic

client = Anthropic()


def planner_call(goal: str, context: str = "") -> dict:
    """Call Opus 4.6 to decompose the goal into subtasks."""
    prompt = f"Goal: {goal}"
    if context:
        prompt += f"\n\nAdditional context: {context}"
    prompt += (
        "\n\nDecompose this into subtasks. Return JSON with structure: "
        "{subtasks: [{id, description, expected_output, dependencies: [], can_parallel: bool}]}"
    )
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4000,
        system=PLANNER_SYSTEM,
        messages=[{"role": "user", "content": prompt}],
    )
    # Extract JSON from the response text.
    # Simple extraction — in production, use a more robust parser.
    text = response.content[0].text
    start = text.find("{")
    end = text.rfind("}") + 1
    return json.loads(text[start:end])
```
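The `find`/`rfind` extraction above is deliberately naive. Here is a slightly more defensive version, a sketch rather than a production parser, that fails loudly instead of surfacing a bare `json.loads` exception:

```python
def extract_json(text: str) -> dict:
    """Best-effort extraction of the outermost JSON object from a model reply.

    A sketch, assuming the plan is the outermost {...} in the response.
    For production, prefer structured output or a JSON-repair library.
    """
    start = text.find("{")
    end = text.rfind("}") + 1
    if start == -1 or end == 0:
        raise ValueError(f"No JSON object found in response: {text[:200]!r}")
    try:
        return json.loads(text[start:end])
    except json.JSONDecodeError as exc:
        raise ValueError(f"Plan was not valid JSON: {exc}") from exc
```

Swap it into `planner_call`'s final line if you want the stricter behaviour. The executor and critic below follow the same calling pattern, with one fix worth calling out: both are dispatched to worker threads with `asyncio.to_thread`, so the synchronous SDK never blocks the event loop while other tasks run in parallel.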
```python
async def executor_call(task: dict, feedback: str = "") -> dict:
    """Call Sonnet 4.6 to execute a single subtask."""
    prompt = (
        f"Your assigned task:\n{task['description']}\n\n"
        f"Expected output format:\n{task['expected_output']}"
    )
    if feedback:
        prompt += (
            "\n\n---\nPrevious attempt was rejected. Specific feedback:\n"
            f"{feedback}\n\nPlease address this feedback in your revised output."
        )
    # The Anthropic SDK is synchronous, so dispatch the call to a worker
    # thread to keep the event loop free for the other parallel executors.
    response = await asyncio.to_thread(
        client.messages.create,
        model="claude-sonnet-4-6",
        max_tokens=4000,
        system=EXECUTOR_SYSTEM,
        messages=[{"role": "user", "content": prompt}],
    )
    return {
        "task_id": task["id"],
        "output": response.content[0].text,
    }


async def critic_call(task: dict, executor_output: str) -> dict:
    """Call Opus 4.6 to evaluate executor output."""
    prompt = f"""Task specification:
{task['description']}

Expected output:
{task['expected_output']}

Executor output to review:
{executor_output}

Evaluate this output and provide your verdict."""
    # Also dispatched to a thread: a blocking critic call here would
    # serialise reviews of tasks that are supposed to run in parallel.
    response = await asyncio.to_thread(
        client.messages.create,
        model="claude-opus-4-6",
        max_tokens=1000,
        system=CRITIC_SYSTEM,
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.content[0].text
    verdict = "ACCEPT" if "VERDICT: ACCEPT" in text else "REJECT"
    feedback = text.split("FEEDBACK:", 1)[-1].strip() if "FEEDBACK:" in text else text
    return {"verdict": verdict, "feedback": feedback}


async def run_with_review(task: dict, max_retries: int = 2) -> dict:
    """Execute a task, critique it, and retry if rejected."""
    feedback = ""
    for attempt in range(max_retries + 1):
        result = await executor_call(task, feedback=feedback)
        review = await critic_call(task, result["output"])
        if review["verdict"] == "ACCEPT":
            return {"task_id": task["id"], "output": result["output"], "attempts": attempt + 1}
        if attempt < max_retries:
            feedback = review["feedback"]
            print(f"Task {task['id']} rejected (attempt {attempt + 1}): {feedback[:100]}...")
        else:
            # Return the best attempt with a note rather than failing outright
            return {
                "task_id": task["id"],
                "output": result["output"],
                "attempts": attempt + 1,
                "note": f"Max retries reached. Last feedback: {review['feedback']}",
            }
```
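You can exercise the execute, critique, retry loop on its own before wiring up the full pipeline. The task dict here is a hypothetical example, shaped like the subtasks the planner produces:

```python
# Hypothetical standalone check of the execute → critique → retry loop
demo_task = {
    "id": "demo",
    "description": "Summarise the trade-offs of parallel subagents in exactly 5 bullet points.",
    "expected_output": "Exactly 5 bullet points, one trade-off each.",
    "dependencies": [],
    "can_parallel": True,
}

result = asyncio.run(run_with_review(demo_task))
print(f"Accepted after {result['attempts']} attempt(s):\n{result['output']}")
```

The top-level pipeline below ties planning, parallel execution, and synthesis together: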
```python
async def run_multi_agent(goal: str, context: str = "") -> str:
    """Full multi-agent pipeline: plan → execute in parallel → critique → synthesise."""
    # 1. Plan
    print("Planning...")
    plan = planner_call(goal, context)
    tasks = plan["subtasks"]
    print(f"Plan: {len(tasks)} subtasks, "
          f"{sum(1 for t in tasks if t['can_parallel'])} can run in parallel")

    # 2. Execute parallel tasks simultaneously, sequential ones in order
    parallel_tasks = [t for t in tasks if t["can_parallel"] and not t["dependencies"]]
    sequential_tasks = [t for t in tasks if not t["can_parallel"] or t["dependencies"]]
    results = {}

    # Run parallel tasks
    if parallel_tasks:
        print(f"Running {len(parallel_tasks)} tasks in parallel...")
        parallel_results = await asyncio.gather(
            *[run_with_review(task) for task in parallel_tasks]
        )
        for r in parallel_results:
            results[r["task_id"]] = r["output"]

    # Run sequential tasks (simplified — proper dependency resolution
    # would check task["dependencies"])
    for task in sequential_tasks:
        print(f"Running sequential task: {task['id']}...")
        # Include outputs from dependencies as context
        dep_context = "\n\n".join(
            f"Output from {dep}: {results.get(dep, 'Not available')}"
            for dep in task.get("dependencies", [])
        )
        if dep_context:
            task["description"] += f"\n\nContext from prior tasks:\n{dep_context}"
        result = await run_with_review(task)
        results[result["task_id"]] = result["output"]

    # 3. Synthesise
    joined_outputs = "\n".join(f"--- {k} ---\n{v}" for k, v in results.items())
    synthesis_prompt = f"""Original goal: {goal}

Completed subtask outputs:
{joined_outputs}

Synthesise these outputs into a single coherent result that fully addresses the original goal."""
    synthesis = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=6000,
        messages=[{"role": "user", "content": synthesis_prompt}],
    )
    return synthesis.content[0].text
```
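Running the whole pipeline is then a single call; the goal and context strings here are illustrative:

```python
# Illustrative invocation of the full plan → execute → critique → synthesise pipeline
final_report = asyncio.run(run_multi_agent(
    goal="Produce a competitive analysis covering three market segments and recommend one",
    context="Audience: internal product team. Target length: ~1,500 words.",
))
print(final_report)
```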
## Controlling Opus 4.6's subagent eagerness
The planner system prompt above includes the "use subagents when / do NOT use subagents when" block. This matters. Without it, Opus 4.6 tends to decompose even simple tasks into subagents, because delegation is its default move for anything that looks decomposable.
Add this specific language to your planner system prompt (exact wording matters):
```python
planner_addition = """
IMPORTANT: Subagent overhead is real. Creating a subagent for a task that takes 30 seconds
adds coordination cost of 5-10 seconds plus additional API calls. Reserve subagents for
tasks where parallelism saves meaningful time or where context isolation is genuinely
necessary. A 500-token lookup does not need a subagent. A 10,000-token analysis of an
independent workstream does.
"""
```
In our testing, this language reduced unnecessary subagent spawning by roughly 60% on tasks that don't need it.
## Real example — code review pipeline
Here's a concrete use case: reviewing a PR for security, performance, and style simultaneously.
```python
async def review_pull_request(pr_diff: str, codebase_context: str = "") -> str:
    """Three-stream parallel code review."""
    # Build the optional conventions note up front: nesting a same-quote
    # f-string inside another f-string is a syntax error before Python 3.12.
    conventions_note = (
        f"\nCodebase conventions: {codebase_context}" if codebase_context else ""
    )
    tasks = [
        {
            "id": "security",
            "description": f"""Review this PR diff specifically for security vulnerabilities.
Check against OWASP Top 10. For each vulnerability: location, severity, explanation, fix.

PR diff:
{pr_diff}""",
            "expected_output": "Numbered list of security issues with severity, location, and fix. 'No issues found' if clean.",
            "dependencies": [],
            "can_parallel": True,
        },
        {
            "id": "performance",
            "description": f"""Review this PR diff specifically for performance issues.
Check for: N+1 queries, unnecessary loops, blocking I/O, memory leaks, missing indexes.

PR diff:
{pr_diff}""",
            "expected_output": "Numbered list of performance issues with estimated impact and fix. 'No issues found' if clean.",
            "dependencies": [],
            "can_parallel": True,
        },
        {
            "id": "style",
            "description": f"""Review this PR diff for code style and convention issues.
Check for: naming conventions, function complexity, missing error handling, inadequate comments.

PR diff:
{pr_diff}
{conventions_note}""",
            "expected_output": "Numbered list of style issues with line references and suggestions. 'No issues found' if clean.",
            "dependencies": [],
            "can_parallel": True,
        },
    ]

    # Run all three review streams in parallel
    results = await asyncio.gather(*[run_with_review(task) for task in tasks])
    results_dict = {r["task_id"]: r["output"] for r in results}

    # Synthesise into a single PR comment
    synthesis = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=3000,
        messages=[{
            "role": "user",
            "content": f"""Combine these three code review outputs into a single, well-structured PR comment.
Group by severity. Use markdown formatting suitable for GitHub.

Security review:
{results_dict['security']}

Performance review:
{results_dict['performance']}

Style review:
{results_dict['style']}""",
        }],
    )
    return synthesis.content[0].text


# Usage: your_diff is whatever PR diff text you want reviewed
pr_comment = asyncio.run(review_pull_request(pr_diff=your_diff))
print(pr_comment)
```
This runs three Sonnet 4.6 calls in parallel, each reviewed by Opus 4.6, then synthesised by Opus 4.6. Total latency is roughly equal to one sequential review pass, but you get three independent perspectives.
## When NOT to use multi-agent
Multi-agent is overhead. Use it when the overhead is worth it.
Don't use it for:
- Simple tasks — a single `client.messages.create()` call with `effort="high"` will outperform a three-agent pipeline on most tasks under ~1,000 tokens of output.
- Tasks with tight context sharing — if your executor needs to know what every other executor found, you need sequential execution, not parallel. The isolation that makes parallel useful becomes a liability.
- Cost-sensitive low-complexity applications — multi-agent multiplies token usage by 3-5x. For a startup on a tight budget doing simple chat interactions, this is the wrong architecture.
- Latency-critical paths — coordination adds round-trips. If you need sub-second responses, single-agent is almost always faster.
The test I apply: "If this task were assigned to three humans in parallel, would they produce better output than one human doing it sequentially?" If the answer is no, single-agent.
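For those simple cases, the baseline is one direct request. This sketch reuses the `client` defined earlier:

```python
# Single-agent baseline: one direct call, no planner, no critic, no retries
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=2000,
    messages=[{"role": "user", "content": "Summarise this changelog in 5 bullets: ..."}],
)
print(response.content[0].text)
```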
💡 Want to go deeper? The AI Agents track covers multi-agent orchestration with hands-on exercises and production patterns — including how to handle failures, shared state, and agent communication protocols.
## Next steps
- Multi-agent systems — architectural patterns beyond planner/executor/critic
- Build your first AI agent — start here if you're new to agents
- Claude Opus 4.6 prompting guide — get the most out of the orchestrator model
- AI agent design patterns — broader survey of agent architectures
## Try it now with AICredits.in
Access Claude Opus 4.6, Sonnet 4.6, GPT-4o, and 300+ models with UPI payment in ₹. Run multi-agent pipelines without an international card. Create free account →