The Assistants API always felt like it was trying too hard. Threads, runs, run steps, polling for status — a lot of machinery for what should be a simple "call the model, get a result" interaction. OpenAI clearly agreed, because the Responses API dropped most of that complexity while keeping what actually mattered: built-in tools.
If you're building anything that needs web search, data analysis, or document Q&A, the Responses API is the right starting point in 2026.
## What changed from the Assistants API
The Assistants API required you to create and manage Assistants (persistent configurations), Threads (conversation history), and Runs (individual execution attempts). You'd kick off a run, poll until it completed, retrieve messages, handle tool calls in a loop. For a simple chatbot, this meant writing 80 lines of boilerplate before you got to actual logic.
The Responses API is stateless-first. You send a request, you get a response. No threads to manage, no polling, no run lifecycle. If you need conversation history, you pass previous messages yourself — just like the Chat Completions API.
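Concretely, a follow-up turn is just a request whose input carries the earlier turns; a minimal sketch (the messages are illustrative):

```python
from openai import OpenAI

client = OpenAI()

# You own the transcript: send prior turns as an ordinary message list
response = client.responses.create(
    model="gpt-4o",
    input=[
        {"role": "user", "content": "Recommend a Python library for reading CSVs."},
        {"role": "assistant", "content": "Use the built-in csv module, or pandas for analysis work."},
        {"role": "user", "content": "Show me the pandas version."},
    ],
)
print(response.output_text)
```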
The built-in tools are the reason to switch: web_search_preview, code_interpreter, and file_search work without any setup. No tool definitions, no function schemas, no webhook URLs. You enable them in the request and the model uses them when it decides they're needed.
## The three built-in tools
### web_search_preview
Gives the model access to real-time web search. It's useful for anything requiring current information: recent news, current pricing, competitor research, live documentation.
The model decides when to search. When it does, OpenAI handles the search request internally and injects the results into context before generating the response. You don't see the search queries or the raw results unless you inspect the tool-call items in the response output.
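If you do want that visibility, the search calls surface as their own items in the response output; a sketch of inspecting them (assuming the SDK exposes them as web_search_call items, as current versions do):

```python
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What did OpenAI announce this week?",
)

# Tool calls are interleaved with the final message in response.output
for item in response.output:
    if item.type == "web_search_call":
        print("search performed:", item.id, item.status)
print(response.output_text)
```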
Best for: research bots, competitive intelligence tools, news summarization, anything that gets stale without live data.
### code_interpreter
A sandboxed Python environment the model can write and execute code in. The model can generate code, run it, inspect the output, fix errors, and iterate — all within a single response. It can also read uploaded files (CSV, Excel, images) and write output files (charts, processed data).
This is more powerful than it sounds. You can send a messy CSV and say "find anomalies in this dataset" and the model will actually execute Python to do it, not just describe how you could. It handles matplotlib, pandas, numpy, and most standard libraries.
Best for: data analysis, chart generation, math-heavy computations, file format conversions, anything that benefits from running actual code rather than reasoning about code.
### file_search
Searches across files you've uploaded to OpenAI's storage. It handles chunking, embedding, and retrieval automatically. You upload PDFs, Word docs, code files — whatever your knowledge base contains — and the model can search and cite them in responses.
It's a managed RAG system. You get only coarse control over chunking and none over the retrieval pipeline itself, which is a limitation for advanced use cases but a significant convenience for most applications.
Best for: document Q&A, internal knowledge bases, support bots grounded in product documentation.
## Basic API usage

```python
from openai import OpenAI

client = OpenAI()

# Simple request with web search enabled
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What's the current pricing for Anthropic's Claude API?",
)

print(response.output_text)
```
To use multiple tools:

```python
response = client.responses.create(
    model="gpt-4o",
    tools=[
        {"type": "web_search_preview"},
        # code_interpreter requires a container; "auto" lets OpenAI provision one
        {"type": "code_interpreter", "container": {"type": "auto"}},
    ],
    input="Search for recent benchmark comparisons between GPT-4o and Claude, then create a summary table.",
)
```
Force tool use with tool_choice:

```python
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
    tool_choice={"type": "code_interpreter"},  # always use code interpreter
    input="Calculate the compound interest on $10,000 at 7% over 20 years.",
)
```
## Streaming responses with tool calls
Streaming is important for user-facing applications — nobody wants to stare at a spinner for 8 seconds while code interpreter runs.
```python
with client.responses.stream(
    model="gpt-4o",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
    input="Analyze this sales data and create a bar chart: Q1: $120k, Q2: $145k, Q3: $132k, Q4: $178k",
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            # Print text tokens as they arrive
            print(event.delta, end="", flush=True)
        elif event.type == "response.completed":
            # Handle any output files (like generated charts)
            for item in event.response.output:
                if item.type == "code_interpreter_call":
                    for output in item.outputs or []:
                        if output.type == "image":
                            # Save or display the generated image
                            print(f"\nGenerated chart: {output.image_url}")
```
## Three practical examples

### Research bot with web search
```python
def research_topic(query: str) -> str:
    response = client.responses.create(
        model="gpt-4o",
        tools=[{"type": "web_search_preview"}],
        instructions=(
            "You are a research assistant. Search for current, accurate information. "
            "Always cite your sources with URLs. Be concise and factual."
        ),
        input=query,
    )
    return response.output_text


# Usage
report = research_topic(
    "What are the main LLM providers competing with OpenAI in 2026 and what are their latest models?"
)
```
### Data analyst with code interpreter
```python
def analyze_csv(file_path: str, analysis_request: str) -> dict:
    # Upload the file first
    with open(file_path, "rb") as f:
        file = client.files.create(file=f, purpose="assistants")

    response = client.responses.create(
        model="gpt-4o",
        # Attach the upload to the code interpreter's container so the
        # generated Python can read it from the sandbox filesystem
        tools=[{
            "type": "code_interpreter",
            "container": {"type": "auto", "file_ids": [file.id]},
        }],
        input=analysis_request,
    )

    # Extract text response and any generated files
    result = {"analysis": response.output_text, "charts": []}
    for item in response.output:
        if item.type == "code_interpreter_call":
            for output in item.outputs or []:
                if output.type == "image":
                    result["charts"].append(output.image_url)
    return result


# Usage
result = analyze_csv(
    "sales_data.csv",
    "Find the top 5 products by revenue, identify seasonal trends, and create a chart.",
)
```
### Document Q&A with file search
```python
def setup_knowledge_base(file_paths: list[str]) -> str:
    # file_search retrieves from a vector store, not raw file IDs,
    # so create one and upload each document into it
    vector_store = client.vector_stores.create(name="knowledge-base")
    for path in file_paths:
        with open(path, "rb") as f:
            client.vector_stores.files.upload_and_poll(
                vector_store_id=vector_store.id, file=f
            )
    return vector_store.id


def query_documents(question: str, vector_store_id: str) -> str:
    response = client.responses.create(
        model="gpt-4o",
        tools=[{
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
        }],
        input=question,
    )
    return response.output_text


# Usage
vector_store_id = setup_knowledge_base(["product_docs.pdf", "faq.pdf", "changelog.md"])
answer = query_documents("What changed in version 3.2 of the product?", vector_store_id)
```
## Cost considerations
Built-in tools add overhead. Web search adds a fixed fee per search call (currently $0.025 per search in addition to token costs). Code interpreter charges $0.03 per session, and a "session" resets after an hour of inactivity — so long batch jobs can accumulate multiple session fees.
File search has two cost components: the storage cost for uploaded files ($0.10/GB/day after the free tier) and the vector store search cost ($0.10 per 1,000 queries after the first 1,000 free per day).
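To get a feel for how this adds up, here's a back-of-the-envelope estimate using the per-call prices above; the daily volumes are made-up assumptions:

```python
# Hypothetical workload: volumes are assumptions, prices from above
WEB_SEARCH_FEE = 0.025  # $ per web search call
CI_SESSION_FEE = 0.03   # $ per code interpreter session

searches_per_day = 2_000
ci_sessions_per_day = 500

monthly = 30 * (searches_per_day * WEB_SEARCH_FEE + ci_sessions_per_day * CI_SESSION_FEE)
print(f"~${monthly:,.0f}/month in tool fees, before token costs")  # ~$1,950
```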
For function calling use cases where you're calling your own APIs rather than OpenAI's built-in tools, the standard Chat Completions API is still more cost-effective — you only pay for tokens. The Responses API's value is in the managed tool infrastructure.
Structured outputs work with the Responses API too, and they're worth using when you need predictable JSON shapes from responses.
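One way to wire them up is the Python SDK's parse helper with a Pydantic model; a hedged sketch, with a schema invented for this example:

```python
from pydantic import BaseModel


class PricingSummary(BaseModel):
    provider: str
    model_name: str
    input_price_per_million_tokens: float
    output_price_per_million_tokens: float


parsed = client.responses.parse(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="Find the current per-token API pricing for Anthropic's Claude models.",
    text_format=PricingSummary,  # the SDK validates the output against this schema
)
print(parsed.output_parsed)  # a PricingSummary instance
```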
## Responses API vs Chat Completions API
Use Responses API when:
- You need web search, code execution, or document retrieval
- You want OpenAI to manage tool execution complexity
- You're prototyping and want to move fast
Use Chat Completions API when:
- You're calling your own tools/functions
- You need maximum control over system behavior
- Cost efficiency is critical and you don't need built-in tools
- You're building something the Responses API's abstraction doesn't fit
The Responses API doesn't support every feature the Chat Completions API has. If you're using custom function tools extensively, you'll hit friction. For straightforward applications that need one or more of the three built-in tools, it's meaningfully simpler.
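For what it's worth, custom function tools do exist in the Responses API; part of the friction is that the definition is flattened relative to Chat Completions (no nested "function" key) and you still run the function yourself. A sketch, with a hypothetical get_weather function:

```python
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "function",
        "name": "get_weather",  # hypothetical function, for illustration only
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    input="What's the weather in Berlin?",
)

# Executing the call and sending back the result is still your job
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)
```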
## Migrating from the Assistants API
The migration is largely a simplification. Things you can delete:
- Thread creation and management
- Run lifecycle handling (polling, status checks)
- Message retrieval from thread
Things that carry over directly:
- Tool definitions (code_interpreter and file_search carry over in broadly the same shape)
- File uploads and file IDs
- System instructions (now the instructions parameter in the request)
The main behavior change: conversation state is now your responsibility. If you were using Threads for persistent history, you'll need to pass previous messages explicitly in each request. For most applications, this is actually cleaner — you control exactly what context the model sees.
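If you'd rather not resend the transcript on every turn, the API also offers previous_response_id, which chains a request onto the stored context of an earlier response; a minimal sketch:

```python
first = client.responses.create(
    model="gpt-4o",
    input="Draft a launch tweet for our new analytics feature.",
)

# Chains onto the stored context of `first` instead of resending history
# (the earlier response must have been stored, which is the default)
follow_up = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="Shorter, and add one emoji.",
)
print(follow_up.output_text)
```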
One thing that doesn't yet exist in the Responses API: background runs for long-running async tasks. The Assistants API had a run/polling model that handled tasks taking minutes. If you have heavy code interpreter jobs that time out, you may need to keep that workload on the Assistants API for now — or handle async execution yourself.
The direction of travel is clear though. Responses API is OpenAI's preferred path forward, and the Assistants API isn't getting meaningful new features.