Every LangChain tutorial on the internet starts with "set your OPENAI_API_KEY" and assumes you funded it with a dollar card. This one doesn't. We're using AICredits.in — INR billing, UPI payment, no international card needed.
By the end of this tutorial you'll have a working LangChain AI agent that can answer questions and search the web, running on GPT-4o-mini and switchable to Claude or Gemini with a one-line change.
Prerequisites
- Python 3.9+
- An AICredits account with ₹100 topped up via UPI
- Basic familiarity with Python
Install the dependencies:
pip install langchain langchain-openai langchain-community duckduckgo-search python-dotenv
Create a .env file:
AICREDITS_API_KEY=sk-your-aicredits-key
Setting up the LangChain client
LangChain's ChatOpenAI class accepts openai_api_base and openai_api_key directly. That's all you need to point it at AICredits:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
load_dotenv()
# ChatOpenAI pointing to AICredits
llm = ChatOpenAI(
    model="openai/gpt-4o-mini",
    openai_api_key=os.environ["AICREDITS_API_KEY"],
    openai_api_base="https://api.aicredits.in/v1",
    temperature=0.7,
)
# Quick test
response = llm.invoke("What's 17 * 23? Show your working.")
print(response.content)
That's the foundation. Everything in LangChain that uses ChatOpenAI will now route through AICredits.
Example 1: Simple QA chain with GPT-4o-mini
GPT-4o-mini at ₹13.23/1M input tokens is the right default for most tasks. This example builds a simple question-answering chain with a custom system prompt:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
load_dotenv()
llm = ChatOpenAI(
    model="openai/gpt-4o-mini",
    openai_api_key=os.environ["AICREDITS_API_KEY"],
    openai_api_base="https://api.aicredits.in/v1",
    temperature=0.3,
)
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a technical assistant for Indian software developers.
Be specific and practical. Include code examples when relevant.
When citing prices, use INR unless the user asks for USD."""),
    ("human", "{question}"),
])
chain = prompt | llm | StrOutputParser()
questions = [
    "What's the cheapest way to host a Next.js app in India?",
    "How do I set up Razorpay webhooks in Python?",
    "What's the difference between SQS and RabbitMQ for a small startup?",
]
for q in questions:
    print(f"Q: {q}")
    print(f"A: {chain.invoke({'question': q})}")
    print("---")
At 500 input tokens and 400 output tokens per call, each question costs roughly:
- Input: 500 tokens × ₹13.23/1M = ₹0.0066
- Output: 400 tokens × ₹52.91/1M = ₹0.0212
- Total per call: ~₹0.028
Three questions: ₹0.084. That's how cheap GPT-4o-mini is for this kind of task.
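That arithmetic is easy to wrap in a small helper if you want to estimate a batch before running it. A minimal sketch, using the GPT-4o-mini rates quoted above (swap in other models' rates as needed; the function name is my own):

```python
# Per-call cost estimator using the GPT-4o-mini INR rates quoted above.
INPUT_RATE_INR_PER_M = 13.23   # ₹ per 1M input tokens
OUTPUT_RATE_INR_PER_M = 52.91  # ₹ per 1M output tokens

def call_cost_inr(input_tokens: int, output_tokens: int) -> float:
    """Approximate INR cost of a single chat completion."""
    return (input_tokens * INPUT_RATE_INR_PER_M
            + output_tokens * OUTPUT_RATE_INR_PER_M) / 1_000_000

print(round(call_cost_inr(500, 400), 4))  # ≈ 0.0278
```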
Example 2: Multi-model comparison
One of the genuinely useful things about having a unified gateway is comparing models on the same prompt without touching your billing setup. Here's a simple harness:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
load_dotenv()
AICREDITS_KEY = os.environ["AICREDITS_API_KEY"]
BASE_URL = "https://api.aicredits.in/v1"
def make_chain(model: str, temperature: float = 0.5):
    llm = ChatOpenAI(
        model=model,
        openai_api_key=AICREDITS_KEY,
        openai_api_base=BASE_URL,
        temperature=temperature,
    )
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a concise technical writer. Answer in 3-4 sentences maximum."),
        ("human", "{question}"),
    ])
    return prompt | llm | StrOutputParser()
models = {
    "GPT-4o-mini (₹13/1M)": "openai/gpt-4o-mini",
    "Claude Haiku (₹96/1M)": "anthropic/claude-3-5-haiku-20241022",
    "Gemini Flash (₹8.84/1M)": "google/gemini-2.0-flash",
}
question = "Explain the CAP theorem and which option a startup should generally prioritize"
for label, model_id in models.items():
    chain = make_chain(model_id)
    response = chain.invoke({"question": question})
    print(f"\n=== {label} ===")
    print(response)
Running this across the three models costs less than ₹1. That's a cheap way to calibrate which model works best for your specific use case before committing to one for production.
In my experience: Gemini Flash is fastest and cheapest, good for simple extraction and classification. Claude Haiku is more consistent on instruction following and structured tasks. GPT-4o-mini is the safe default when you're not sure.
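If you want to encode that heuristic in code, a hypothetical router might look like this. The task labels and fallback choice are my own assumptions, not anything AICredits provides; the model IDs follow the "provider/model" convention used in this tutorial:

```python
# Hypothetical model router encoding the heuristics above.
def pick_model(task: str) -> str:
    if task in {"extraction", "classification"}:
        return "google/gemini-2.0-flash"               # fastest and cheapest
    if task in {"structured_output", "strict_instructions"}:
        return "anthropic/claude-3-5-haiku-20241022"   # consistent instruction following
    return "openai/gpt-4o-mini"                        # safe default

print(pick_model("classification"))  # → google/gemini-2.0-flash
```

Feed the result straight into the make_chain helper above and you can route per-request without touching billing.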
Example 3: ReAct agent with web search
This is where things get interesting. A ReAct agent loops between reasoning and tool use — it can search the web, process the results, and reason about whether it has enough information to answer.
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.agents import create_react_agent, AgentExecutor
from langchain_core.prompts import PromptTemplate
load_dotenv()
# Initialize the LLM — GPT-4o for the agent (better at tool use than mini)
llm = ChatOpenAI(
    model="openai/gpt-4o",
    openai_api_key=os.environ["AICREDITS_API_KEY"],
    openai_api_base="https://api.aicredits.in/v1",
    temperature=0,
)
# Tool: web search
search = DuckDuckGoSearchRun()
tools = [search]
# ReAct prompt template
react_prompt = PromptTemplate.from_template("""Answer the following question as best you can.
You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}""")
# Create the agent
agent = create_react_agent(llm, tools, react_prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,               # shows the reasoning steps
    max_iterations=5,           # prevent infinite loops
    handle_parsing_errors=True,
)
# Run it
result = agent_executor.invoke({
    "input": "What are the latest funding rounds for Indian AI startups in 2026? Summarize the top 3."
})
print("\n=== Final Answer ===")
print(result["output"])
The verbose=True flag shows you the full reasoning chain — each thought, each search query, each observation. It's worth watching the first few times to understand how the model decides when it has enough information.
For a production version, you'd turn off verbose and add error handling around the invoke call.
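A minimal sketch of that production shape, assuming the agent_executor built above. The run_agent wrapper and its fallback message are illustrative, not part of LangChain; narrow the except clause to the exceptions you actually observe:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def run_agent(executor, question: str, fallback: str = "Please try again later.") -> str:
    """Invoke an AgentExecutor-style object and degrade gracefully on failure."""
    try:
        result = executor.invoke({"input": question})
        return result["output"]
    except Exception as exc:  # replace with the specific errors you see in practice
        log.error("Agent run failed: %s", exc)
        return fallback
```

With verbose=False on the executor, this wrapper becomes the one place where failures surface.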
Switching the agent to Claude Sonnet 4
Swap one line to use Claude instead:
llm = ChatOpenAI(
    model="anthropic/claude-sonnet-4-20250514",  # changed
    openai_api_key=os.environ["AICREDITS_API_KEY"],
    openai_api_base="https://api.aicredits.in/v1",
    temperature=0,
)
Everything else stays the same. The ReAct loop, the tools, the prompt — all identical. This is the value of the OpenAI-compatible interface: your agent architecture is model-agnostic.
Claude Sonnet 4 is noticeably better at multi-step reasoning and at deciding when it has enough information to stop searching. For research-heavy agents, it's worth the roughly 20x price premium over GPT-4o-mini (₹264 vs ₹13.23 per 1M input tokens).
Monitoring costs in the AICredits dashboard
After running these examples, open the AICredits dashboard → Usage Logs. You'll see a per-request breakdown showing:
- Timestamp
- Model used
- Input tokens
- Output tokens
- Cost in INR
This is more useful than it sounds. When an agent makes 5 search iterations instead of 2, you'll see exactly which calls drove the cost. It's how you catch runaway loops before they drain your wallet.
Setting a ₹500 budget cap
For any agent running unattended, set a budget cap on the API key. Dashboard → API Keys → Edit → Budget Limit.
With a ₹500 cap on a key running GPT-4o, you can make roughly:
- ~450 typical agent calls (5 iterations each, at ~1,000 input tokens per iteration)
- Or ~2,500 simple chat completion calls
At ₹500, you're not going to accidentally run up a ₹10,000 bill from a bug. Set the cap before you run anything in production.
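You can sanity-check those numbers yourself. A back-of-envelope helper using the GPT-4o input rate quoted in this tutorial (output tokens are ignored for simplicity, so treat the results as upper bounds):

```python
GPT4O_INPUT_INR_PER_M = 221.00  # ₹ per 1M input tokens, as quoted in this tutorial

def calls_within_budget(budget_inr: float, input_tokens_per_call: int) -> int:
    """How many calls fit in a budget, counting input tokens only."""
    cost_per_call = input_tokens_per_call * GPT4O_INPUT_INR_PER_M / 1_000_000
    return int(budget_inr / cost_per_call)

print(calls_within_budget(500, 5 * 1_000))  # agent calls, 5 iterations each → 452
print(calls_within_budget(500, 900))        # simple chat calls → roughly 2,500
```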
# In your agent setup, add a try/except for budget cap errors
from openai import RateLimitError

try:
    result = agent_executor.invoke({"input": user_question})
except RateLimitError as e:
    # Budget cap hit or rate limited
    print(f"API limit reached: {e}")
    result = {"output": "Service temporarily unavailable. Please try again later."}
What ₹100 gets you
Here's what the minimum top-up (₹100) actually buys across each model:
| Model | INR/1M input tokens | Calls at 500 tokens/call | Total possible input tokens |
|---|---|---|---|
| Gemini 2.0 Flash | ₹8.84 | ~22,600 | 11.3M tokens |
| GPT-4o-mini | ₹13.23 | ~15,100 | 7.6M tokens |
| DeepSeek-R1 | ₹48.59 | ~4,100 | 2.1M tokens |
| Claude 3.5 Haiku | ₹96.30 | ~2,070 | 1.04M tokens |
| o3-mini | ₹96.30 | ~2,070 | 1.04M tokens |
| GPT-4o | ₹221.00 | ~904 | 452K tokens |
| Claude Sonnet 4 | ₹264.00 | ~757 | 379K tokens |
For a prototype or learning project, ₹100 is more than enough. For a production app with real user traffic, estimate based on your expected token volumes and top up accordingly.
The function calling lesson covers the tool use patterns in more depth if you want to extend the agent with more sophisticated tools — database lookups, API calls, custom business logic. The ReAct prompting lesson explains why the reasoning loop works the way it does, which helps when you're debugging agents that get stuck or loop unnecessarily.
One more thing: n8n integration
If you prefer visual workflows over code, AICredits works with n8n. Add an OpenAI node, set the base URL to https://api.aicredits.in/v1, and use your AICredits key as the API key. All the same models are available through the dropdown (they show up because the endpoint is OpenAI-compatible).
This means you can build production AI workflows in INR without writing any API integration code — just connect nodes and use UPI to fund the runs.
Sign up at aicredits.in, top up ₹100, and you'll have the agent above running in under 20 minutes.



