My phone buzzed at 11:43pm. "What are your charges?" Third time that week, different person. Same question I'd answered on the website, in my Instagram bio, and in the pinned WhatsApp message. I'm not complaining about interested customers — I'm complaining about answering the same five questions on repeat while trying to sleep.
WhatsApp handles over 90% of Indian business communication. Unlike email, people actually read it. Unlike Instagram DMs, it works on the cheapest phones. But that also means your customers expect replies at any hour, and if you're a solo founder or small team, that's a problem.
Here's what I built to solve it: a WhatsApp bot that answers FAQs, handles basic queries, and politely escalates to me when it doesn't know something. It runs 24/7 for about ₹15-20/day in API costs. Here's exactly how.
## What the bot does
- Answers up to 20 common FAQs from a system prompt I wrote (charges, timelines, what's included, how to book)
- Maintains conversation context for 5 messages so follow-up questions work naturally
- Replies in the user's language — handles Hindi-English mix without explicit configuration
- When it doesn't know something, it says so and offers to connect them to a human (me, via a specific reply phrase)
- Runs 24/7 with zero uptime management on Render's free tier
What it doesn't do: real-time order lookups, payment processing, anything requiring a database query. That's a Phase 2 problem.
## The stack
| Component | Tool | Cost |
|---|---|---|
| WhatsApp integration | Twilio WhatsApp sandbox | Free for testing; $0.005/message in prod (~₹0.42/message) |
| AI responses | Claude Haiku 3.5 via AICredits.in | ~₹15-20/day for moderate volume |
| Webhook server | Flask (Python) | Free |
| Hosting | Render free tier | Free |
| Conversation memory | In-process dict (demo) | Free |
Total infra cost: ~₹450-600/month in AI spend for a business getting 50-100 WhatsApp messages/day. Twilio's per-message fee is extra once you move off the free sandbox.
## Step-by-step build

### Set up Twilio WhatsApp sandbox
1. Create a Twilio account at twilio.com — note that Twilio requires an international card for billing. You get $15 trial credit, which lasts a long time for testing. If you don't have an international card, a friend or family member abroad can help, or use a virtual card service like Niyo Global.
2. In the Twilio console, go to Messaging → Try it out → Send a WhatsApp message. You'll see the sandbox number and a join code.
3. From your WhatsApp, send the join code to the Twilio sandbox number (format: "join [your-code]"). Your number is now connected to the sandbox.
4. In the sandbox settings, you'll configure the webhook URL later, once the Flask server is deployed.
### Write the Flask webhook server
Here's the complete working code. This is copy-paste ready:
```python
# app.py
import os

from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse
from openai import OpenAI

app = Flask(__name__)

# Configure AICredits.in as the API endpoint
# Get your API key from aicredits.in after signing up
client = OpenAI(
    api_key=os.environ.get("AICREDITS_API_KEY"),
    base_url="https://api.aicredits.in/v1",
)

# System prompt: customise this for your business
# Be specific — the more specific your instructions, the better the answers
SYSTEM_PROMPT = """You are a helpful assistant for [YOUR BUSINESS NAME].

ABOUT THE BUSINESS:
[2-3 sentences describing what you do]

PRICING:
[Your pricing — be specific with ₹ amounts]

WHAT'S INCLUDED:
[List what customers get]

TIMELINE / PROCESS:
[How long things take, what the process is]

FREQUENTLY ASKED QUESTIONS:
Q: [Common question 1]
A: [Your answer]
Q: [Common question 2]
A: [Your answer]
[Add 10-15 more FAQs]

ESCALATION:
If someone asks something you're not sure about, say: "That's a great question — let me connect you with [OWNER NAME] directly. Reply 'HUMAN' and they'll get back to you within a few hours."

LANGUAGE:
Reply in the same language the customer writes in. Handle Hindi-English mix naturally. Keep responses concise — WhatsApp messages, not essays.

TONE:
Friendly, helpful, like a knowledgeable team member. Not robotic. Use the customer's name if they share it."""

# In-memory conversation store — use Redis in production
# Key: phone number, Value: list of message dicts
conversations = {}
MAX_HISTORY = 5  # Keep last 5 message pairs


def get_conversation_history(phone_number):
    """Get conversation history for a phone number."""
    return conversations.get(phone_number, [])


def update_conversation(phone_number, user_message, assistant_reply):
    """Add new messages to conversation history, trimming if needed."""
    if phone_number not in conversations:
        conversations[phone_number] = []
    conversations[phone_number].append({"role": "user", "content": user_message})
    conversations[phone_number].append({"role": "assistant", "content": assistant_reply})
    # Keep only the last MAX_HISTORY exchanges (pairs of messages)
    if len(conversations[phone_number]) > MAX_HISTORY * 2:
        conversations[phone_number] = conversations[phone_number][-(MAX_HISTORY * 2):]


def get_ai_response(phone_number, user_message):
    """Get a response from Claude via AICredits.in."""
    history = get_conversation_history(phone_number)
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history)
    messages.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="claude-haiku-3-5",  # Fast and cheap — ideal for FAQ bots
        messages=messages,
        max_tokens=500,   # Keep responses WhatsApp-length
        temperature=0.3,  # Lower temp = more consistent, less creative
    )
    return response.choices[0].message.content


@app.route("/webhook", methods=["POST"])
def webhook():
    """Handle incoming WhatsApp messages from Twilio."""
    incoming_msg = request.values.get("Body", "").strip()
    from_number = request.values.get("From", "")

    # Handle human escalation request
    if incoming_msg.upper() == "HUMAN":
        reply = ("Got it — I'll let [OWNER NAME] know you want to chat. "
                 "Expect a reply within a few hours. For urgent matters, call [PHONE NUMBER].")
        # TODO: Send yourself a Slack/WhatsApp notification here
    else:
        try:
            reply = get_ai_response(from_number, incoming_msg)
            update_conversation(from_number, incoming_msg, reply)
        except Exception as e:
            reply = ("Sorry, I'm having a technical hiccup. Try again in a moment, "
                     "or reply 'HUMAN' to reach us directly.")
            print(f"Error generating response: {e}")

    # Build Twilio response
    resp = MessagingResponse()
    resp.message(reply)
    return str(resp)


@app.route("/health", methods=["GET"])
def health():
    return {"status": "ok"}, 200


if __name__ == "__main__":
    port = int(os.environ.get("PORT", 5000))
    app.run(host="0.0.0.0", port=port, debug=False)
```
```text
# requirements.txt
flask==3.0.0
twilio==8.13.0
openai==1.12.0
gunicorn==21.2.0
```
A few things worth explaining in this code:
Why base_url on the OpenAI client? AICredits.in implements the OpenAI-compatible API, so you can use the standard openai Python library pointed at their endpoint. This means you can swap between providers by just changing the model name and key.
Why temperature=0.3? FAQ bots should be consistent. You want the same answer to "what are your charges?" every time, not creative variations. Lower temperature reduces hallucination risk too.
Why max_tokens=500? WhatsApp messages read better when they're short. 500 tokens is roughly 375 words — more than enough for an FAQ answer.
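Why `str(resp)` at the end of the webhook? Twilio expects the reply as TwiML, a small XML format, and `MessagingResponse` builds it for you. To show what's actually on the wire, here's a hypothetical stand-in using only the standard library (not part of the app above, just an illustration of the format):

```python
import xml.sax.saxutils as su

def build_twiml_reply(text):
    """Build the <Response><Message> TwiML that Twilio expects back
    from a messaging webhook. Stand-in for twilio's MessagingResponse."""
    escaped = su.escape(text)  # XML-escape the reply text
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            f'<Response><Message>{escaped}</Message></Response>')
```

Twilio reads this XML out of your HTTP response and sends the `<Message>` body back to the customer on WhatsApp.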
### Add conversation memory for production
The in-memory dict works fine for demos and low-traffic scenarios (restarts clear the history, but that's acceptable for FAQ bots). For production, replace it with Redis:
```python
# Production conversation memory with Redis
import json
import os

import redis

r = redis.from_url(os.environ.get("REDIS_URL"))


def get_conversation_history(phone_number):
    data = r.get(f"conv:{phone_number}")
    return json.loads(data) if data else []


def update_conversation(phone_number, user_message, assistant_reply):
    history = get_conversation_history(phone_number)
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": assistant_reply})
    # Keep last 10 messages, expire after 24 hours
    if len(history) > 10:
        history = history[-10:]
    r.setex(f"conv:{phone_number}", 86400, json.dumps(history))
```
Render's free tier doesn't include Redis, but Redis Cloud has a free 30MB tier that's more than enough for conversation history.
### Deploy to Render
Create a render.yaml in your project root:
```yaml
services:
  - type: web
    name: whatsapp-bot
    env: python
    buildCommand: pip install -r requirements.txt
    startCommand: gunicorn app:app
    envVars:
      - key: AICREDITS_API_KEY
        sync: false  # Set this in Render dashboard, not in file
```
Deploy steps:

1. Push your code to GitHub (make sure `.gitignore` includes `.env` and any local secrets)
2. Go to render.com → New → Web Service → connect your GitHub repo
3. Add `AICREDITS_API_KEY` as an environment variable in the Render dashboard (get this from AICredits.in after signing up)
4. Deploy — takes about 2-3 minutes
5. Copy your Render service URL (format: `https://whatsapp-bot-xxxx.onrender.com`)
6. Go back to Twilio sandbox settings → paste your URL as the webhook: `https://your-render-url.onrender.com/webhook`
Send your bot a test message on WhatsApp. If it replies, you're done.
Render free tier gotcha: Free tier services spin down after 15 minutes of inactivity and take about 30 seconds to wake up, so the first message after an idle period gets a delayed reply. For a business bot, upgrade to Render's $7/month plan (₹590/month) to keep it always-on.
## What it cost to build and run
| Item | Cost | Notes |
|---|---|---|
| Twilio trial credit | $15 free (₹1,260) | Lasts months at low volume |
| AICredits.in top-up | ₹100 initial | Recharge as needed |
| Render free tier | ₹0 | Fine for testing; ₹590/month for always-on |
| My time | 3 hours | First-time setup; 30 min to update prompts later |
Running cost at ~100 messages/day: approximately ₹15-20/day in AI costs, plus ₹0.42/message on Twilio once you move from the free sandbox to the paid WhatsApp Business API (about ₹1,260/month at that volume). Total: roughly ₹500-600/month while testing on the sandbox, or around ₹1,700-1,900/month in production.
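The AI figure holds up to a back-of-envelope check. Assuming Claude 3.5 Haiku's list pricing of $0.80 per million input tokens and $4 per million output tokens, an exchange rate of roughly ₹84 to the dollar, and guessed (not measured) token counts for this bot:

```python
# All numbers here are assumptions: list pricing for Claude 3.5 Haiku,
# a rough exchange rate, and guessed token counts per message.
INPUT_PRICE_USD = 0.80 / 1_000_000   # per input token
OUTPUT_PRICE_USD = 4.00 / 1_000_000  # per output token
USD_TO_INR = 84

input_tokens = 800    # system prompt + 5-message history + new question
output_tokens = 150   # a short WhatsApp-length reply

usd_per_message = input_tokens * INPUT_PRICE_USD + output_tokens * OUTPUT_PRICE_USD
inr_per_day = usd_per_message * USD_TO_INR * 100  # at 100 messages/day
```

That lands near ₹10/day with a short prompt; a fuller system prompt (15-20 FAQs is easily 1,500+ tokens) roughly doubles the input cost, which is how you end up in the ₹15-20/day range.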
Compare that to the alternative: hiring someone to reply to WhatsApp messages part-time. At ₹15,000/month minimum, the bot pays for itself in the first day.
💡 For the Claude API access in ₹, I used AICredits.in — UPI payment, no international card needed. Minimum ₹100 recharge.
## What I'd improve next time
Conversation history in a database: The in-memory approach works but restarts lose context. Redis or even a simple SQLite file would persist conversation history across deploys.
Connect to order management via API: Right now the bot can only answer static FAQs. If I connected it to my booking system, it could check appointment availability, send confirmation numbers, and look up order status. That requires function calling — the bot needs to call my API with the customer's query.
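As a sketch of what that could look like: in the OpenAI-compatible API, function calling means passing a `tools` list and executing the function yourself when the model requests it. The schema below is hypothetical (there is no `check_availability` endpoint yet), and I haven't verified that AICredits.in forwards tool definitions through to Claude:

```python
# Hypothetical tool definition in the OpenAI-compatible "tools" format.
# You'd pass tools=[AVAILABILITY_TOOL] to client.chat.completions.create,
# then run the lookup yourself when the response contains a tool call.
AVAILABILITY_TOOL = {
    "type": "function",
    "function": {
        "name": "check_availability",
        "description": "Check open appointment slots for a given date.",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {"type": "string", "description": "Date in YYYY-MM-DD format"},
            },
            "required": ["date"],
        },
    },
}
```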
Human handoff flow via Slack: When someone replies "HUMAN", I currently get nothing — the code just sends a generic message. The right implementation: send a Slack DM or WhatsApp message to me with the conversation history so I can pick up exactly where the bot left off.
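Formatting the notification is the easy half. A small hypothetical helper like this could summarise the tail of the conversation; the resulting text could then go to your own number via Twilio's REST client or to a Slack incoming webhook:

```python
def build_handoff_notice(customer_number, history, max_messages=6):
    """Summarise the last few exchanges for the human taking over.
    `history` is the same list of {"role", "content"} dicts the bot keeps."""
    lines = [f"Handoff requested by {customer_number}. Recent conversation:"]
    for msg in history[-max_messages:]:
        who = "Customer" if msg["role"] == "user" else "Bot"
        lines.append(f"{who}: {msg['content']}")
    return "\n".join(lines)
```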
Analytics: I have no idea which questions the bot answers badly. Adding logging to a simple spreadsheet or Notion database via Zapier would tell me which FAQs need better answers.
Rate limiting: A customer who's frustrated might send 50 messages in a row. Without rate limiting, that's your API bill. Add a simple counter per phone number with a reset window.
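A sliding-window counter is enough for this. The sketch below follows the same in-process pattern as the demo's conversation store (so it shares the same restart caveat); in production the timestamps would live in Redis alongside the conversation history:

```python
import time

RATE_LIMIT = 10      # max messages per window per phone number
WINDOW_SECONDS = 60  # sliding window length

_message_times = {}  # phone number -> timestamps of recent messages

def is_rate_limited(phone_number, now=None):
    """Return True if this number already sent RATE_LIMIT messages in the window."""
    now = time.time() if now is None else now
    # Drop timestamps that have fallen out of the window
    recent = [t for t in _message_times.get(phone_number, []) if now - t < WINDOW_SECONDS]
    if len(recent) >= RATE_LIMIT:
        _message_times[phone_number] = recent
        return True
    recent.append(now)
    _message_times[phone_number] = recent
    return False
```

In the webhook, check `is_rate_limited(from_number)` before calling the model and reply with a "please slow down" message instead of burning tokens.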
## What to read next
- n8n + Claude for Indian business automation — automate workflows without writing code
- OpenClaw WhatsApp and Telegram setup — if you want an agent instead of a FAQ bot
- Build your first AI agent — go deeper on agent architecture
- Function calling explained — how to connect your bot to live data sources



