My phone buzzed at 11:43pm. "What are your charges?" Third time that week, different person. Same question I'd answered on the website, in my Instagram bio, and in the pinned WhatsApp message. I'm not complaining about interested customers — I'm complaining about answering the same five questions on repeat while trying to sleep.
WhatsApp handles over 90% of Indian business communication. Unlike email, people actually read it. Unlike Instagram DMs, it works on the cheapest phones. But that also means your customers expect replies at any hour, and if you're a solo founder or small team, that's a problem.
Here's what I built to solve it: a WhatsApp bot that answers FAQs, handles basic queries, and politely escalates to me when it doesn't know something. It runs 24/7 for about ₹15-20/day in API costs. Here's exactly how.
## What the bot does
- Answers up to 20 common FAQs from a system prompt I wrote (charges, timelines, what's included, how to book)
- Maintains conversation context for 5 messages so follow-up questions work naturally
- Replies in the user's language — handles Hindi-English mix without explicit configuration
- When it doesn't know something, it says so and offers to connect them to a human (me, via a specific reply phrase)
- Runs 24/7 with zero uptime management on Render's free tier
What it doesn't do: real-time order lookups, payment processing, anything requiring a database query. That's a Phase 2 problem.
## The stack
| Component | Tool | Cost |
|---|---|---|
| WhatsApp integration | Twilio WhatsApp sandbox | Free for testing; $0.005/message in prod (~₹0.42/message) |
| AI responses | Claude Haiku 3.5 via AICredits.in | ~₹15-20/day for moderate volume |
| Webhook server | Flask (Python) | Free |
| Hosting | Render free tier | Free |
| Conversation memory | In-process dict (demo) | Free |
Total infra cost: ~₹450-600/month in AI spend for a business getting 50-100 WhatsApp messages/day. Twilio's per-message fee is extra once you move off the free sandbox.
## Step-by-step build

### Set up Twilio WhatsApp sandbox
1. Create a Twilio account at twilio.com — note that Twilio requires an international card for billing. You get $15 trial credit, which lasts a long time for testing. If you don't have an international card, a friend or family member abroad can help, or use a virtual card service like Niyo Global.
2. In the Twilio console, go to Messaging → Try it out → Send a WhatsApp message. You'll see the sandbox number and a join code.
3. From your WhatsApp, send the join code to the Twilio sandbox number (format: "join [your-code]"). Your number is now connected to the sandbox.
4. In the sandbox settings, you'll configure the webhook URL later, once the Flask server is deployed.
### Write the Flask webhook server
Here's the complete working code. This is copy-paste ready:
```python
# app.py
import os

from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse
from openai import OpenAI

app = Flask(__name__)

# Configure AICredits.in as the API endpoint
# Get your API key from aicredits.in after signing up
client = OpenAI(
    api_key=os.environ.get("AICREDITS_API_KEY"),
    base_url="https://api.aicredits.in/v1",
)

# System prompt: customise this for your business
# Be specific — the more specific your instructions, the better the answers
SYSTEM_PROMPT = """You are a helpful assistant for [YOUR BUSINESS NAME].

ABOUT THE BUSINESS:
[2-3 sentences describing what you do]

PRICING:
[Your pricing — be specific with ₹ amounts]

WHAT'S INCLUDED:
[List what customers get]

TIMELINE / PROCESS:
[How long things take, what the process is]

FREQUENTLY ASKED QUESTIONS:
Q: [Common question 1]
A: [Your answer]
Q: [Common question 2]
A: [Your answer]
[Add 10-15 more FAQs]

ESCALATION:
If someone asks something you're not sure about, say: "That's a great question — let me connect you with [OWNER NAME] directly. Reply 'HUMAN' and they'll get back to you within a few hours."

LANGUAGE:
Reply in the same language the customer writes in. Handle Hindi-English mix naturally. Keep responses concise — WhatsApp messages, not essays.

TONE:
Friendly, helpful, like a knowledgeable team member. Not robotic. Use the customer's name if they share it."""

# In-memory conversation store — use Redis in production
# Key: phone number, Value: list of message dicts
conversations = {}
MAX_HISTORY = 5  # Keep last 5 message pairs


def get_conversation_history(phone_number):
    """Get conversation history for a phone number."""
    return conversations.get(phone_number, [])


def update_conversation(phone_number, user_message, assistant_reply):
    """Add new messages to conversation history, trimming if needed."""
    if phone_number not in conversations:
        conversations[phone_number] = []
    conversations[phone_number].append({"role": "user", "content": user_message})
    conversations[phone_number].append({"role": "assistant", "content": assistant_reply})
    # Keep only the last MAX_HISTORY exchanges (pairs of messages)
    if len(conversations[phone_number]) > MAX_HISTORY * 2:
        conversations[phone_number] = conversations[phone_number][-(MAX_HISTORY * 2):]


def get_ai_response(phone_number, user_message):
    """Get a response from Claude via AICredits.in."""
    history = get_conversation_history(phone_number)
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history)
    messages.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="claude-haiku-3-5",  # Fast and cheap — ideal for FAQ bots
        messages=messages,
        max_tokens=500,   # Keep responses WhatsApp-length
        temperature=0.3,  # Lower temp = more consistent, less creative
    )
    return response.choices[0].message.content


@app.route("/webhook", methods=["POST"])
def webhook():
    """Handle incoming WhatsApp messages from Twilio."""
    incoming_msg = request.values.get("Body", "").strip()
    from_number = request.values.get("From", "")

    # Handle human escalation request
    if incoming_msg.upper() == "HUMAN":
        reply = ("Got it — I'll let [OWNER NAME] know you want to chat. "
                 "Expect a reply within a few hours. For urgent matters, call [PHONE NUMBER].")
        # TODO: Send yourself a Slack/WhatsApp notification here
    else:
        try:
            reply = get_ai_response(from_number, incoming_msg)
            update_conversation(from_number, incoming_msg, reply)
        except Exception as e:
            reply = ("Sorry, I'm having a technical hiccup. Try again in a moment, "
                     "or reply 'HUMAN' to reach us directly.")
            print(f"Error generating response: {e}")

    # Build Twilio response
    resp = MessagingResponse()
    resp.message(reply)
    return str(resp)


@app.route("/health", methods=["GET"])
def health():
    return {"status": "ok"}, 200


if __name__ == "__main__":
    port = int(os.environ.get("PORT", 5000))
    app.run(host="0.0.0.0", port=port, debug=False)
```
```text
# requirements.txt
flask==3.0.0
twilio==8.13.0
openai==1.12.0
gunicorn==21.2.0
```
A few things worth explaining in this code:
Why base_url on the OpenAI client? AICredits.in implements the OpenAI-compatible API, so you can use the standard openai Python library pointed at their endpoint. This means you can swap between providers by just changing the model name and key.
Why temperature=0.3? FAQ bots should be consistent. You want the same answer to "what are your charges?" every time, not creative variations. Lower temperature reduces hallucination risk too.
Why max_tokens=500? WhatsApp messages read better when they're short. 500 tokens is roughly 375 words — more than enough for an FAQ answer.
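Why `str(resp)` at the end of the webhook? Twilio expects the reply as TwiML, a small XML format, and `MessagingResponse` builds it for you. To show what's actually on the wire, here's a hypothetical stand-in using only the standard library (not part of the app above, just an illustration of the format):

```python
import xml.sax.saxutils as su

def build_twiml_reply(text):
    """Build the <Response><Message> TwiML that Twilio expects back
    from a messaging webhook. Stand-in for twilio's MessagingResponse."""
    escaped = su.escape(text)  # XML-escape the reply text
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            f'<Response><Message>{escaped}</Message></Response>')
```

Twilio reads this XML out of your HTTP response and sends the `<Message>` body back to the customer on WhatsApp.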
### Add conversation memory for production
The in-memory dict works fine for demos and low-traffic scenarios (restarts clear the history, but that's acceptable for FAQ bots). For production, replace it with Redis:
```python
# Production conversation memory with Redis
import json
import os

import redis

r = redis.from_url(os.environ.get("REDIS_URL"))


def get_conversation_history(phone_number):
    data = r.get(f"conv:{phone_number}")
    return json.loads(data) if data else []


def update_conversation(phone_number, user_message, assistant_reply):
    history = get_conversation_history(phone_number)
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": assistant_reply})
    # Keep last 10 messages, expire after 24 hours
    if len(history) > 10:
        history = history[-10:]
    r.setex(f"conv:{phone_number}", 86400, json.dumps(history))
```
Render's free tier doesn't include Redis, but Redis Cloud has a free 30MB tier that's more than enough for conversation history.
### Deploy to Render
Create a render.yaml in your project root:
```yaml
services:
  - type: web
    name: whatsapp-bot
    env: python
    buildCommand: pip install -r requirements.txt
    startCommand: gunicorn app:app
    envVars:
      - key: AICREDITS_API_KEY
        sync: false  # Set this in Render dashboard, not in file
```
Deploy steps:

1. Push your code to GitHub (make sure `.gitignore` includes `.env` and any local secrets)
2. Go to render.com → New → Web Service → connect your GitHub repo
3. Add `AICREDITS_API_KEY` as an environment variable in the Render dashboard (get this from AICredits.in after signing up)
4. Deploy — takes about 2-3 minutes
5. Copy your Render service URL (format: `https://whatsapp-bot-xxxx.onrender.com`)
6. Go back to Twilio sandbox settings → paste your URL as the webhook: `https://your-render-url.onrender.com/webhook`
Send your bot a test message on WhatsApp. If it replies, you're done.
Render free tier gotcha: Free tier services spin down after 15 minutes of inactivity and take about 30 seconds to wake up, so the first message after an idle period gets a delayed reply. For a business bot, upgrade to Render's $7/month plan (₹590/month) to keep it always-on.
## What it cost to build and run
| Item | Cost | Notes |
|---|---|---|
| Twilio trial credit | $15 free (₹1,260) | Lasts months at low volume |
| AICredits.in top-up | ₹100 initial | Recharge as needed |
| Render free tier | ₹0 | Fine for testing; ₹590/month for always-on |
| My time | 3 hours | First-time setup; 30 min to update prompts later |
Running cost at ~100 messages/day: approximately ₹15-20/day in AI costs, plus ₹0.42/message on Twilio once you move from the free sandbox to the paid WhatsApp Business API (about ₹1,260/month at that volume). Total: roughly ₹500-600/month while testing on the sandbox, or around ₹1,700-1,900/month in production.
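The AI figure holds up to a back-of-envelope check. Assuming Claude 3.5 Haiku's list pricing of $0.80 per million input tokens and $4 per million output tokens, an exchange rate of roughly ₹84 to the dollar, and guessed (not measured) token counts for this bot:

```python
# All numbers here are assumptions: list pricing for Claude 3.5 Haiku,
# a rough exchange rate, and guessed token counts per message.
INPUT_PRICE_USD = 0.80 / 1_000_000   # per input token
OUTPUT_PRICE_USD = 4.00 / 1_000_000  # per output token
USD_TO_INR = 84

input_tokens = 800    # system prompt + 5-message history + new question
output_tokens = 150   # a short WhatsApp-length reply

usd_per_message = input_tokens * INPUT_PRICE_USD + output_tokens * OUTPUT_PRICE_USD
inr_per_day = usd_per_message * USD_TO_INR * 100  # at 100 messages/day
```

That lands near ₹10/day with a short prompt; a fuller system prompt (15-20 FAQs is easily 1,500+ tokens) roughly doubles the input cost, which is how you end up in the ₹15-20/day range.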
Compare that to the alternative: hiring someone to reply to WhatsApp messages part-time. At ₹15,000/month minimum, the bot pays for itself in the first day.
💡 For the Claude API access in ₹, I used AICredits.in — UPI payment, no international card needed. Minimum ₹100 recharge.
## What I'd improve next time
Conversation history in a database: The in-memory approach works but restarts lose context. Redis or even a simple SQLite file would persist conversation history across deploys.
Connect to order management via API: Right now the bot can only answer static FAQs. If I connected it to my booking system, it could check appointment availability, send confirmation numbers, and look up order status. That requires function calling — the bot needs to call my API with the customer's query.
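As a sketch of what that could look like: in the OpenAI-compatible API, function calling means passing a `tools` list and executing the function yourself when the model requests it. The schema below is hypothetical (there is no `check_availability` endpoint yet), and I haven't verified that AICredits.in forwards tool definitions through to Claude:

```python
# Hypothetical tool definition in the OpenAI-compatible "tools" format.
# You'd pass tools=[AVAILABILITY_TOOL] to client.chat.completions.create,
# then run the lookup yourself when the response contains a tool call.
AVAILABILITY_TOOL = {
    "type": "function",
    "function": {
        "name": "check_availability",
        "description": "Check open appointment slots for a given date.",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {"type": "string", "description": "Date in YYYY-MM-DD format"},
            },
            "required": ["date"],
        },
    },
}
```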
Human handoff flow via Slack: When someone replies "HUMAN", I currently get nothing — the code just sends a generic message. The right implementation: send a Slack DM or WhatsApp message to me with the conversation history so I can pick up exactly where the bot left off.
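Formatting the notification is the easy half. A small hypothetical helper like this could summarise the tail of the conversation; the resulting text could then go to your own number via Twilio's REST client or to a Slack incoming webhook:

```python
def build_handoff_notice(customer_number, history, max_messages=6):
    """Summarise the last few exchanges for the human taking over.
    `history` is the same list of {"role", "content"} dicts the bot keeps."""
    lines = [f"Handoff requested by {customer_number}. Recent conversation:"]
    for msg in history[-max_messages:]:
        who = "Customer" if msg["role"] == "user" else "Bot"
        lines.append(f"{who}: {msg['content']}")
    return "\n".join(lines)
```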
Analytics: I have no idea which questions the bot answers badly. Adding logging to a simple spreadsheet or Notion database via Zapier would tell me which FAQs need better answers.
Rate limiting: A customer who's frustrated might send 50 messages in a row. Without rate limiting, that's your API bill. Add a simple counter per phone number with a reset window.
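A sliding-window counter is enough for this. The sketch below follows the same in-process pattern as the demo's conversation store (so it shares the same restart caveat); in production the timestamps would live in Redis alongside the conversation history:

```python
import time

RATE_LIMIT = 10      # max messages per window per phone number
WINDOW_SECONDS = 60  # sliding window length

_message_times = {}  # phone number -> timestamps of recent messages

def is_rate_limited(phone_number, now=None):
    """Return True if this number already sent RATE_LIMIT messages in the window."""
    now = time.time() if now is None else now
    # Drop timestamps that have fallen out of the window
    recent = [t for t in _message_times.get(phone_number, []) if now - t < WINDOW_SECONDS]
    if len(recent) >= RATE_LIMIT:
        _message_times[phone_number] = recent
        return True
    recent.append(now)
    _message_times[phone_number] = recent
    return False
```

In the webhook, check `is_rate_limited(from_number)` before calling the model and reply with a "please slow down" message instead of burning tokens.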
## What to read next
- n8n + Claude for Indian business automation — automate workflows without writing code
- OpenClaw WhatsApp and Telegram setup — if you want an agent instead of a FAQ bot
- Build your first AI agent — go deeper on agent architecture
- Function calling explained — how to connect your bot to live data sources



