I used to spend 45 minutes each morning on email triage. Classification, drafting, routing, flagging. The same patterns, every day. I automated it and got that time back.
This post shows you how to build a Gmail agent with Python and Claude that reads your unread threads, classifies them, drafts replies that match the sender's tone, and saves them as Gmail drafts for your review. It never auto-sends — that's intentional.
Architecture
Gmail API (read unread threads)
→ Claude: classify each thread
→ Claude: draft reply for support/sales threads
→ Gmail API: apply labels + create draft
The agent runs on a schedule (cron or n8n timer). Every 30 minutes, it checks for new threads and processes them. You open Gmail, review drafts, edit if needed, and send. The tedious parts are done.
Step 1: Gmail API setup
Go to Google Cloud Console, create a project, enable the Gmail API, and create OAuth 2.0 credentials (Desktop app type). Download the credentials.json file.
pip install google-auth-oauthlib google-auth-httplib2 google-api-python-client anthropic
First-run auth flow — this opens a browser and saves token.json:
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
import os
SCOPES = [
"https://www.googleapis.com/auth/gmail.readonly",
"https://www.googleapis.com/auth/gmail.modify",
"https://www.googleapis.com/auth/gmail.compose",
]
def get_gmail_service():
creds = None
if os.path.exists("token.json"):
creds = Credentials.from_authorized_user_file("token.json", SCOPES)
if not creds or not creds.valid:
if creds and creds.expired and creds.refresh_token:
creds.refresh(Request())
else:
flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
creds = flow.run_local_server(port=0)
with open("token.json", "w") as f:
f.write(creds.to_json())
from googleapiclient.discovery import build
return build("gmail", "v1", credentials=creds)
Run this once interactively. After that, token.json handles auth silently.
Step 2: Read unread threads
import base64
from email import message_from_bytes
def get_unread_threads(service, max_results: int = 50) -> list[dict]:
result = service.users().threads().list(
userId="me",
q="is:unread -from:noreply -from:no-reply -category:promotions",
maxResults=max_results,
).execute()
threads = []
for thread_meta in result.get("threads", []):
thread = service.users().threads().get(
userId="me",
id=thread_meta["id"],
format="full",
).execute()
messages = thread.get("messages", [])
if not messages:
continue
# Extract the most recent message text
latest = messages[-1]
payload = latest["payload"]
body_text = extract_body(payload)
sender = next(
(h["value"] for h in latest["payload"]["headers"] if h["name"] == "From"), ""
)
subject = next(
(h["value"] for h in latest["payload"]["headers"] if h["name"] == "Subject"), ""
)
threads.append({
"thread_id": thread_meta["id"],
"message_id": latest["id"],
"sender": sender,
"subject": subject,
"body": body_text[:3000], # cap at 3K chars — enough context, not too many tokens
"message_count": len(messages),
})
return threads
def extract_body(payload: dict) -> str:
if payload.get("body", {}).get("data"):
return base64.urlsafe_b64decode(payload["body"]["data"]).decode("utf-8", errors="replace")
for part in payload.get("parts", []):
if part["mimeType"] == "text/plain":
data = part.get("body", {}).get("data", "")
if data:
return base64.urlsafe_b64decode(data).decode("utf-8", errors="replace")
return ""
The query is:unread -from:noreply -category:promotions skips newsletters and automated emails you don't want drafts for.
Step 3: Classify with Claude
For each thread, ask Claude to classify it and decide if a draft reply is needed:
import anthropic, json
client = anthropic.Anthropic()
CLASSIFY_SYSTEM = """You are an email assistant. For each email, return a JSON object with:
- category: one of "support", "sales", "urgent", "newsletter", "internal", "other"
- priority: integer 1-5 (5 = most urgent)
- summary: one sentence describing what the email is about
- needs_reply: boolean — true if a human reply is appropriate
- tone: "formal", "casual", or "technical" — based on the sender's writing style
Return only valid JSON, no other text."""
def classify_thread(thread: dict) -> dict:
prompt = f"""From: {thread['sender']}
Subject: {thread['subject']}
{thread['body']}"""
response = client.messages.create(
model="claude-haiku-4-5-20251001", # Haiku for cheap classification
max_tokens=200,
system=CLASSIFY_SYSTEM,
messages=[{"role": "user", "content": prompt}],
)
try:
return json.loads(response.content[0].text)
except json.JSONDecodeError:
return {"category": "other", "priority": 2, "needs_reply": False, "tone": "formal"}
Haiku handles classification perfectly and costs 10× less than Sonnet. Save Sonnet for draft generation.
Step 4: Draft replies
For threads that need a reply, generate a draft that matches the sender's tone:
DRAFT_SYSTEM = """You are a professional email assistant. Write reply emails that:
- Match the sender's formality level exactly ({tone})
- Answer the specific question or address the specific request — don't be generic
- Are under 120 words unless the question requires more detail
- Don't start with "I hope this email finds you well" or similar filler
- Don't sign off — the human will add their signature
- Flag anything needing urgent human judgment with [REVIEW NEEDED: reason]
Write only the email body text, no subject line."""
def draft_reply(thread: dict, classification: dict) -> str:
system = DRAFT_SYSTEM.format(tone=classification.get("tone", "formal"))
prompt = f"""Email to reply to:
From: {thread['sender']}
Subject: {thread['subject']}
{thread['body']}
Summary: {classification.get('summary', '')}"""
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=500,
system=system,
messages=[{"role": "user", "content": prompt}],
)
return response.content[0].text
The tone-matching instruction is the most important part. An email drafted in formal legal language should get a formal reply. A casual "hey quick question" from a colleague should get a casual reply. Claude reads the tone from the original and mirrors it reliably when you ask explicitly.
Step 5: Create Gmail drafts and apply labels
import base64
from email.mime.text import MIMEText
def create_draft(service, thread_id: str, to: str, subject: str, body: str) -> str:
message = MIMEText(body)
message["to"] = to
message["subject"] = f"Re: {subject}" if not subject.startswith("Re:") else subject
raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
draft = service.users().drafts().create(
userId="me",
body={"message": {"raw": raw, "threadId": thread_id}},
).execute()
return draft["id"]
def apply_label(service, thread_id: str, label_name: str):
# Get or create label
labels = service.users().labels().list(userId="me").execute().get("labels", [])
label_id = next((l["id"] for l in labels if l["name"] == label_name), None)
if not label_id:
label = service.users().labels().create(
userId="me", body={"name": label_name}
).execute()
label_id = label["id"]
service.users().threads().modify(
userId="me",
id=thread_id,
body={"addLabelIds": [label_id]},
).execute()
Step 6: The main loop
def run_email_agent():
service = get_gmail_service()
threads = get_unread_threads(service)
processed = 0
for thread in threads:
classification = classify_thread(thread)
# Apply label based on category
apply_label(service, thread["thread_id"], f"AI/{classification['category']}")
# Draft reply for actionable threads
if classification.get("needs_reply") and classification["category"] in ("support", "sales", "urgent"):
draft_body = draft_reply(thread, classification)
# Extract just the email address from "Name <email@domain.com>"
to_email = thread["sender"].split("<")[-1].rstrip(">") if "<" in thread["sender"] else thread["sender"]
create_draft(service, thread["thread_id"], to_email, thread["subject"], draft_body)
processed += 1
print(f"Processed: {thread['subject'][:60]} → {classification['category']} (priority {classification['priority']})")
print(f"\nDone. Processed {processed} threads.")
if __name__ == "__main__":
run_email_agent()
Running on a schedule
The simplest production setup: a cron job on a VPS or a free Railway instance.
# Run every 30 minutes
*/30 * * * * cd /path/to/email-agent && python main.py >> logs/agent.log 2>&1
Or use an n8n schedule trigger connected to a Code node that runs this script — see the n8n automation guide for that setup.
Cost estimate
- 50 emails/day classified with Haiku: ~$0.003/day
- 20 drafts/day generated with Sonnet: ~$0.30/day
- Total: under ₹25/day
For an Indian SMB owner spending 45 minutes on email triage, that's a strong ROI.
What to watch out for
Don't auto-send: drafts give you a review step. The agent will occasionally draft something slightly off — a tone mismatch, an incomplete answer. That's fine. You catch it before it goes out.
Test with low-stakes emails first: start with a dedicated test inbox or a label filter that only passes certain senders to the agent.
The [REVIEW NEEDED] flag: when the agent includes this in a draft, read that draft carefully before sending. It means Claude identified something that needed human judgment — legal language, a sensitive situation, an ambiguous request.
Newsletters and automated emails: the -category:promotions filter in the Gmail query handles most of them, but you'll still get some. Add -from:yourdomain.com if you don't want internal emails processed.
The agent design patterns post covers more architectures if you want to extend this — adding a CRM integration, routing to team members, or handling attachments.



