An LLM call at step 7 of a 12-step workflow fails. What happens next?
With most agent implementations: the workflow crashes, the partial work is lost, and you retry from step 1. Steps 1–6 are re-executed, their API calls are re-made, their costs are re-spent.
With Temporal: the workflow resumes from step 7. The first six steps are replayed from the event log — no re-execution, no re-cost. The LLM call at step 7 retries with exponential backoff. Everything else continues normally.
This matters the moment your agents touch anything in the real world: send emails, update databases, process payments, generate invoices. Restarting a half-finished workflow from the top repeats those side effects unless every step happens to be idempotent. Temporal records each completed step, so only the failed step is retried.
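To make the replay idea concrete, here is a toy model in plain Python. It is only a sketch of the concept, not how Temporal is implemented: ToyDurableRunner and flaky_llm_step are hypothetical names, and the "event log" is just an in-memory list.

```python
from typing import Any, Callable


class ToyDurableRunner:
    """Toy model of replay: results of completed steps are kept in a history list."""

    def __init__(self) -> None:
        self.history: list[Any] = []  # results of completed steps, in order
        self.executions = 0           # how many times a step body actually ran

    def run(self, steps: list[Callable[[], Any]]) -> list[Any]:
        results = []
        for index, step in enumerate(steps):
            if index < len(self.history):
                # Completed on an earlier run: reuse the recorded result
                results.append(self.history[index])
                continue
            self.executions += 1
            result = step()  # may raise; earlier results stay in history
            self.history.append(result)
            results.append(result)
        return results


calls = {"llm": 0}


def flaky_llm_step() -> str:
    """Fails on its first call, succeeds afterwards."""
    calls["llm"] += 1
    if calls["llm"] == 1:
        raise ConnectionError("LLM unavailable")
    return "structured-output"


steps = [lambda: "step-1", lambda: "step-2", flaky_llm_step, lambda: "step-4"]
runner = ToyDurableRunner()

try:
    runner.run(steps)  # fails at the third step; the first two are recorded
except ConnectionError:
    pass

final = runner.run(steps)  # resumes: only the third and fourth steps execute
print(final, runner.executions)
```

The first two steps run once in total, even though the workflow was started twice. Temporal persists this history on the server, which is what lets it survive a worker crash as well as an exception.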
When you need this
A simple chatbot doesn't need Temporal. Add durability when:
- The workflow runs longer than 60 seconds
- It involves 4+ external API calls that can fail independently
- Partial completion has real cost (processing 500 invoices, you don't want to redo the first 200 on failure)
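- You need each side effect to happen once (don't send the invoice email twice). Temporal runs an activity at least once, so the activity that sends the email should still be idempotent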
Setup
You need a Temporal server and a Python worker. The fastest local setup:
# Start Temporal server locally
brew install temporal
temporal server start-dev
# Python SDK
pip install temporalio anthropic requests pdfplumber
For production: Temporal Cloud (managed, starts at $25/month) or self-hosted on a VPS.
The core concepts
Activity — a single, retryable unit of work. Makes the API call, calls Claude, writes to the database. Each activity runs in isolation and can be retried independently.
Workflow — the orchestrator. Sequences activities, maintains state durably, survives crashes. Workflows must be deterministic — no direct I/O (that goes in activities).
Worker — the process that actually executes activities and workflows. Multiple workers can run in parallel.
Build an invoice processing pipeline
This pipeline: download a PDF and extract its text → call Claude to structure the data → post to Zoho Books. Three activities, each independently retryable.
import asyncio
import hashlib
import io
import json
import os
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.client import Client
from temporalio.common import RetryPolicy
from temporalio.worker import Worker

# Third-party libraries are only used inside activities; passing them through
# keeps Temporal's workflow sandbox from re-importing them on every run.
with workflow.unsafe.imports_passed_through():
    import anthropic
    import pdfplumber
    import requests

# ── Activities (actual work) ───────────────────────────────────────────────
@activity.defn
async def download_and_extract_pdf(pdf_url: str) -> str:
    """Download PDF from URL and extract text."""

    def fetch_and_parse() -> str:
        response = requests.get(pdf_url, timeout=30)
        response.raise_for_status()
        with pdfplumber.open(io.BytesIO(response.content)) as pdf:
            return "\n".join(page.extract_text() or "" for page in pdf.pages)

    # requests and pdfplumber block, so run them off the worker's event loop
    text = await asyncio.to_thread(fetch_and_parse)
    if not text.strip():
        raise ValueError(f"No text extracted from PDF: {pdf_url}")
    return text[:6000]  # Cap at 6K chars for LLM context

@activity.defn
async def extract_invoice_fields(text: str) -> dict:
    """Call Claude to extract structured invoice data."""
    client = anthropic.AsyncAnthropic()  # async client; reads ANTHROPIC_API_KEY
    response = await client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"""Extract invoice fields from this text. Return JSON with:
vendor_name, vendor_gstin, invoice_number, invoice_date,
line_items (list of {{description, amount}}), subtotal, gst_amount, total.
Text:
{text}"""
        }],
    )
    raw = response.content[0].text
    # Strip a markdown code fence if Claude wrapped the JSON in one
    if "```json" in raw:
        raw = raw.split("```json")[1].split("```")[0]
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        # Re-raise as a plain ValueError so a retry policy can match it by name
        raise ValueError(f"Claude returned unparseable JSON: {err}") from err

@activity.defn
async def post_to_zoho_books(invoice_data: dict) -> dict:
    """Post structured invoice to Zoho Books."""
    # Assumption: the OAuth token comes from an environment variable.
    # ZOHO_OAUTH_TOKEN is a placeholder name, not a Zoho convention.
    zoho_token = os.environ["ZOHO_OAUTH_TOKEN"]
    response = await asyncio.to_thread(
        requests.post,
        "https://books.zoho.in/api/v3/bills",
        headers={
            "Authorization": f"Zoho-oauthtoken {zoho_token}",
            "Content-Type": "application/json",
        },
        json={
            # Placeholder: a real integration must map the GSTIN to Zoho's own vendor_id
            "vendor_id": invoice_data.get("vendor_gstin", ""),
            "bill_number": invoice_data.get("invoice_number", ""),
            "date": invoice_data.get("invoice_date", ""),
            "total": invoice_data.get("total", 0),
        },
        timeout=20,  # stay inside the activity's 30s start-to-close timeout
    )
    if not response.ok:
        # RuntimeError, not ValueError: a transient Zoho failure should stay retryable
        raise RuntimeError(f"Zoho API error: {response.status_code}: {response.text}")
    return response.json()

# ── Workflow (orchestrator) ────────────────────────────────────────────────
@workflow.defn
class InvoiceProcessingWorkflow:
    @workflow.run
    async def run(self, pdf_url: str) -> dict:
        # Step 1: Extract PDF text
        # Up to 5 attempts with exponential backoff (2s, 4s, 8s, 16s between attempts)
        pdf_text = await workflow.execute_activity(
            download_and_extract_pdf,
            pdf_url,
            start_to_close_timeout=timedelta(seconds=60),
            retry_policy=RetryPolicy(
                maximum_attempts=5,
                initial_interval=timedelta(seconds=2),
                backoff_coefficient=2.0,
            ),
        )
        # Step 2: Claude extraction
        # If Claude is down: up to 3 attempts, waiting 30s and then 60s
        # (the default backoff coefficient is 2.0)
        invoice_data = await workflow.execute_activity(
            extract_invoice_fields,
            pdf_text,
            start_to_close_timeout=timedelta(seconds=120),
            retry_policy=RetryPolicy(
                maximum_attempts=3,
                initial_interval=timedelta(seconds=30),
            ),
        )
        # Step 3: Post to Zoho
        # If step 3 fails after steps 1-2 succeeded,
        # Temporal retries ONLY step 3, not the whole workflow
        result = await workflow.execute_activity(
            post_to_zoho_books,
            invoice_data,
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )
        return {
            "status": "success",
            "invoice_number": invoice_data.get("invoice_number"),
            "total": invoice_data.get("total"),
            "zoho_result": result,
        }

# ── Worker ────────────────────────────────────────────────────────────────
async def run_worker():
    client = await Client.connect("localhost:7233")
    async with Worker(
        client,
        task_queue="invoice-processing",
        workflows=[InvoiceProcessingWorkflow],
        activities=[download_and_extract_pdf, extract_invoice_fields, post_to_zoho_books],
    ):
        print("Worker running. Processing invoice queue...")
        await asyncio.Event().wait()  # Run forever

# ── Trigger a workflow ─────────────────────────────────────────────────────
async def process_invoice(pdf_url: str) -> dict:
    client = await Client.connect("localhost:7233")
    # hash() is salted per process for strings, so use a stable digest instead
    url_digest = hashlib.sha256(pdf_url.encode("utf-8")).hexdigest()[:16]
    result = await client.execute_workflow(
        InvoiceProcessingWorkflow.run,
        pdf_url,
        id=f"invoice-{url_digest}",  # Deterministic ID for idempotency
        task_queue="invoice-processing",
    )
    return result

# Run (start the worker first, in a separate process: asyncio.run(run_worker()))
if __name__ == "__main__":
    print(asyncio.run(process_invoice("https://example.com/invoice-001.pdf")))
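The JSON handling inside extract_invoice_fields is the part most likely to break on real model output, and it can be pulled into a pure function that is testable without calling Claude. The following is a minimal sketch under two assumptions: the model returns either bare JSON or JSON inside a markdown code fence, and parse_llm_json is a hypothetical helper name rather than anything from the Anthropic or Temporal SDKs.

```python
import json
import re

# Matches a markdown code fence, with or without a "json" language tag
FENCE_PATTERN = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def parse_llm_json(raw: str) -> dict:
    """Parse a JSON object from model output, tolerating a markdown code fence."""
    match = FENCE_PATTERN.search(raw)
    candidate = match.group(1) if match else raw
    try:
        parsed = json.loads(candidate)
    except json.JSONDecodeError as err:
        # Surface a plain ValueError so callers have one error type to handle
        raise ValueError(f"Model output is not valid JSON: {err}") from err
    if not isinstance(parsed, dict):
        raise ValueError("Expected a JSON object at the top level")
    return parsed


fenced = 'Here you go:\n```json\n{"invoice_number": "INV-001", "total": 1180}\n```'
bare = '{"invoice_number": "INV-002", "total": 500}'

print(parse_llm_json(fenced)["invoice_number"])
print(parse_llm_json(bare)["total"])
```

Keeping the parser separate from the activity means the network call and the parsing rules can fail, and be tested, independently.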
Why the workflow ID matters for idempotency
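The id passed to execute_workflow is critical: it must come out the same every time the same PDF URL is submitted (webhook retries, accidental double-click). Python's built-in hash() is salted per process for strings, so hash(pdf_url) changes between runs; derive the ID from a stable digest such as hashlib.sha256 instead. With a stable ID, Temporal refuses to start a second execution while the first is still running. Once the first run has closed, the default ID reuse policy allows a new run under the same ID, so pass id_reuse_policy=WorkflowIDReusePolicy.REJECT_DUPLICATE (from temporalio.common) if a resubmission must never create a second run. A rejected start raises WorkflowAlreadyStartedError rather than returning the earlier result; fetch that result through a handle to the existing workflow. Your accounting system then gets one entry, not two.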
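A stable workflow ID can be derived with the standard library alone. This is a sketch, assuming the PDF URL is the right deduplication key; workflow_id_for is a hypothetical helper name, and truncating the digest to 16 hex characters is an arbitrary choice for readability.

```python
import hashlib


def workflow_id_for(pdf_url: str) -> str:
    """Derive a workflow ID that is identical across processes and restarts."""
    digest = hashlib.sha256(pdf_url.strip().encode("utf-8")).hexdigest()
    return f"invoice-{digest[:16]}"


first = workflow_id_for("https://example.com/invoice-001.pdf")
second = workflow_id_for("https://example.com/invoice-001.pdf")
other = workflow_id_for("https://example.com/invoice-002.pdf")

print(first == second)  # same URL, same ID
print(first == other)   # different URL, different ID
```

If the same invoice can arrive under different URLs (signed links, query parameters), a business key such as the vendor plus invoice number may be a better input than the URL.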
Retry policies explained
RetryPolicy(
    maximum_attempts=5,                        # Total attempts (including the first)
    initial_interval=timedelta(seconds=2),     # Wait after first failure
    backoff_coefficient=2.0,                   # Double the wait each retry: 2s, 4s, 8s, 16s
    maximum_interval=timedelta(seconds=60),    # Cap the wait at 60s
    non_retryable_error_types=["ValueError"],  # Don't retry these
)
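non_retryable_error_types is important because some errors aren't transient. If Claude returns garbage output that fails JSON parsing, retrying the same parse won't help. The Python SDK reports an exception under its class name and the list is matched against that name, so json.JSONDecodeError is not covered by "ValueError" even though it subclasses it. Catch it and raise a plain ValueError for logical failures so they fail fast, and keep transient failures such as HTTP 5xx responses on a different exception type so they are still retried.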
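The wait schedule a policy produces is easy to work out by hand, and a few lines of Python make it explicit. This is a sketch of the arithmetic only, not Temporal's code: retry_waits is a hypothetical helper, it assumes maximum_attempts is at least 1, and it assumes the cap defaults to 100 times the initial interval when maximum_interval is not set.

```python
from datetime import timedelta
from typing import Optional


def retry_waits(
    maximum_attempts: int,
    initial_interval: timedelta,
    backoff_coefficient: float = 2.0,
    maximum_interval: Optional[timedelta] = None,
) -> list[float]:
    """Return the waits, in seconds, between consecutive attempts."""
    # Assumption: with no explicit cap, the cap is 100x the initial interval
    cap = maximum_interval or initial_interval * 100
    waits = []
    wait = initial_interval
    for _ in range(maximum_attempts - 1):  # no wait after the final attempt
        waits.append(min(wait, cap).total_seconds())
        wait = wait * backoff_coefficient
    return waits


pdf_policy = retry_waits(5, timedelta(seconds=2), 2.0, timedelta(seconds=60))
claude_policy = retry_waits(3, timedelta(seconds=30))
capped_policy = retry_waits(8, timedelta(seconds=2), 2.0, timedelta(seconds=60))

print(pdf_policy)     # waits for the PDF download activity
print(claude_policy)  # waits for the Claude extraction activity
print(capped_policy)  # shows the 60s cap taking effect
```

Summing a schedule gives the worst-case delay before an activity fails for good, which is a useful number to compare against any deadline the workflow has.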
Temporal vs alternatives
| | Temporal | Celery | Inngest | None (manual) |
|---|---|---|---|---|
| Survives worker crash | Yes | No | Yes | No |
| Exactly-once | Yes | No | Yes | No |
| Self-hostable | Yes | Yes | No (SaaS) | N/A |
| Language support | Many | Python | JS/TS | Any |
| Complexity | High | Medium | Low | None |
| Best for | Long, complex workflows | Simple task queues | Serverless/JS | Scripts |
Temporal — right for long-running agent workflows with many steps, real-world side effects, and strict reliability requirements. Heavy to operate.
Inngest — if you're already on Vercel/Railway/serverless and your stack is TypeScript. Event-driven, no server to manage.
Celery — fine for background task queues (send email, resize image). Not for workflows where partial state matters.
Self-hosting Temporal on a VPS
For Indian deployments, a Hostinger KVM 4 (4 vCPU, 8GB RAM, ₹2,000/month) handles Temporal for moderate workloads (a few hundred workflow executions/day):
# Install Docker + Compose
curl -fsSL https://get.docker.com | sh
# Download Temporal docker-compose
git clone https://github.com/temporalio/docker-compose.git
cd docker-compose
docker compose up -d  # the get.docker.com script installs Compose as a plugin, not as docker-compose
# Temporal UI: http://your-vps-ip:8080
# Temporal server: your-vps-ip:7233
The production checklist post covers the monitoring and alerting you'll want alongside Temporal — especially for catching LLM calls that exhaust their retries and permanently fail.



