Asking an AI to "respond in JSON" seems simple. In practice, it's one of the most common sources of production bugs in AI-integrated applications. The model sometimes forgets, produces trailing commas, nests fields differently, or wraps the JSON in markdown code fences.
The solution isn't a better prompt — it's using the right API feature for the job.
## Why Prompting for JSON Isn't Reliable
When you include "respond in JSON format" in your prompt, you're relying on the model's instruction-following, not a constraint mechanism. The model can:
- Forget the instruction mid-response
- Add introductory text before the JSON
- Produce valid JSON that doesn't match your schema
- Include fields you didn't ask for
- Omit required fields
- Use slightly different key names than expected
These failures happen infrequently — maybe 2–5% of calls — which is exactly often enough to break a production pipeline silently.
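These failure modes are why so many teams end up writing hand-rolled defensive parsing. A minimal sketch of that kind of cleanup code (the helper name is illustrative):

```python
import json
import re

def parse_model_json(raw: str) -> dict:
    """Best-effort parse of JSON a model may have fenced or prefixed with prose."""
    text = raw.strip()
    # Strip markdown code fences if present
    fence = re.match(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    # Fall back to the first {...} span if there is leading text
    if not text.startswith("{"):
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            text = text[start:end + 1]
    return json.loads(text)  # still raises on truly malformed output
```

Code like this papers over the symptoms; the API features below remove the cause.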
## OpenAI: Structured Outputs With Schema Enforcement
OpenAI's `response_format` with `json_schema` is the strongest option available. Set `"strict": true` and the model cannot produce output that violates the schema.
```python
from openai import OpenAI
import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Extract job posting information from the provided text."
        },
        {
            "role": "user",
            "content": "Senior Software Engineer at Stripe. $180k-$220k. Remote. 5+ years experience required. Python, Go, or Java."
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "job_posting",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "title": {
                        "type": "string",
                        "description": "Job title"
                    },
                    "company": {"type": "string"},
                    "salary_range": {
                        "type": "object",
                        "properties": {
                            "min": {"type": "number"},
                            "max": {"type": "number"},
                            "currency": {"type": "string"}
                        },
                        "required": ["min", "max", "currency"],
                        "additionalProperties": False
                    },
                    "remote": {"type": "boolean"},
                    "required_skills": {
                        "type": "array",
                        "items": {"type": "string"}
                    },
                    "experience_years": {
                        "type": ["number", "null"],
                        "description": "Minimum years of experience required. Null if not specified."
                    }
                },
                # Strict mode requires every property to be listed here;
                # express optionality with ["type", "null"] unions instead
                "required": ["title", "company", "salary_range", "remote", "required_skills", "experience_years"],
                "additionalProperties": False
            }
        }
    }
)

# Guaranteed to be valid JSON matching the schema
result = json.loads(response.choices[0].message.content)
print(result["salary_range"])  # Always present, always correct shape
```
Key schema design patterns:

- Set `"additionalProperties": False` to prevent unexpected fields
- Use `"required"` arrays to declare all non-optional fields
- Use `["type", "null"]` unions for truly optional values
- Keep schemas flat when possible; deeply nested schemas are harder to reason about
JSON mode (weaker option):
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    response_format={"type": "json_object"}
)
```
This guarantees syntactically valid JSON but not schema conformance. Whenever you have a known output structure, use structured outputs with `json_schema` instead.
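Because JSON mode only guarantees syntax, it pays to validate the shape yourself before trusting the payload. A stdlib-only sketch (the field names and helper are illustrative; a library like `jsonschema` does this more thoroughly):

```python
import json

# Hypothetical example: expected fields and their Python types
REQUIRED_FIELDS = {"title": str, "company": str, "remote": bool}

def shape_problems(data: dict, required: dict) -> list:
    """Return a list of problems; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in required.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for {field}: {type(data[field]).__name__}")
    return problems

# JSON mode guarantees this parses, but not that `remote` is a boolean:
payload = json.loads('{"title": "Engineer", "company": "Stripe", "remote": "yes"}')
print(shape_problems(payload, REQUIRED_FIELDS))
```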
## Anthropic: Claude Without Native Schema Enforcement
Claude doesn't have a schema-enforcement API parameter, but it follows format instructions reliably enough for most production use cases.
The most reliable approach with Claude:
```python
import anthropic
import json

client = anthropic.Anthropic()

system_prompt = """Extract job posting information and return it as JSON.
Return ONLY the JSON object, no other text. Use exactly this structure:

{
  "title": string,
  "company": string,
  "salary_min": number or null,
  "salary_max": number or null,
  "remote": boolean,
  "required_skills": [string],
  "experience_years": number or null
}

Rules:
- salary_min and salary_max: use null if not mentioned
- experience_years: minimum years required, null if not specified
- required_skills: list of technologies, languages, or tools explicitly mentioned
"""

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=system_prompt,
    messages=[
        {
            "role": "user",
            "content": "Senior Software Engineer at Stripe. $180k-$220k. Remote. 5+ years experience required. Python, Go, or Java."
        }
    ]
)

try:
    result = json.loads(response.content[0].text)
except json.JSONDecodeError:
    # Handle the rare case where output isn't valid JSON
    # (Consider a retry or a fallback extraction)
    pass
```
For higher reliability with Claude, add a post-processing check:
```python
def extract_with_claude(text: str, max_retries: int = 2) -> dict:
    for attempt in range(max_retries + 1):
        response = client.messages.create(...)
        try:
            result = json.loads(response.content[0].text)
        except json.JSONDecodeError:
            if attempt == max_retries:
                raise
            continue
        # Validate required keys before accepting the result
        required_keys = ["title", "company", "remote", "required_skills"]
        if all(k in result for k in required_keys):
            return result
    # Fail loudly rather than returning a partial result
    raise ValueError("Model output never matched the expected keys")
```
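A variation worth considering: instead of retrying blind, feed the parse error back to the model so the next attempt can correct its own output. A sketch with the API call injected as a plain callable so the loop itself is testable (in production, `call_model` would wrap `client.messages.create`; the repair prompt wording is illustrative):

```python
import json

def extract_with_repair(call_model, user_text: str, max_retries: int = 2) -> dict:
    """Repair loop: on a parse failure, show the model its own bad output
    plus the error so the next attempt can fix it.
    `call_model(messages) -> str` wraps the actual API call."""
    messages = [{"role": "user", "content": user_text}]
    for attempt in range(max_retries + 1):
        raw = call_model(messages)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            if attempt == max_retries:
                raise
            # Append the failed output and the error for the next attempt
            messages.append({"role": "assistant", "content": raw})
            messages.append({
                "role": "user",
                "content": f"That was not valid JSON ({err}). "
                           "Return only the corrected JSON object.",
            })
```

This usually converges in one extra round trip, since the model sees exactly what went wrong rather than re-rolling the dice.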
## Google: Gemini JSON Mode
Gemini supports JSON mode via the `response_mime_type` parameter:
```python
import google.generativeai as genai
import json

genai.configure(api_key="your-api-key")

model = genai.GenerativeModel("gemini-2.0-flash")

response = model.generate_content(
    "Extract job posting data from: Senior Software Engineer at Stripe. $180k-$220k. Remote.",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "company": {"type": "string"},
                "remote": {"type": "boolean"}
            }
        }
    )
)

result = json.loads(response.text)
```
Gemini also validates output against the `response_schema` in the generation config, but coverage and strictness vary by model version; test your specific use case.
## When to Use Each Approach
| Use Case | Recommended Approach |
|---|---|
| Production pipeline, schema-critical | OpenAI structured outputs (strict: true) |
| Production with Claude | Explicit format prompt + post-processing validation |
| Prototyping / development | JSON mode from any provider |
| Optional/flexible schema | Prompt-based JSON with validation |
| High volume, high reliability | Schema enforcement + retry logic |
## Schema Design Tips
Keep it as flat as possible. Deeply nested schemas are harder for models to conform to and harder to validate.
Use enums for constrained values:
```json
{
  "sentiment": {
    "type": "string",
    "enum": ["positive", "negative", "neutral", "mixed"]
  }
}
```
Make optional fields explicit as nullable:
```json
{
  "email": {
    "type": ["string", "null"],
    "description": "Email address if mentioned, null otherwise"
  }
}
```
Define what to do when data is missing. "If not mentioned, use null" prevents the model from hallucinating values for empty fields.
Test with adversarial inputs. Try inputs where fields are ambiguous, missing, or in unexpected formats. Schema enforcement catches the easy cases; your edge case testing catches the hard ones.
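One way to make that edge-case testing repeatable is a small adversarial sweep. A sketch: `extract_job` is a hypothetical stand-in for whichever extraction function you use, and each case pairs a tricky input with the values the result must still get right:

```python
# Each case: (tricky input, fields the extraction must still get right)
ADVERSARIAL_CASES = [
    # Salary absent: fields must come back null, not hallucinated
    ("Engineer needed at Acme. Competitive pay.", {"salary_min": None, "salary_max": None}),
    # Explicitly on-site: remote must be False
    ("Intern at Acme, on-site only.", {"remote": False}),
    # Experience not stated: must be null
    ("Junior Developer at Acme. Python.", {"experience_years": None}),
]

def run_adversarial_sweep(extract_job) -> list:
    """Run every case and collect (input, field, got) tuples for mismatches."""
    failures = []
    for text, expected in ADVERSARIAL_CASES:
        result = extract_job(text)
        for field, want in expected.items():
            got = result.get(field)
            if got != want:
                failures.append((text, field, got))
    return failures
```

Run the sweep whenever you change the prompt, the schema, or the model; an empty failure list is a cheap regression gate.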
## The Bottom Line
For any production pipeline where you're parsing AI output with code:
- Use structured outputs with schema enforcement if available (OpenAI)
- Use explicit format prompts with post-processing validation otherwise (Claude, Gemini)
- Never rely on "respond in JSON" alone in a prompt: it's just reliable enough to look fine in testing, then fail in production
The extra work upfront on schema design and validation prevents a class of production bugs that are notoriously hard to reproduce and diagnose.
