Risks & Safety

Risks & Safety Track

Build responsibly. These 8 lessons cover the real-world risks of LLM systems — prompt injection, jailbreaking, hallucinations, biases — and how to defend against them.

1
Prompt Injection: The Most Common AI Security Attack
Prompt injection tricks an AI into ignoring its instructions and following malicious commands embedded in user input or external data. Learn how it works and how to defend against it.
5 min read
2
Prompt Leaking: Protecting Your System Prompts
Prompt leaking is when an AI is tricked into revealing its confidential system prompt. Learn why system prompts are hard to fully protect, what you can do, and what you should never put in one.
5 min read
3
Jailbreaking: Techniques, Examples, and Defenses
Jailbreaking bypasses an AI's built-in safety guidelines through creative prompting. Learn the main jailbreak techniques, why they work, and how to make your AI systems more resistant to them.
5 min read
4
Hallucinations Deep Dive: Why AI Confidently Gets Things Wrong
LLMs hallucinate — generating plausible-sounding but false information. Learn why hallucinations happen, which types of content are highest-risk, and practical techniques to minimize them.
5 min read
5
Biases in LLM Outputs: What They Are and How to Reduce Them
LLMs inherit biases from training data, reinforcement feedback, and their own architecture. Learn the main bias types, how they surface in practice, and prompt strategies to reduce their impact.
5 min read
6
Red-Teaming Your Prompts: Stress Test Before You Ship
Red-teaming is the practice of systematically attacking your own AI system to find vulnerabilities before real users do. Learn a practical red-teaming methodology for LLM applications.
6 min read
7
AI Bias Mitigation Prompts
Practical techniques to identify, test for, and reduce bias in LLM outputs — with prompt patterns that produce fairer, more consistent results.
8 min read
8
Responsible AI Agent Design
How to design AI agents that fail safely — with principles and patterns for scope limitation, human oversight, graceful degradation, and audit trails.
9 min read

Prerequisites

This track is most valuable after completing the Intermediate or Advanced track. Some lessons reference concepts like system prompts, chain-of-thought, and RAG.

Review Intermediate Track