When people talk about AI bias, they usually mean demographic bias — the model making different assumptions about people based on their gender, race, or nationality. That's real and worth addressing. But it's a narrow slice of the bias problem. LLMs exhibit a whole family of biases that affect output quality and reliability, and most of them can be partially mitigated with the right prompting patterns.
Understanding these biases — and how to test for them — is as important as understanding jailbreaking or prompt injection. Bias doesn't usually break your application dramatically; it just quietly degrades the quality and fairness of outputs in ways that are hard to notice without deliberate testing.
The bias taxonomy
Demographic bias
The model makes different assumptions or gives different quality responses based on demographic signals in the prompt — names, pronouns, nationalities, professions. Classic example: a model that produces stronger job application feedback for "James" than for "Jamal" given identical resumes. Or a model that defaults to gendered pronouns for certain professions.
This bias comes from training data that reflects historical patterns in text. The model learns statistical regularities that encode social biases.
Sycophantic bias
The model agrees with whoever seems more confident or authoritative, even when they're wrong. If you push back on a correct answer, a sycophantic model will back down. If you assert a false claim confidently, it will tend to agree.
This is particularly dangerous in high-stakes domains. A model helping you evaluate a business plan will be less useful if it inflates confidence every time you express enthusiasm.
Anchoring bias
The model's second answer is influenced by its first answer in ways that aren't always justified. If you ask for a price estimate and the model says $50,000, then ask it to reconsider — it will tend to adjust toward $50,000 rather than reasoning from scratch. The first number serves as an anchor.
Length bias
Longer responses are perceived — by both humans and models — as more authoritative. When a model is evaluating two options, it tends to favor the longer one even when the shorter one is higher quality. When you ask a model to rate two outputs, the longer one gets an unfair advantage.
Position bias
The first item in a list, or option A in a comparison, gets a systematic advantage. If you ask a model to evaluate three marketing slogans and list them A, B, C, it will tend to favor A slightly even when B or C is objectively stronger.
Testing for bias in your prompts
Before mitigation, you need to know if you have a problem. Three testing approaches:
Counterfactual testing: Run the same prompt with a demographic variable swapped. Change "John" to "Mei-Lin", change "he" to "she", change "American" to "Nigerian". If the outputs diverge in quality, specificity, or tone, you have demographic bias.
Original: "Write feedback on this resume for John Smith, applying for a
software engineering role: [resume content]"
Counterfactual: "Write feedback on this resume for Aisha Okonkwo, applying
for a software engineering role: [same resume content]"
Run both, then compare depth, specificity, and implicit assumptions in the feedback.
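This swap can be automated so it runs as a repeatable test rather than a one-off check. A minimal sketch, assuming a placeholder `call_model` function standing in for your actual LLM client:

```python
# Counterfactual testing sketch: hold everything constant except a
# demographic variable, then collect paired outputs for comparison.
# `call_model` is a hypothetical placeholder for a real LLM call.

TEMPLATE = ("Write feedback on this resume for {name}, applying for a "
            "software engineering role: {resume}")

NAMES = ["John Smith", "Aisha Okonkwo", "Mei-Lin Chen"]

def build_counterfactual_prompts(template, names, resume):
    """Return one prompt per name; only the name varies."""
    return {name: template.format(name=name, resume=resume) for name in names}

def run_counterfactual_test(call_model, template, names, resume):
    """Collect outputs keyed by name so they can be compared for depth,
    specificity, and tone (manually or with a scoring function)."""
    prompts = build_counterfactual_prompts(template, names, resume)
    return {name: call_model(prompt) for name, prompt in prompts.items()}
```

The comparison step is deliberately left open: automated length or sentiment scoring catches gross divergence, but subtler differences in tone still need a human read.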
Consistency testing: Run the same prompt multiple times and check variance. High variance often signals the model is making arbitrary choices that could be systematically biased in a deployed context.
Blind evaluation: When evaluating options, strip identifying information before asking the model to assess. If you want to evaluate which of two code snippets is cleaner, present them without indicating which approach each represents.
Mitigation prompt patterns
Role-neutral prompts
Don't let the model fill in demographic blanks. Be explicit about what assumptions it should and shouldn't make.
Before mitigation:
"Write interview questions for a nursing candidate."
After mitigation:
"Write interview questions for a nursing candidate. Use gender-neutral
language throughout. Do not make assumptions about the candidate's background,
age, or experience level beyond what would be standard for the role."
Perspective-balancing instructions
For analysis tasks, explicitly ask for balanced treatment across relevant groups or perspectives.
Before mitigation:
"Analyze the economic impacts of this trade policy."
After mitigation:
"Analyze the economic impacts of this trade policy from the perspective of
at least three different stakeholder groups — including both those who would
benefit and those who would be harmed. Give equivalent depth to each group's
perspective."
Anti-sycophancy prompts
This is one of the most practically useful mitigation patterns. Explicitly instruct the model to maintain positions under pressure and distinguish genuine reconsideration from capitulation.
Important instruction: If I disagree with your analysis or push back on
your conclusions, do not change your answer simply because I expressed
disagreement. Only update your position if I provide a new argument or
evidence that genuinely warrants reconsideration. If my pushback doesn't
contain new information, maintain your original position and explain why.
For evaluation tasks specifically:
Rate these two options and give your honest assessment. If I tell you that
option A was created by an expert, do not let that change your evaluation —
judge based on the content alone.
Blind evaluation prompts
When comparing or evaluating options, anonymize them to remove position and identity bias.
Before mitigation:
"Which of these two cover letters is stronger? Option A: [letter] Option B: [letter]"
After mitigation:
"I'm going to give you two pieces of text labeled X and Y. Evaluate each
independently on these criteria: clarity, specificity, and relevance to the role.
Then tell me which is stronger and why.
Text X: [letter A]
Text Y: [letter B]"
Shuffling which letter appears as X vs Y across test runs also helps you check for position bias.
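That shuffle can be made mechanical. A minimal sketch that builds the same comparison prompt twice with the X/Y assignment swapped:

```python
def blind_comparison_prompts(letter_a, letter_b):
    """Build the comparison prompt twice with the X/Y assignment swapped.
    If the model picks the same underlying letter under both orderings,
    the preference is less likely to be position bias."""
    header = ("I'm going to give you two pieces of text labeled X and Y. "
              "Evaluate each independently on these criteria: clarity, "
              "specificity, and relevance to the role. "
              "Then tell me which is stronger and why.\n")
    forward = f"{header}Text X: {letter_a}\nText Y: {letter_b}"
    swapped = f"{header}Text X: {letter_b}\nText Y: {letter_a}"
    return forward, swapped
```

Run both prompts and check whether the verdict tracks the content or the label.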
Diversity-of-examples instruction
When asking for examples, the model's defaults often skew toward the most statistically common representations in its training data. Counteract this explicitly.
Before mitigation:
"Give me 5 examples of successful entrepreneurs."
After mitigation:
"Give me 5 examples of successful entrepreneurs. Deliberately include
diversity across geography (not just the US), industry, time period, and
demographics. Avoid defaulting to the most well-known names."
Calibrated confidence prompts
Models often express uniform high confidence regardless of actual uncertainty. Ask for explicit calibration.
For each claim in your analysis, indicate your confidence level:
- High: well-established fact or strong evidence
- Medium: reasonable inference but uncertainty exists
- Low: speculative or limited information
Do not hedge everything uniformly — distinguish what you know well from
what you're less certain about.
When prompting isn't enough
Prompting can reduce bias; it can't eliminate it. Some biases are structural — baked into the model's weights through training data — and no prompt will fully override them.
The cases where prompting alone is insufficient:
High-stakes demographic decisions: If you're using an LLM to screen job applications, evaluate loan applications, or make any decision with significant impact on people's lives, prompt-level mitigation is not an adequate safeguard. You need human review, auditing, and probably shouldn't be using a general-purpose LLM for this task at all.
Deep cultural knowledge gaps: A model trained primarily on English-language text will have systematic gaps in knowledge and perspective about non-English-speaking cultures. Prompting can help you get more balanced outputs, but the underlying knowledge asymmetry is structural.
Subtle statistical biases in generation: Things like which names get associated with competence in generated stories, or which neighborhoods get described as "up-and-coming" vs. "troubled" — these show up in aggregate patterns across many outputs, not in any individual response. You'd need systematic output auditing to detect them, not just better prompting.
Evaluation tasks at scale: If you're using an LLM as an automated judge in a pipeline — evaluating hundreds of outputs — its position bias and length bias will systematically skew your results. You need to design your evaluation setup to counteract this (randomize option order, standardize length, evaluate criteria separately).
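The order-randomization part of that design can be sketched concretely. Here `call_judge` is a hypothetical placeholder for an LLM judge that returns "first" or "second"; the wrapper randomizes presentation order and maps the verdict back to the original items:

```python
import random

def judge_pair(call_judge, item_a, item_b, rng):
    """Present a pair to an LLM judge in random order and map the verdict
    back to the original items, so position bias averages out across a
    pipeline. `call_judge(first, second)` is assumed to return "first"
    or "second"."""
    flipped = rng.random() < 0.5
    first, second = (item_b, item_a) if flipped else (item_a, item_b)
    verdict = call_judge(first, second)
    if verdict == "first":
        return item_b if flipped else item_a
    return item_a if flipped else item_b

# Usage sketch with a seeded RNG for reproducible runs:
# winner = judge_pair(my_judge, output_1, output_2, random.Random(42))
```

This handles order; length bias still needs separate treatment, such as instructing the judge to score criteria independently of length or truncating inputs to comparable sizes.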
Building bias checks into your workflow
For any prompt that touches demographic information or comparative evaluation, build in a quick self-check:
Before giving your final response, check:
1. Have I made any assumptions about this person's background, identity,
or characteristics that weren't stated in the prompt?
2. Have I applied consistent standards to all parties mentioned?
3. Is my confidence level calibrated, or am I expressing more certainty
than the evidence warrants?
If any of these checks flag an issue, revise before responding.
This won't catch everything — models aren't perfectly self-aware about their own biases — but it does meaningfully reduce obvious demographic assumptions and inconsistent treatment.
Bias mitigation is an ongoing practice, not a one-time fix. As you build with LLMs, run counterfactual tests periodically, especially when prompts are updated or the underlying model changes. What's calibrated today may shift with a model update.