The Problem ReAct Solves
Imagine an agent that jumps straight to actions without thinking:
User: Find the population of Tokyo and compare it to New York.
Agent: [calls search("Tokyo population")]
Agent: [calls search("New York population")]
Agent: Tokyo has 13.96 million people. New York has 8.34 million.
This looks fine — but what if a search returns an ambiguous result? What if the Tokyo query returns the metro-area population while the New York query returns the city proper? The agent would compare mismatched figures and give a confidently wrong answer.
ReAct prevents this by requiring the model to reason before each action — catching these issues before they compound.
The ReAct Pattern
ReAct structures every agent turn as three explicit elements:
Thought: [The model's reasoning about what to do next and why]
Action: [The specific tool call or action to take]
Observation: [The result of the action]
This repeats until a final answer is reached:
Thought: I have enough information. I'll compile the answer now.
Final Answer: [response to the user]
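The loop above can be driven by a small harness. Here is a minimal sketch, where `call_model` and `run_tool` are hypothetical stand-ins for your LLM client and tool dispatcher (not any specific API):

```python
import re

def react_loop(call_model, run_tool, task, max_turns=10):
    """Drive a Thought -> Action -> Observation loop until Final Answer.

    `call_model` and `run_tool` are assumptions: the first returns the
    model's next Thought + Action (or Final Answer) given the transcript,
    the second executes an action string and returns its result.
    """
    transcript = f"Task: {task}\n"
    for _ in range(max_turns):
        step = call_model(transcript)
        transcript += step + "\n"
        # A Final Answer ends the loop
        answer = re.search(r"Final Answer:\s*(.*)", step, re.S)
        if answer:
            return answer.group(1).strip()
        # Otherwise, run the Action and feed the Observation back
        action = re.search(r"Action:\s*(.*)", step)
        if action:
            observation = run_tool(action.group(1).strip())
            transcript += f"Observation: {observation}\n"
    return "Stopped: turn limit reached without a Final Answer."
```

The transcript grows each turn, so the model always sees its prior Thoughts and Observations when deciding the next step.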
ReAct in Action: A Full Example
Task: What is the GDP per capita of the top 3 countries by GDP, and how do they compare to the global average?
Thought: I need to find (1) the top 3 countries by GDP, (2) their GDP per capita, and
(3) the global average GDP per capita. Let me start by finding the top 3 countries.
Action: search_web("top 3 countries by GDP 2025")
Observation: "1. United States ($29.4T), 2. China ($18.5T), 3. Germany ($4.7T)"
Thought: Good. Now I need the GDP per capita for each of these countries. GDP per capita
requires dividing GDP by population, or I can search for it directly.
Action: search_web("GDP per capita United States China Germany 2025")
Observation: "USA: $88,400 | China: $13,100 | Germany: $56,100"
Thought: Now I need the global average GDP per capita to make the comparison meaningful.
Action: search_web("global average GDP per capita 2025")
Observation: "World Bank estimate: approximately $13,800 globally"
Thought: I now have all the data I need:
- USA: $88,400 (6.4x the global average)
- Germany: $56,100 (4.1x the global average)
- China: $13,100 (roughly equal to the global average)
- Global average: $13,800
I can now write a clear, accurate comparison.
Final Answer: Among the world's three largest economies by total GDP...
Notice how each Thought step checks the previous Observation for issues before proceeding. This is what prevents cascading errors.
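The multiples in the final Thought are plain arithmetic, which you can verify directly (figures copied from the example observations above):

```python
# GDP per capita figures from the example's observations
gdp_per_capita = {"USA": 88_400, "Germany": 56_100, "China": 13_100}
global_avg = 13_800

# Ratio of each country's GDP per capita to the global average
ratios = {country: round(value / global_avg, 1)
          for country, value in gdp_per_capita.items()}
# USA -> 6.4, Germany -> 4.1, China -> 0.9 (roughly equal to average)
```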
Implementing ReAct in System Prompts
To enable ReAct behavior, instruct the model explicitly in your system prompt:
You are a research assistant with access to web search.
For every task, follow this reasoning format:
- Thought: Explain your reasoning and what you plan to do next
- Action: Call one of your available tools
- Observation: [This will be filled with the tool result]
Continue the Thought → Action → Observation cycle until you have sufficient
information to answer confidently. Then write:
- Final Answer: Your complete response
Important rules:
- Always write a Thought before every Action
- Check each Observation carefully before proceeding
- If a result seems incomplete or ambiguous, search again with a refined query
- Only write Final Answer when you are confident in your response
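One practical detail this prompt implies: your harness must parse the model's Action lines. A minimal sketch, assuming actions arrive in a `tool_name("argument")` form (that call syntax is an assumption about your format, not part of ReAct itself):

```python
import re

# Matches lines like: Action: search_web("some query")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\((.*)\)\s*$', re.M)

def parse_action(model_output: str):
    """Extract (tool_name, argument) from a ReAct-style Action line.

    Returns None when no Action line is present, e.g. on a
    Final Answer turn.
    """
    match = ACTION_RE.search(model_output)
    if match is None:
        return None
    tool = match.group(1)
    arg = match.group(2).strip().strip('"\'')
    return tool, arg
```

In practice many frameworks sidestep this parsing by using native tool-calling APIs, but the underlying Thought/Action structure is the same.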
Variations of ReAct
Standard ReAct
Classic Thought → Action → Observation loop, as described above.
ReAct with Self-Reflection
After each Observation, the model also evaluates whether the result was useful:
Thought: The search returned many results about the wrong "Mercury" (the planet, not the element).
I need to refine my query.
Action: search_web("Mercury element properties chemistry NOT planet")
ReAct with Planning
The model creates a brief plan at the start before entering the loop:
Thought (planning): To answer this question I'll need to:
1. Find the top 3 companies
2. Get revenue for each
3. Get their founding dates
4. Calculate revenue CAGR
Let me start with step 1.
Action: search_web(...)
Minimal ReAct
For simpler agents, you can use just one-line thoughts to reduce verbosity while keeping the reasoning structure:
Thought: Need current price data.
Action: search("AAPL stock price today")
Why Explicit Reasoning Works
The cognitive benefit of writing thoughts before acting is well-documented in the research on chain-of-thought prompting. For agents specifically, explicit reasoning:
1. Prevents premature action The model considers whether it has enough information before acting, reducing unnecessary tool calls.
2. Surfaces assumptions Writing "I assume this query will return city-level data, not country-level" makes the assumption visible — and catchable.
3. Enables self-correction After an Observation, the model's next Thought often naturally identifies problems: "The result is for 2022, but I need current data."
4. Makes debugging tractable When an agent gives a wrong answer, you can trace back through the Thought logs to find exactly where the reasoning went wrong.
Debugging a ReAct Agent
When your agent goes wrong, the Thought logs are your primary debugging tool.
Common failure patterns:
| Pattern | What you'll see in Thoughts | Fix |
|---|---|---|
| Wrong tool called | "I'll search for this" when it should compute | Improve tool descriptions |
| Stops too early | "I have enough info" after only 1 tool call | Add "always verify with a second source" to system prompt |
| Loops indefinitely | Repeating same search with slight variations | Add a turn limit + "if you've searched 3 times without success, explain what you couldn't find" |
| Hallucinated observation | Thought references data not in Observation | This is a model reliability issue — use a stronger model or add explicit instructions to quote from observations |
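For the indefinite-loop pattern in particular, a harness-side guard can complement the prompt-level fix. A sketch, assuming you keep a list of each Action string the agent has issued (the thresholds are illustrative):

```python
from collections import Counter

def should_abort(actions, max_turns=10, max_repeats=3):
    """Harness-side guard for a ReAct loop.

    Aborts when the turn limit is hit, or when any near-identical
    action has been issued `max_repeats` times (a looping agent).
    """
    if len(actions) >= max_turns:
        return True
    # Normalize lightly so trivial variations still count as repeats
    counts = Counter(a.strip().lower() for a in actions)
    return any(n >= max_repeats for n in counts.values())
```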
ReAct vs. Other Agent Patterns
| Pattern | Approach | Best for |
|---|---|---|
| ReAct | Reason → Act → Observe loop | General-purpose agents, debugging, reliability |
| Plan-and-Execute | Full plan upfront, then execute steps | Predictable, well-defined tasks |
| Reflexion | After completing a task, reflect and improve | Tasks where quality iteration matters |
| Tree of Thought | Explore multiple reasoning branches | Complex problems with many possible paths |
ReAct is the default for most agent systems because it's flexible and transparent without requiring you to know the full task structure in advance.
Key Takeaways
- ReAct stands for Reason + Act — write a Thought before every Action
- The full pattern: Thought → Action → Observation, repeated until Final Answer
- Explicit reasoning prevents cascading errors, surfaces assumptions, and enables self-correction
- Implement it by instructing the model in your system prompt
- Thought logs are your primary debugging tool when agents go wrong
- Variations: self-reflection, planning phase, minimal one-line thoughts
- Next lesson: AI workflows vs. agents — knowing when to use each