Most agent frameworks give the LLM a list of tools and let it call them one at a time. smolagents does something different: it gives the LLM a Python interpreter and lets it write code to solve problems. That sounds like a small difference. It isn't.
HuggingFace released smolagents in late 2024 as a deliberate reaction to the complexity of LangChain and LangGraph. The README says it out loud: "simple agents that work." The entire library is around 1,000 lines of core code. You can read all of it in an afternoon.
The core idea: CodeAgent vs ToolCallingAgent
smolagents has two agent types. The ToolCallingAgent behaves like most other agent frameworks — the LLM selects from a list of tools, calls one, gets the result, decides what to do next. Classic ReAct prompting loop.
The CodeAgent is what makes smolagents interesting. Instead of selecting tools, the LLM writes Python code that solves the task. That code gets executed in a sandboxed interpreter, and the output feeds back into the model's context. The LLM reasons by programming.
Here's what that looks like in practice. Given the task "What's the population of France divided by the area of Germany?", a ToolCallingAgent would call a search tool twice and then do the math. A CodeAgent might write:
france_population = 68_170_000
germany_area_km2 = 357_114
result = france_population / germany_area_km2
print(f"Population density ratio: {result:.2f} people per km²")
And execute it directly. No tool calls. Just code.
This approach handles multi-step reasoning differently. If the model needs to process a list of 50 items, it can write a loop. If it needs to transform data before passing it to the next step, it does that in code. The LLM isn't constrained to calling predefined functions — it can compose arbitrary Python logic.
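To make that concrete, here is the kind of code a CodeAgent might generate for a batch task. This is illustrative, not a real transcript, and search_products stands in for whatever tool you have exposed:

# Hypothetical generated code; search_products is a stand-in tool name
items = search_products("mechanical keyboards")
under_100 = [item for item in items if item["price"] < 100]
cheapest = sorted(under_100, key=lambda item: item["price"])[:3]
print(f"Found {len(under_100)} options under $100; cheapest three: {cheapest}")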
Getting started
Installation is minimal:
pip install smolagents
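If you plan on LiteLLM-routed APIs or in-process transformers models, optional extras pull in the needed dependencies. The extras names below match recent releases; check the README for your version:

pip install "smolagents[litellm]"        # for LiteLLMModel
pip install "smolagents[transformers]"   # for TransformersModel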
A basic CodeAgent with no external tools:
from smolagents import CodeAgent, HfApiModel
model = HfApiModel("meta-llama/Llama-3.3-70B-Instruct")
agent = CodeAgent(tools=[], model=model)
result = agent.run("What is 17 factorial? Show the calculation step by step.")
print(result)
Adding tools follows a simple decorator pattern:
from smolagents import tool, CodeAgent, HfApiModel
@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    Args:
        city: The city name to get weather for.
    """
    # Your actual API call here
    return f"Sunny, 22°C in {city}"
model = HfApiModel("meta-llama/Llama-3.3-70B-Instruct")
agent = CodeAgent(tools=[get_weather], model=model)
result = agent.run("What's the weather in Paris and Tokyo? Which is warmer?")
Both the type hints and the docstring matter here: smolagents parses them to build the tool description the model sees, and the @tool decorator will complain if the Args section doesn't describe every parameter. Write clear docstrings.
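For contrast, the same tool plugs into a ToolCallingAgent unchanged; only the reasoning style differs, with the model emitting structured tool calls instead of code:

from smolagents import ToolCallingAgent

# Same tool, classic one-call-at-a-time loop instead of generated code
agent = ToolCallingAgent(tools=[get_weather], model=model)
result = agent.run("What's the weather in Paris?")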
Using it with different models
smolagents works with HuggingFace Hub models out of the box. Through LiteLLMModel it also works with hosted providers such as Anthropic, OpenAI, and Google, or any OpenAI-compatible endpoint:
from smolagents import CodeAgent, LiteLLMModel
# Works with Claude, GPT-4o, Gemini, or any OpenAI-compatible endpoint
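# Assumes the matching provider key (here ANTHROPIC_API_KEY) is set in your environment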
model = LiteLLMModel("anthropic/claude-sonnet-4-5")
agent = CodeAgent(tools=[], model=model)
For local models, the integration with Ollama works well:
from smolagents import CodeAgent, LiteLLMModel
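# Assumes Ollama is serving locally and the model was pulled first: ollama pull qwen2.5-coder:7b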
model = LiteLLMModel(
    model_id="ollama/qwen2.5-coder:7b",
    api_base="http://localhost:11434",
)
agent = CodeAgent(tools=[], model=model)
This is a real advantage over some frameworks — you're not locked into specific API providers, and the local model story is first-class.
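If you would rather skip the Ollama server entirely, smolagents also ships a TransformersModel that loads the weights in-process via transformers. A minimal sketch, assuming you have the hardware for whichever model you pick:

from smolagents import CodeAgent, TransformersModel

# Downloads and runs the model locally; no server, no API key
# Needs the transformers extra and enough memory for the chosen model
model = TransformersModel(model_id="Qwen/Qwen2.5-Coder-7B-Instruct")
agent = CodeAgent(tools=[], model=model)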
Built-in tools
smolagents ships with a few ready-to-use tools you can import directly:
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool, PythonInterpreterTool
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), PythonInterpreterTool()],
    model=HfApiModel("meta-llama/Llama-3.3-70B-Instruct"),
    add_base_tools=True,  # Also adds the library's small default toolbox
)
result = agent.run(
    "Search for the top 3 Python agent frameworks in 2026 and compare their GitHub stars"
)
DuckDuckGoSearchTool does what it says. PythonInterpreterTool gives the CodeAgent an explicit interpreter tool (though CodeAgent already executes code natively — this is more useful in ToolCallingAgent).
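Search results alone return titles and snippets. If the agent should actually read the pages it finds, add the bundled VisitWebpageTool next to the search tool:

from smolagents import VisitWebpageTool

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,  # any of the model classes shown earlier
)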
How multi-step reasoning works
The agent loop in smolagents is transparent. You can see it working (a simplified code sketch follows this list):
- The model receives the task and generates Python code
- The code runs in a sandboxed interpreter
- The output (stdout, return values) goes back into the model's context
- The model decides if the task is done or generates more code
- Repeat until the task is complete or max steps is reached
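In pseudocode, the loop is roughly this. A simplified sketch, not smolagents' actual source; in the real library the model ends the run by calling a special final_answer() function from its generated code:

# Simplified sketch of the agent loop, not the library's real implementation
memory = [task]
for _ in range(max_steps):
    code = model.generate(memory)         # model writes a Python snippet
    observation = sandbox.execute(code)   # run it, capture stdout and errors
    memory.append(observation)            # results feed the next generation
    if called_final_answer(code):         # model signals completion
        break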
You can set max_steps to control how long it can run:
agent = CodeAgent(
    tools=[search_tool],
    model=model,
    max_steps=10,  # Default is 6
    verbosity_level=2,  # Logs each step; higher values print more detail
)
With verbosity_level raised you see every code block the model generates and every output it receives, and the agent keeps a log of its steps that you can inspect after the run. This is invaluable for debugging. Most frameworks make this harder to inspect.
Understanding the sandboxed execution
The code execution is sandboxed — the model can't import arbitrary modules by default. You control what's available:
agent = CodeAgent(
    tools=[],
    model=model,
    additional_authorized_imports=["pandas", "numpy", "json"],
)
This is a security consideration worth thinking about. If generated code tries an import outside the allowlist, execution fails and the error message goes back to the model as an observation, which usually nudges it toward another approach. If you're running smolagents in a web application, be deliberate about what the model can import. Note also that the built-in local executor restricts imports and builtins but is not a hardened sandbox; for untrusted workloads the docs point to remote executors such as E2B. The default allowlist is conservative.
smolagents vs LangChain/LangGraph
The comparison that comes up most often: why use smolagents when LangChain exists?
LangChain's strength is its ecosystem — hundreds of integrations, extensive documentation, and tooling built around it. Its weakness is complexity. A simple agent in LangChain involves chains, prompts, output parsers, callbacks, and memory objects. There's a lot of framework between you and the thing that's actually running.
smolagents removes most of that. There's no chain concept, no complex memory management, no output parsers. The tool is a Python function with a docstring. The agent is CodeAgent(tools=..., model=...). If it breaks, you can read the source code to understand why.
LangGraph adds a state graph on top of LangChain, which is valuable for production workflows with complex branching. smolagents doesn't have an equivalent — it's not trying to compete on that dimension. For tasks that need stateful graphs with checkpointing and conditional routing, use LangGraph. For everything else, smolagents is often faster to work with.
The function calling lesson covers how the standard tool-calling pattern works underneath both frameworks — understanding that makes it clearer when the CodeAgent approach is an advantage.
When to use smolagents
Good fit:
- Prototypes and research experiments where you want results today
- Tasks involving data processing, calculation, or file manipulation where Python code is natural
- Working with local or open-source models
- Teams that value readable, debuggable code over extensive abstraction
- One-shot or few-shot tasks that don't require persistent state across sessions
Not a good fit:
- Production systems that need checkpointing, retry logic, and fault tolerance
- Complex multi-agent orchestration with dependency graphs between agents
- Long-running workflows that span multiple sessions
- Teams that need the extensive integrations LangChain provides
My workflow: I prototype with smolagents, understand the shape of the problem, then decide if it needs to move to LangGraph for production. Often it doesn't. The prototype turns out to be good enough, and the simplicity of smolagents becomes an asset rather than a limitation.
The HuggingFace team keeps iterating on it quickly — it's one of the faster-moving libraries in the space right now. Worth keeping an eye on.