If you've been exploring prompt engineering for a while, you've probably hit the ceiling of what raw API calls can do cleanly. You're managing prompts in strings, writing manual retry logic, stitching outputs together with duct tape.
LangChain exists to solve that. It's a Python framework that gives you composable building blocks for LLM applications — prompt templates, output parsers, chains, retrievers, memory, and agents — so you spend less time on plumbing and more time on logic.
This guide covers the core concepts with real, runnable code. No fluff.
## Installation and Setup

```bash
pip install langchain langchain-openai langchain-anthropic
```

Set your API keys:

```python
import os

os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
```

Or use a .env file with python-dotenv:

```bash
pip install python-dotenv
```

```python
from dotenv import load_dotenv

load_dotenv()  # reads .env automatically
```
## Core Concept 1: Chat Models

LangChain wraps LLM providers into a consistent interface. You swap providers by changing one import.

```python
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# OpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Anthropic (same interface)
llm = ChatAnthropic(model="claude-opus-4-6", temperature=0)

# Simple invocation
response = llm.invoke("What is prompt engineering in one sentence?")
print(response.content)
```

The `invoke()` method returns an `AIMessage` object; `.content` gives you the string.

For streaming (displaying tokens as they arrive):

```python
for chunk in llm.stream("Write a haiku about APIs."):
    print(chunk.content, end="", flush=True)
```
## Core Concept 2: Prompt Templates

Hard-coding prompts in strings is fragile. Prompt templates let you parameterise them.

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Answer concisely."),
    ("user", "{question}"),
])

# Format the template
messages = prompt.format_messages(
    domain="machine learning",
    question="What is overfitting?",
)

response = llm.invoke(messages)
print(response.content)
```
### Few-Shot Prompt Templates

```python
from langchain_core.prompts import FewShotChatMessagePromptTemplate

examples = [
    {"input": "The delivery was late", "output": "Negative"},
    {"input": "Best product I've ever bought!", "output": "Positive"},
    {"input": "Works fine, nothing special", "output": "Neutral"},
]

example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}"),
])

few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

final_prompt = ChatPromptTemplate.from_messages([
    ("system", "Classify the sentiment of the review."),
    few_shot_prompt,
    ("human", "{input}"),
])

chain = final_prompt | llm
result = chain.invoke({"input": "Exceeded my expectations."})
print(result.content)  # "Positive"
```
## Core Concept 3: LCEL and the Pipe Operator

LCEL (LangChain Expression Language) lets you compose components with `|`, just like Unix pipes. This is the modern way to build LangChain pipelines.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = (
    ChatPromptTemplate.from_template("Summarise this in one sentence: {text}")
    | llm
    | StrOutputParser()  # extracts .content as a plain string
)

result = chain.invoke({"text": "LangChain is a framework for building LLM applications..."})
print(result)  # plain string, no AIMessage wrapper
```
### Chaining Multiple Steps

```python
summarise_prompt = ChatPromptTemplate.from_template(
    "Summarise this text in 2 sentences: {text}"
)
translate_prompt = ChatPromptTemplate.from_template(
    "Translate this to French: {summary}"
)
parser = StrOutputParser()

# Step 1: summarise the input text
# Step 2: feed the summary into the translation prompt
chain = (
    {"summary": summarise_prompt | llm | parser}
    | translate_prompt
    | llm
    | parser
)

result = chain.invoke({"text": "Your long English document here..."})
print(result)  # French summary
```
## Core Concept 4: Output Parsers

Getting raw strings back from the LLM works fine for simple tasks. For structured data, use output parsers.

### JSON Output Parser

```python
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

parser = JsonOutputParser()

prompt = ChatPromptTemplate.from_template(
    """Extract the following from the text and return as JSON:
- name (string)
- email (string)
- company (string)

Text: {text}

{format_instructions}
"""
)

chain = prompt | llm | parser

result = chain.invoke({
    "text": "Hi, I'm Chetan Rakheja from MasterPrompting. Reach me at chetan@masterprompting.net",
    "format_instructions": parser.get_format_instructions(),
})
print(result)
# {'name': 'Chetan Rakheja', 'email': 'chetan@masterprompting.net', 'company': 'MasterPrompting'}
```
### Pydantic Output Parser (type-safe)

```python
from typing import List

from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class BlogPost(BaseModel):
    title: str = Field(description="The blog post title")
    tags: List[str] = Field(description="3-5 relevant tags")
    summary: str = Field(description="One sentence summary")
    estimated_read_time: int = Field(description="Estimated read time in minutes")

parser = PydanticOutputParser(pydantic_object=BlogPost)

prompt = ChatPromptTemplate.from_template(
    "Generate metadata for a blog post about: {topic}\n\n{format_instructions}"
)

chain = prompt | llm | parser

result = chain.invoke({
    "topic": "Building RAG pipelines with LangChain",
    "format_instructions": parser.get_format_instructions(),
})
print(result.title)                # e.g. "Building Production RAG Pipelines with LangChain"
print(result.tags)                 # e.g. ['RAG', 'LangChain', 'vector-search', 'Python', 'LLM']
print(result.estimated_read_time)  # e.g. 8
```
## Core Concept 5: Retrieval-Augmented Generation (RAG)

RAG lets you feed your own documents to the LLM at query time. The basic pattern:

- Split documents into chunks
- Embed the chunks into a vector store
- At query time, retrieve the most relevant chunks
- Pass them as context to the LLM

```python
# Extra dependencies for this example: pip install langchain-community faiss-cpu
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load and split the document
loader = TextLoader("my_docs.txt")
docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 2. Embed and store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# 3. Build the RAG chain
prompt = ChatPromptTemplate.from_template(
    """Answer the question using only the context below.
If the answer isn't in the context, say "I don't know."

Context: {context}

Question: {question}
"""
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What are the main benefits of LangChain?")
print(answer)
```
## Core Concept 6: Memory (Conversation History)

For chatbots or multi-turn conversations, you need to pass history back to the model.

```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

chain = prompt | llm | StrOutputParser()

# Session store (in-memory; swap for Redis in production)
store = {}

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chain_with_memory = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

config = {"configurable": {"session_id": "user_123"}}

# Turn 1
r1 = chain_with_memory.invoke({"input": "My name is Chetan."}, config=config)
print(r1)  # e.g. "Nice to meet you, Chetan!"

# Turn 2 -- the model remembers the previous turn
r2 = chain_with_memory.invoke({"input": "What's my name?"}, config=config)
print(r2)  # e.g. "Your name is Chetan."
```
## Parallel Execution with RunnableParallel

Run multiple chains concurrently and merge the results:

```python
from langchain_core.runnables import RunnableParallel

topic = "prompt engineering"

chain = RunnableParallel(
    pros=ChatPromptTemplate.from_template("List 3 pros of {topic}") | llm | StrOutputParser(),
    cons=ChatPromptTemplate.from_template("List 3 cons of {topic}") | llm | StrOutputParser(),
    definition=ChatPromptTemplate.from_template("Define {topic} in one sentence") | llm | StrOutputParser(),
)

results = chain.invoke({"topic": topic})
# results is a dict: {"pros": "...", "cons": "...", "definition": "..."}
# All 3 LLM calls ran concurrently
print(results["definition"])
```
## When to Use LangChain vs Raw API Calls
| Scenario | Use |
|---|---|
| Single LLM call, simple prompt | Raw API call |
| Reusable prompt templates | LangChain |
| Chaining 2+ LLM steps | LangChain LCEL |
| Structured output with validation | LangChain + Pydantic |
| RAG over your documents | LangChain |
| Multi-turn conversation with memory | LangChain |
| Complex agent with loops and branching | LangGraph (see next post) |
## Key Takeaways

- LCEL (the `|` operator) is the modern way to build LangChain pipelines: composable, streamable, async-compatible
- Pydantic parsers give you type-safe structured output without fragile string parsing
- RAG is the practical solution when your LLM needs access to private or up-to-date information
- RunnableParallel runs independent chains concurrently, with no extra threading needed
LangChain handles the plumbing so you can focus on what the chain actually does. Once you outgrow linear pipelines and need loops, conditional logic, or agent state — that's where LangGraph comes in.