LangChain Tutorial — Build LLM Applications
LangChain is one of the most popular open-source frameworks for building applications powered by Large Language Models. It simplifies chaining LLM calls, connecting external data, and creating intelligent agents.
What is LangChain?
LangChain is an open-source framework designed to help developers build applications powered by Large Language Models (LLMs). Created by Harrison Chase in late 2022, it has become one of the most widely adopted tools in the LLM application development ecosystem.
At its core, LangChain solves a fundamental problem: while LLMs like GPT-4 and Claude are incredibly powerful on their own, building production applications requires much more than just sending prompts to an API. You need to chain multiple LLM calls together, connect to external data sources, manage conversation memory, handle tool use, and orchestrate complex workflows.
LangChain provides a standardized abstraction layer for all of these tasks. Instead of writing custom integration code for every LLM provider, vector database, and tool, you write against LangChain's interfaces and can swap components freely.
Why LangChain Exists
Before LangChain, building an LLM application meant dealing with raw API calls, manual prompt management, custom parsing logic, and glue code everywhere. Consider a simple RAG (Retrieval-Augmented Generation) pipeline: you need to load documents, split them into chunks, generate embeddings, store them in a vector database, retrieve relevant chunks at query time, format a prompt, call the LLM, and parse the response. That is dozens of lines of boilerplate per component.
LangChain wraps all of this into composable, reusable components. The same RAG pipeline can be built in a few lines of code, and you can swap out the vector database, LLM provider, or document loader without rewriting your application logic.
Core Concepts
LangChain is built around several key abstractions. Understanding these is essential before writing any code.
LLMs and Chat Models
LangChain provides a unified interface for interacting with different LLM providers. The chat model interface (BaseChatModel in langchain-core) is the primary interface for conversational models (GPT-4, Claude, Gemini), while the older LLM interface covers text-completion models. In practice, you will almost always use a chat model.
Each provider has its own integration package: langchain-openai, langchain-anthropic, langchain-google-genai, and so on. The beauty is that once you code against the ChatModel interface, switching providers is a one-line change.
Prompts and Templates
PromptTemplate and ChatPromptTemplate let you define reusable prompts with variable placeholders. Instead of building strings manually, you define a template and fill in variables at runtime. This keeps prompts clean, version-controlled, and easy to test.
Chains (LCEL)
Chains are the core composition unit in LangChain. The LangChain Expression Language (LCEL) is a declarative way to chain components together using the pipe (|) operator. A chain might look like: prompt | model | output_parser. Each component passes its output to the next, forming a pipeline.
LCEL chains support streaming, async execution, and batch processing out of the box. Steps that do not depend on each other can run in parallel, and intermediate steps can be inspected for debugging. This is a significant improvement over the older LLMChain class, which is deprecated in favor of LCEL.
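To make the pipe operator less magical, here is a minimal sketch of how this style of composition can work in plain Python. It is not LangChain's implementation: Step, fill_prompt, fake_model, and extract_text are hypothetical stand-ins for prompt | model | output_parser.

```python
# Toy illustration of pipe-style composition (NOT LangChain's implementation).
from typing import Any, Callable


class Step:
    """Wraps a function so that steps can be chained with the | operator."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Hypothetical stand-ins for a prompt, a model, and an output parser.
fill_prompt = Step(lambda inputs: f"Write a haiku about {inputs['topic']}")
fake_model = Step(lambda prompt: {"content": prompt.upper()})
extract_text = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | extract_text
print(toy_chain.invoke({"topic": "rain"}))
```

Each stage only needs to agree with its neighbors on input and output shape, which is why components can be swapped without touching the rest of the pipeline.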
Document Loaders and Text Splitters
LangChain has over 150 document loaders for different sources: PDFs, websites, databases, APIs, Google Drive, Notion, CSV files, and more. Each loader returns Document objects containing text and metadata.
Text splitters break large documents into smaller chunks suitable for embedding and retrieval. The most common is RecursiveCharacterTextSplitter, which splits by paragraphs, then sentences, then characters, trying to keep semantic units together.
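The idea of falling back from coarse to fine separators can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's algorithm: recursive_split and SEPARATORS are hypothetical names, the sketch drops the separator at each split point, and it does not implement chunk overlap.

```python
from typing import List

# Coarse-to-fine separators: paragraphs, sentences, words, then raw characters.
SEPARATORS = ["\n\n", ". ", " ", ""]


def recursive_split(text: str, chunk_size: int,
                    separators: List[str] = SEPARATORS) -> List[str]:
    """Split text into pieces of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    separator, finer = separators[0], separators[1:]
    if separator == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces together
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is still too large: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph. It has two sentences.\n\nSecond paragraph is here."
print(recursive_split(sample, chunk_size=40))
print(recursive_split(sample, chunk_size=20))
```

With a generous chunk size the paragraphs stay whole; with a smaller one the sketch falls back to sentence and then word boundaries.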
Vector Stores and Retrievers
Vector stores are databases optimized for storing and searching text embeddings. LangChain integrates with Pinecone, Weaviate, Chroma, Qdrant, FAISS, and many more. A Retriever wraps a vector store and provides a simple invoke(query) interface that returns relevant documents.
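As a rough mental model of what a retriever does, the sketch below ranks texts by cosine similarity to the query. Everything here is an assumption for illustration: ToyRetriever is a hypothetical class, and the bag-of-words "embedding" stands in for a real embedding model and vector database.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real systems use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ToyRetriever:
    """Hypothetical in-memory retriever exposing an invoke(query) method."""

    def __init__(self, texts: List[str], k: int = 2):
        self.k = k
        self.index: List[Tuple[str, Counter]] = [(t, embed(t)) for t in texts]

    def invoke(self, query: str) -> List[str]:
        query_vec = embed(query)
        ranked = sorted(self.index,
                        key=lambda item: cosine(query_vec, item[1]),
                        reverse=True)
        return [text for text, _ in ranked[: self.k]]


retriever_demo = ToyRetriever(
    ["cats purr when happy", "dogs bark at strangers", "stock markets fell today"],
    k=1,
)
print(retriever_demo.invoke("why do dogs bark"))
```

A production vector store replaces the linear scan with an approximate nearest-neighbor index, but the interface stays the same: a query goes in, the top k documents come out.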
Memory
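Memory components let your application maintain conversation history across multiple turns. ConversationBufferMemory stores the full conversation, while ConversationSummaryMemory keeps a running summary to save tokens. Memory is passed as context to the LLM so it can reference previous messages. Both classes belong to LangChain's legacy memory API; newer code typically manages history with RunnableWithMessageHistory (used later in this tutorial) or LangGraph persistence.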
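The trade-off between keeping everything and saving tokens can be illustrated without any LLM. In this sketch, ToyBufferMemory is a hypothetical class, not a LangChain one, and a "keep the newest messages" window stands in as a cheap substitute for true summarization, which would require a model call.

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (role, content)


class ToyBufferMemory:
    """Hypothetical stand-in for a conversation buffer; not a LangChain class."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def add_turn(self, user_text: str, ai_text: str) -> None:
        self.messages.append(("user", user_text))
        self.messages.append(("ai", ai_text))

    def as_context(self, last_n: Optional[int] = None) -> str:
        """Render history as prompt text; last_n keeps only the newest messages."""
        if last_n is None:
            selected = self.messages
        else:
            selected = self.messages[max(0, len(self.messages) - last_n):]
        return "\n".join(f"{role}: {content}" for role, content in selected)


memory = ToyBufferMemory()
memory.add_turn("My name is Ada.", "Nice to meet you, Ada.")
memory.add_turn("What is my name?", "Your name is Ada.")

print(memory.as_context())          # full buffer: every message is sent
print(memory.as_context(last_n=2))  # window: only the latest exchange
```

The full buffer grows with every turn, so prompt size and cost grow too. Windowing or summarizing bounds that cost, at the price of forgetting or compressing older details.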
Agents and Tools
Agents use LLMs to decide which actions to take. Instead of hardcoding a workflow, you give the agent a set of tools (functions it can call) and let the LLM decide which tool to use based on the user's request. This is the foundation of autonomous AI applications. We cover agents in detail in our AI Agents Explained guide.
Installation and Setup
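Getting started with LangChain is straightforward. You need Python 3.9 or newer and pip. Recent LangChain releases may require a later Python version, so check the package requirements if installation fails.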
Install LangChain
The core package and provider integrations are installed separately:
# Core LangChain
pip install langchain
# Provider integrations (install what you need)
pip install langchain-openai # OpenAI, GPT-4
pip install langchain-anthropic # Claude
pip install langchain-community # Community integrations
# Vector stores
pip install langchain-chroma # Chroma (local, lightweight)
pip install faiss-cpu # Facebook AI Similarity Search
# Document loaders
pip install pypdf # PDF loading
pip install unstructured # Various document formats
Set Up API Keys
Store your API keys as environment variables. Never hardcode them in source code:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
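Before making a first API call, it can help to fail fast when a key is missing. The helper below is a small sketch: missing_keys is a hypothetical function name, and the example environment is a plain dict so the snippet does not depend on your real shell settings.

```python
import os
from typing import Iterable, List, Mapping


def missing_keys(required: Iterable[str],
                 env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping; omit the second argument to check os.environ.
example_env = {"OPENAI_API_KEY": "sk-example"}
print(missing_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"], example_env))
# prints ['ANTHROPIC_API_KEY']
```

Calling this at startup and exiting with a clear message is usually friendlier than letting the first request fail with an authentication error.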
Verify Installation
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke("Hello, what is 2+2?")
print(response.content)  # e.g. "2 + 2 equals 4" (exact wording varies between runs)
If this prints a response, you are ready to go. For a deeper dive into setting up local models, see our Ollama Guide.
Building Your First Chain
Let's build a simple chain that takes a topic and generates a haiku about it. This demonstrates the core LCEL pattern.
Step 1: Define the Prompt
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a creative poet. Write haikus about the given topic."),
    ("user", "Write a haiku about {topic}")
])
Step 2: Create the Chain
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
model = ChatOpenAI(model="gpt-4o-mini")
parser = StrOutputParser()
# Compose the chain with LCEL
chain = prompt | model | parser
Step 3: Run It
result = chain.invoke({"topic": "machine learning"})
print(result)
# Example output (model responses vary between runs):
# Neural networks learn
# Patterns hidden in the data
# Machines start to think
That's it. Three lines to define the chain, one line to run it. The | operator connects the components: the prompt formats the input, the model generates a response, and the parser extracts the string content from the message object.
Streaming Responses
LCEL chains support streaming out of the box. This is essential for chat applications where you want to show tokens as they are generated:
for chunk in chain.stream({"topic": "quantum computing"}):
    print(chunk, end="", flush=True)
Batch Processing
You can process multiple inputs efficiently with batch:
topics = ["ocean", "mountain", "city"]
results = chain.batch([{"topic": t} for t in topics])
for topic, result in zip(topics, results):
    print(f"--- {topic} ---\n{result}\n")
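Conceptually, batching means running the same callable over many inputs concurrently while returning results in input order. The sketch below shows that idea with the standard library only. It is an assumption-laden illustration rather than LangChain's code: batch_invoke and fake_chain are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


def batch_invoke(func: Callable[[Any], Any], inputs: List[Any],
                 max_concurrency: int = 4) -> List[Any]:
    """Run func over inputs in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # Executor.map yields results in the order of the inputs,
        # even if later calls finish first.
        return list(pool.map(func, inputs))


def fake_chain(inputs: dict) -> str:
    """Hypothetical stand-in for chain.invoke; no network call is made."""
    return f"A haiku about {inputs['topic']}"


topics_demo = ["ocean", "mountain", "city"]
results_demo = batch_invoke(fake_chain, [{"topic": t} for t in topics_demo])
for topic, result in zip(topics_demo, results_demo):
    print(f"{topic}: {result}")
```

Threads suit this workload because LLM calls spend most of their time waiting on the network. Capping concurrency is a simple way to stay under provider rate limits.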
Building a RAG Pipeline
RAG (Retrieval-Augmented Generation) is the most common use case for LangChain. It lets you ground LLM responses in your own data, reducing hallucinations and providing accurate, sourced answers. For background on how RAG works, see our RAG Explained guide.
Step 1: Load and Split Documents
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Load a PDF
loader = PyPDFLoader("my_document.pdf")
docs = loader.load()
# Split into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = splitter.split_documents(docs)
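To see what chunk_size=1000 and chunk_overlap=200 mean in practice, consider a simplified fixed-window model: each new chunk starts 800 characters after the previous one, so neighbors share 200 characters. The real splitter works on separator boundaries rather than fixed offsets, so treat sliding_chunks (a hypothetical helper) as an approximation of the arithmetic only.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Fixed-size windows where consecutive chunks share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    stride = chunk_size - chunk_overlap  # distance between chunk start positions
    chunks: List[str] = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


document_text = "".join(str(i % 10) for i in range(2600))  # 2,600 characters
windows = sliding_chunks(document_text, chunk_size=1000, chunk_overlap=200)

print(len(windows))                            # 3 chunks, starting at 0, 800, 1600
print(windows[0][-200:] == windows[1][:200])   # True: neighbors share 200 characters
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.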
Step 2: Create Embeddings and Vector Store
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
# Create vector store from documents
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-small")
)
# Create a retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
Step 3: Build the RAG Chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

template = """Answer the question based on the following context:
Context:
{context}
Question: {question}
Answer concisely and cite the source page if available."""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(model="gpt-4o-mini")

def format_docs(docs):
    # Join retrieved chunks, labelling each with its page number when known
    return "\n\n---\n\n".join(
        f"[Page {d.metadata.get('page', '?')}]: {d.page_content}"
        for d in docs
    )

# The dict fans the question out: one copy goes to the retriever, one passes through unchanged
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
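It is worth checking what the model will actually see as context. The sketch below runs the same format_docs logic on stand-in documents and mimics the fan-out step with an ordinary function. FakeDocument, fake_retrieve, and build_prompt_inputs are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes format_docs reads."""
    page_content: str
    metadata: Dict[str, object] = field(default_factory=dict)


def format_docs(docs: List[FakeDocument]) -> str:
    # Same logic as in the chain: label each chunk with its page, or '?' if unknown.
    return "\n\n---\n\n".join(
        f"[Page {d.metadata.get('page', '?')}]: {d.page_content}" for d in docs
    )


def build_prompt_inputs(question: str,
                        retrieve: Callable[[str], List[FakeDocument]]) -> Dict[str, str]:
    """Mimics the dict step: the question is retrieved on AND passed through."""
    return {"context": format_docs(retrieve(question)), "question": question}


def fake_retrieve(question: str) -> List[FakeDocument]:
    return [
        FakeDocument("Revenue grew 12%.", {"page": 3}),
        FakeDocument("No page recorded."),
    ]


prompt_inputs = build_prompt_inputs("How did revenue change?", fake_retrieve)
print(prompt_inputs["context"])
```

Printing the formatted context like this is a cheap way to debug retrieval quality before spending tokens on the model. If you cite pages to users, confirm whether your loader numbers pages from zero or from one.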
Step 4: Ask Questions
answer = rag_chain.invoke("What is the main conclusion of the report?")
print(answer)
The chain automatically retrieves relevant chunks from your document, formats them into the prompt, and generates an answer grounded in your data. This is the power of RAG — the LLM does not need to have seen your document during training.
Adding Memory to RAG
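For multi-turn conversations, you need to pass chat history along with the question. LangChain's RunnableWithMessageHistory handles this. Because the wrapper calls the chain with a dict containing the question and the history, the RAG chain needs a variant that routes the question to the retriever explicitly and a prompt with a slot for the history:
from operator import itemgetter
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory

# Prompt with a placeholder where prior messages are inserted
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question based on the following context:\n\n{context}"),
    MessagesPlaceholder("history"),
    ("user", "{question}"),
])
# Keep "question" and "history" from the input dict and add "context"
chat_rag_chain = (
    RunnablePassthrough.assign(context=itemgetter("question") | retriever | format_docs)
    | chat_prompt | model | StrOutputParser()
)
store = {}
def get_session_history(session_id):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]
conversational_rag = RunnableWithMessageHistory(
    chat_rag_chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history"
)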
Building Agents with Tools
Agents go beyond chains by letting the LLM decide what actions to take. Instead of a fixed pipeline, the LLM observes the user's request and chooses which tools to call. This is the foundation of autonomous AI applications.
Define Tools
from langchain_core.tools import tool

@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    # In production, integrate with a real search API
    return f"Search results for: {query}"

@tool
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression."""
    # WARNING: eval() executes arbitrary Python. Demo only; never pass it
    # model- or user-supplied text in production. Use a restricted parser instead.
    return str(eval(expression))

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # Stubbed response for demonstration
    return f"Weather in {city}: 22°C, partly cloudy"
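Since the calculator tool receives text chosen by the model, evaluating it with eval is risky. One possible alternative, sketched below, walks the parsed expression and allows only numbers and basic arithmetic. safe_eval is a hypothetical helper, and the set of allowed operators is an assumption you may want to adjust. Exponentiation is left out on purpose because it can produce enormous numbers.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str):
    """Evaluate +, -, *, /, % on numeric literals; raise ValueError otherwise."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all land here.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_eval("2 + 3 * 4"))   # 14
print(safe_eval("0.15 * 847"))  # approximately 127.05
```

Inside the tool, the body would become return str(safe_eval(expression)), ideally wrapped in a try/except that returns a readable error message so the agent can recover. Malformed input raises SyntaxError from ast.parse, so catch that as well.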
Create the Agent
# Requires the separate LangGraph package: pip install langgraph
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
model = ChatOpenAI(model="gpt-4o-mini")
tools = [search_web, calculator, get_weather]
agent = create_react_agent(model, tools)
Run the Agent
response = agent.invoke({
    "messages": [("user", "What's the weather in Tokyo and what is 15% of 847?")]
})
for message in response["messages"]:
    print(f"{message.type}: {message.content}")
The agent will automatically call the get_weather tool for the Tokyo question and the calculator tool for the math problem, then combine the results into a coherent response. This is the ReAct pattern in action — the LLM reasons about which tools to use, acts by calling them, and observes the results.
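The reason, act, observe cycle can be sketched as a plain loop. In a real agent the decision step is an LLM call. Here it is replaced by a scripted policy so the control flow is visible. All names (run_agent, scripted_policy, stub_tools) are hypothetical, and the tools return canned strings.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, tool input)


def run_agent(question: str,
              tools: Dict[str, Callable[[str], str]],
              decide: Callable[[str, List[str]], Optional[Action]],
              max_steps: int = 5) -> List[str]:
    """Loop: decide on an action, call the tool, record the observation."""
    observations: List[str] = []
    for _ in range(max_steps):
        action = decide(question, observations)  # "reason" (an LLM in practice)
        if action is None:
            break  # the policy has enough information to answer
        name, tool_input = action
        result = tools[name](tool_input)          # "act"
        observations.append(f"{name} -> {result}")  # "observe"
    return observations


def scripted_policy(question: str, observations: List[str]) -> Optional[Action]:
    """Stand-in for the model: follows a fixed two-step plan."""
    plan = [("get_weather", "Tokyo"), ("calculator", "0.15 * 847")]
    return plan[len(observations)] if len(observations) < len(plan) else None


stub_tools = {
    "get_weather": lambda city: f"Weather in {city}: partly cloudy",
    "calculator": lambda expression: "127.05",  # canned result for the demo
}

steps = run_agent("Weather in Tokyo and 15% of 847?", stub_tools, scripted_policy)
for step in steps:
    print(step)
```

The max_steps cap matters in practice: without it, a model that keeps requesting tools could loop indefinitely and run up costs.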
For a deeper understanding of how agents work, read our AI Agents Explained guide.
LangChain vs Alternatives
LangChain is not the only option. Here is how it compares to the main alternatives:
| Feature | LangChain | LlamaIndex | Haystack | Direct API |
|---|---|---|---|---|
| Focus | General LLM apps | Data indexing & RAG | NLP pipelines | Raw LLM access |
| Learning Curve | Moderate | Moderate | Steep | Low |
| RAG Support | Good | Excellent | Good | DIY |
| Agent Support | Excellent (LangGraph) | Basic | Limited | DIY |
| Integrations | 700+ | 300+ | 100+ | None |
| Abstraction Level | High | High | High | None |
| Best For | Complex workflows | Document Q&A | Search pipelines | Simple use cases |
When to Use LangChain
- Prototyping: Quickly build and iterate on LLM applications
- Multi-step workflows: Chains, agents, and complex orchestration
- Multiple integrations: Need to connect to many different services
- Team projects: Standardized interfaces help teams collaborate
When to Use Something Else
- Simple API calls: If you just need to call one LLM, use the provider's SDK directly
- Pure RAG: LlamaIndex is more specialized for document indexing and retrieval
- Maximum control: Direct API calls give you full control over every detail
- Performance-critical: LangChain's abstractions add some overhead; for hot paths, consider direct calls
Best Practices
After building many LangChain applications, here are the key lessons:
- Start simple: Begin with a basic chain and add complexity only when needed. Do not use agents when a simple chain suffices.
- Use LCEL: Prefer LCEL chains over legacy classes such as LLMChain. They support streaming, async, and batch execution out of the box and are the officially recommended approach.
- Test prompts independently: Isolate and test your prompts before building them into chains. Small prompt changes can have large effects.
- Monitor token usage: LangChain callbacks let you track token consumption. Set up logging early to avoid surprise bills.
- Use structured output: When you need the LLM to return structured data (JSON, Pydantic models), use with_structured_output() instead of parsing strings.
- Version your prompts: Treat prompts like code. Store them in version control, test them, and track changes.
- Handle errors gracefully: LLM calls can fail (rate limits, timeouts, API errors). Use retry logic and fallbacks.
- Know when to leave: If LangChain's abstractions are getting in the way, it is perfectly fine to drop down to raw API calls for specific components.
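The retry-and-fallback advice above can be sketched in a few lines of plain Python. This is a generic illustration, not LangChain's mechanism: call_with_retry is a hypothetical helper, and the delays and attempt count are arbitrary assumptions. LangChain runnables also expose with_retry() and with_fallbacks() methods, which are worth checking in the documentation before rolling your own.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T],
                    fallback: Callable[[], T],
                    attempts: int = 3,
                    base_delay: float = 0.5,
                    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Try func up to `attempts` times with exponential backoff, then fall back."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return fallback()


# Demo: a call that times out twice before succeeding.
call_count = {"n": 0}


def flaky_llm_call() -> str:
    call_count["n"] += 1
    if call_count["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "primary model answer"


recorded_delays: List[float] = []  # record delays instead of actually sleeping
answer = call_with_retry(flaky_llm_call, lambda: "fallback answer",
                         sleep=recorded_delays.append)
print(answer, recorded_delays)
```

In real code, narrow retry_on to transient errors such as rate limits and timeouts. Retrying on authentication or validation errors only delays the failure.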
Frequently Asked Questions
What is LangChain used for?
LangChain is a framework for building applications powered by Large Language Models. It provides tools for chaining LLM calls, connecting to external data sources, building agents, and creating RAG (Retrieval-Augmented Generation) pipelines.
Is LangChain free to use?
Yes, LangChain is open-source and free to use under the MIT license. However, you still need to pay for the underlying LLM API calls (e.g., OpenAI, Anthropic) or have sufficient hardware to run local models.
Should I use LangChain or build from scratch?
Use LangChain when you need to quickly prototype RAG pipelines, agents, or complex LLM workflows. Build from scratch when you need full control, minimal dependencies, or are building a simple single-LLM-call application. Many teams start with LangChain and replace components as needed.
What programming languages does LangChain support?
LangChain is primarily available for Python (the langchain package on PyPI) and JavaScript/TypeScript (LangChain.js, published on npm as langchain). The Python version is more mature and has a larger ecosystem of integrations.