You're going to learn how to build agents in LangChain - AI systems that can decide which tools to use, execute them, and reason about the results until they solve your problem.
What makes an agent different
Most LLM apps follow a fixed path: prompt → model → response. Agents break this pattern by giving the model:
- Access to tools (search, calculators, APIs, databases)
- Freedom to decide which tool to use and when
- Ability to chain actions based on intermediate results
Think of it like the difference between following a recipe (chain) versus cooking freestyle (agent).
The ReAct framework: Reasoning + Acting
LangChain agents use the ReAct pattern, introduced in the 2022 paper "ReAct: Synergizing Reasoning and Acting in Language Models" (Yao et al.). It's a simple loop:
- Thought: The agent reasons about what to do
- Action: It picks a tool and provides input
- Observation: It sees the tool's result
- Repeat until it has enough info to answer
This "think out loud" approach makes agents more reliable and debuggable.
Building your first agent: Step by step
Step 1: Define tools
Tools are functions the agent can call. Start with something simple like a calculator.
from langchain_core.tools import Tool
from langchain_experimental.utilities import PythonREPL
# Create a Python code executor
python_repl = PythonREPL()
calculator = Tool(
    name="Python Calculator",
    func=python_repl.run,
    description="Useful when you need to perform calculations or execute Python code. Input should be valid Python code that prints its result."
)
The description matters: It tells the agent when to use this tool. Be specific.
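You can sanity-check a tool by calling it directly before handing it to an agent. Note that PythonREPL captures stdout, so the code must print its result:
# Direct call - the tool only returns what the code prints
print(calculator.invoke("print(2 ** 10)"))  # 1024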
Step 2: Add custom tools
You can turn any Python function into a tool with the @tool decorator.
from langchain_core.tools import tool
@tool
def search_weather(location: str) -> str:
    """Search for the current weather in a specified location."""
    # In production, call a real weather API
    return f"The weather in {location} is sunny and 72°F."
The docstring becomes the tool's description - the agent reads it to decide when to use the tool.
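You can confirm exactly what the agent will see:
# The decorator turns the function name and docstring into tool metadata
print(search_weather.name)         # search_weather
print(search_weather.description)  # Search for the current weather in a specified location.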
Step 3: Create a toolkit
A toolkit is just a list of related tools.
tools = [calculator, search_weather]
For production, LangChain offers pre-built toolkits for:
- Web search (DuckDuckGo, Google)
- SQL databases
- File systems
- APIs (Slack, GitHub, etc.)
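For example, a minimal sketch adding DuckDuckGo web search (assumes the langchain-community and duckduckgo-search packages are installed):
# pip install langchain-community duckduckgo-search
from langchain_community.tools import DuckDuckGoSearchRun

web_search = DuckDuckGoSearchRun()
tools = [calculator, search_weather, web_search]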
Step 4: Build the agent prompt
The prompt teaches the agent the thought-action-observation pattern.
from langchain_core.prompts import PromptTemplate
prompt_template = """You are an agent with access to these tools:
{tools}
Available tool names: {tool_names}
Use this format:
Thought: Think about what you need to do
Action: the tool to use, should be one of [{tool_names}]
Action Input: input for the tool
Observation: result from the tool
... (repeat Thought/Action/Observation as needed)
Thought: I know the final answer
Final Answer: your answer to the user's question
Question: {input}
{agent_scratchpad}
"""
prompt = PromptTemplate.from_template(prompt_template)
The {agent_scratchpad} variable stores the agent's thinking history as it works through the problem.
Step 5: Initialize the agent
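The code below assumes an llm object already exists. Any chat model works; here's a minimal sketch using langchain_openai (assumes an OPENAI_API_KEY is set in your environment):
# pip install langchain-openai; requires OPENAI_API_KEY
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)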
from langchain.agents import create_react_agent, AgentExecutor

# Create the ReAct agent
agent = create_react_agent(
    llm=llm,       # Your language model
    tools=tools,   # List of tools
    prompt=prompt  # Instructions
)

# Wrap in an executor that handles the loop
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,                # Show thinking process
    handle_parsing_errors=True   # Recover from format mistakes
)
Why two objects?
- agent = the LLM plus prompt that does the reasoning
- agent_executor = the loop that runs tools and feeds results back
Step 6: Run the agent
result = agent_executor.invoke({
    "input": "What's the square root of 256?"
})
print(result["output"])
With verbose=True, you'll see:
Thought: I need to calculate the square root
Action: Python Calculator
Action Input: import math; print(math.sqrt(256))
Observation: 16.0
Thought: I have the answer
Final Answer: The square root of 256 is 16.
Real example: Multi-step problem
Watch how an agent handles a complex question requiring multiple tools.
result = agent_executor.invoke({
    "input": "Generate prime numbers below 50 and calculate their sum"
})
The agent will:
- Thought: Need to generate primes and sum them
- Action: Python Calculator
- Action Input: print([x for x in range(2, 50) if all(x % y != 0 for y in range(2, int(x**0.5)+1))])
- Observation: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
- Thought: Now sum these
- Action: Python Calculator
- Action Input: print(sum([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]))
- Observation: 328
- Thought: I have the final answer
- Final Answer: The sum of prime numbers below 50 is 328
This multi-step reasoning is what makes agents powerful.
Advanced: Creating custom tools with validation
Add input validation and better error handling.
from pydantic import BaseModel, Field

class CalculatorInput(BaseModel):
    expression: str = Field(description="A valid Python mathematical expression")

@tool(args_schema=CalculatorInput)
def safe_calculator(expression: str) -> str:
    """Calculate mathematical expressions safely."""
    try:
        # Only allow a small set of built-in math helpers
        allowed = {'__builtins__': {}, 'abs': abs, 'min': min, 'max': max}
        result = eval(expression, allowed)
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"
Pydantic validation ensures the agent provides inputs in the right format.
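You can exercise the validation by invoking the tool directly with a dict whose keys match the schema:
# Dict keys must match the args_schema fields
print(safe_calculator.invoke({"expression": "max(3, 7) + abs(-2)"}))  # Result: 9
print(safe_calculator.invoke({"expression": "import os"}))            # Error: ... (blocked)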
Common failure modes and fixes
Problem 1: Agent loops forever
Symptom: Keeps calling tools without progress
Fix: Add max iterations
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=5,              # Stop after 5 steps
    early_stopping_method="force"  # Return a stock "stopped" response at the limit
)
Problem 2: Wrong tool format errors
Symptom: "Could not parse LLM output"
Fix: Set handle_parsing_errors=True and improve prompt instructions. Make the expected format crystal clear.
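Beyond True, handle_parsing_errors also accepts a string (or a callable), which is sent back to the model as the error message so it can correct itself; a sketch:
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    # This string is returned to the model whenever its output can't be parsed
    handle_parsing_errors="Invalid format. Reply using the exact Thought/Action/Action Input layout.",
)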
Problem 3: Agent refuses to use tools
Symptom: Tries to answer without calling any tools
Fix: Improve tool descriptions. Be explicit about when to use each tool:
Tool(
    name="calculator",
    func=calc,
    description="MUST use this for ANY mathematical calculation. Do not try to compute in your head. Input: valid Python expression."
)
Problem 4: Can't debug agent decisions
Fix: Always use verbose=True during development. Add logging:
import logging
logging.basicConfig(level=logging.DEBUG)
Testing your agent
Create a test suite with different question types:
test_cases = [
    # Simple single-tool questions
    "What is 25 * 89?",
    "What's the weather in Boston?",
    # Multi-step reasoning
    "If the weather in Miami is sunny, suggest an outdoor activity",
    # Edge cases
    "What is the capital of France?",  # Doesn't need tools
    "Calculate the weather",  # Nonsensical - agent should clarify
]

for query in test_cases:
    print(f"\n{'='*60}\nQuery: {query}\n{'='*60}")
    result = agent_executor.invoke({"input": query})
    print(f"Answer: {result['output']}\n")
When to use agents vs chains
Use a chain when:
- The workflow is fixed (always the same steps)
- You need predictable execution time
- The problem is well-defined
Use an agent when:
- The solution path depends on intermediate results
- You need dynamic tool selection
- The problem requires multi-step reasoning
Example:
- Chain: "Summarize this document then translate to Spanish" (fixed 2 steps)
- Agent: "Research the top AI papers from 2024 and summarize key trends" (needs to search, filter, read, synthesize)
Production considerations
Before deploying agents:
- [ ] Add timeouts on tool execution (see the sketch after this list)
- [ ] Implement cost controls (track token usage)
- [ ] Log all decisions for debugging
- [ ] Sandbox tool execution (especially for code interpreters)
- [ ] Add human-in-the-loop for high-stakes decisions
- [ ] Test edge cases exhaustively
- [ ] Set max iterations to prevent runaway costs
- [ ] Monitor tool failure rates
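For the timeout and max-iterations items, AgentExecutor exposes both directly; a sketch:
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=5,       # hard cap on reasoning/tool steps
    max_execution_time=30,  # seconds of wall-clock time before the loop stops
)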
Quick recap
- Agents give LLMs tools and let them decide how to use them
- ReAct framework: Thought → Action → Observation loop
- Tools are Python functions with good descriptions
- Agent vs Chain: Use agents when the path isn't predetermined
- Debugging: Always use verbose=True and log everything
- Safety: Add max iterations, timeouts, and input validation
Agents transform LLMs from static responders into dynamic problem solvers that can break down complex tasks and use the right tools at the right time.