LangGraph Chapter 1 — Why Agents? ReAct Pattern vs Chains
Senior Architect Interview Series — LangGraph & Agentic AI
Chapter Map (All 10 Chapters)
| Chapter | Topic | Interview Weight |
|---|---|---|
| Ch 1 ← YOU ARE HERE | Why Agents? ReAct Pattern vs Chains | ⭐⭐⭐⭐⭐ Always asked first |
| Ch 2 | LangGraph Fundamentals — StateGraph, Nodes, Edges | ⭐⭐⭐⭐⭐ |
| Ch 3 | AgentState & Reducers — add_messages Deep Dive | ⭐⭐⭐⭐⭐ |
| Ch 4 | Tool Calling — How It Works End-to-End | ⭐⭐⭐⭐⭐ |
| Ch 5 | Conditional Routing & Graph Control Flow | ⭐⭐⭐⭐ |
| Ch 6 | Memory — In-Context, Session, Long-Term | ⭐⭐⭐⭐⭐ |
| Ch 7 | Multi-Agent — Supervisor Pattern & Handoffs | ⭐⭐⭐⭐⭐ |
| Ch 8 | Human-in-the-Loop & Interrupts | ⭐⭐⭐⭐ |
| Ch 9 | Error Handling, Retries & Fallback Agents | ⭐⭐⭐⭐ |
| Ch 10 | Production Agents — Streaming, Tracing, Scaling | ⭐⭐⭐⭐⭐ |
1.0 Why This Is Always the First Interview Question
Before any interviewer asks you about LangGraph specifics, they will ask:
"Why use an agent instead of a simple chain or a direct API call?"
Your answer to this question tells them everything about whether you understand the fundamental architectural shift that agents represent. A weak answer describes what an agent does. A strong answer explains why the problem requires it.
1.1 The Problem With Chains
A chain is a fixed, linear sequence of operations:
Input → Step 1 → Step 2 → Step 3 → Output
Example: A simple RAG chain
User Question
│
▼
Embed question
│
▼
Search ChromaDB
│
▼
Build prompt
│
▼
Call GPT-4
│
▼
Return answer
This works perfectly when:
- The steps are always the same
- You always need exactly one retrieval
- The question always maps to one document source
- No reasoning is required about WHAT to do next
Where Chains Break Down
Scenario 1 — Multi-step reasoning:
User: "Compare the refund policies of our US and EU divisions,
then tell me which one gives customers more time."
Chain approach:
→ Retrieves US policy ✓
→ Builds prompt ✓
→ GPT-4 generates answer
→ BUT: never retrieved the EU policy — it wasn't hardcoded!
→ Answer is incomplete or fabricated
Scenario 2 — Dynamic tool choice:
User: "What are the top-selling items AND what does our product manual say about them?"
Chain: hardcoded to ONE retrieval path
→ Cannot dynamically decide: "I need both SQL data AND document retrieval"
→ Calls the wrong tool or only one source
Scenario 3 — Error recovery:
Chain step 2 fails (ChromaDB timeout)
→ Chain crashes or returns error
→ No ability to retry with different strategy
→ No fallback to a different tool
Scenario 4 — Iterative refinement:
User: "Summarize all policies related to international shipping"
Chain: calls retrieval once, gets 3 chunks
→ GPT-4: "I need more information about customs duties specifically"
→ Chain: cannot retrieve again — no second trip back to the tool
The fundamental limitation of chains: The control flow is determined at design time by the developer. It cannot adapt at runtime based on what the LLM learns.
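To make the limitation concrete, here is a minimal sketch of Scenario 1 as a fixed chain in plain Python. The `DOCS` store, `retrieve`, and `run_chain` are hypothetical stand-ins (no real vector store or LLM is involved); the only point is that the single retrieval is wired in at design time.

```python
# Hypothetical in-memory "knowledge base" standing in for a vector store.
DOCS = {
    "us refund policy": "US: refunds accepted within 30 days.",
    "eu refund policy": "EU: refunds accepted within 14 days.",
}

def retrieve(query: str, top_k: int = 1) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    words = set(query.lower().replace(",", " ").split())
    ranked = sorted(DOCS, key=lambda k: len(words & set(k.split())), reverse=True)
    return [DOCS[k] for k in ranked[:top_k]]

def run_chain(question: str) -> str:
    # Exactly one retrieval, decided by the developer, not by the model.
    context = retrieve(question)
    # A real chain would send this prompt to the LLM; here we just return it.
    return f"Context: {' '.join(context)}\nQuestion: {question}"

prompt = run_chain("Compare the refund policies of our US and EU divisions")
```

Whatever the model later "realizes" about the missing EU policy, this pipeline has no path back to `retrieve`.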
1.2 What Is an Agent?
An agent is a system where the LLM itself controls the execution flow.
                ┌─────────────┐
                │     LLM     │  ← The LLM is the CONTROLLER
                │  (Reasoner) │
                └──────┬──────┘
                       │
           ┌───────────┼───────────┐
           │           │           │
           ▼           ▼           ▼
       [Tool A]    [Tool B]     [End /
      rag_search   sql_query    Answer]
           │           │
           └─────┬─────┘
                 │
          Observations fed
            back to LLM
                 │
                 ▼
         LLM reasons again:
    "Do I have enough to answer?"
      → YES → produce final answer
      → NO  → call another tool
Key difference: In a chain, the developer codes if retrieval_needed → retrieve. In an agent, the LLM decides if retrieval_needed → retrieve.
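The contrast can be sketched in a few lines. Everything here is illustrative: `fake_llm_decide` is a scripted stand-in for a model's tool choice, and the two lambda "tools" only echo their input.

```python
from typing import Callable

# Hypothetical tools; real ones would hit a vector store and a database.
TOOLS: dict[str, Callable[[str], str]] = {
    "rag_search": lambda q: f"[docs about: {q}]",
    "sql_query": lambda q: f"[rows for: {q}]",
}

def chain_step(question: str) -> str:
    # Chain: the developer hard-codes which tool runs.
    return TOOLS["rag_search"](question)

def fake_llm_decide(question: str) -> list[str]:
    # Stand-in for the model's decision; a real LLM would emit tool calls.
    wanted = []
    if "top-selling" in question:
        wanted.append("sql_query")
    if "manual" in question:
        wanted.append("rag_search")
    return wanted

def agent_step(question: str) -> list[str]:
    # Agent: the (fake) model picks the tools; the code only dispatches.
    return [TOOLS[name](question) for name in fake_llm_decide(question)]

q = "What are the top-selling items AND what does our product manual say about them?"
chain_result = chain_step(q)
agent_result = agent_step(q)
```

The dispatch code is identical in both cases; what changes is who produces the list of tool names.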
1.3 The ReAct Pattern — The Foundation of All Agents
ReAct = Reasoning + Acting
Introduced in the paper "ReAct: Synergizing Reasoning and Acting in Language Models" (Yao et al., 2022; Princeton University and Google Research, Brain team).
The ReAct Loop
┌──────────────────────────────────────────┐
│ │
│ THOUGHT (reasoning) │
│ "I need to find the refund policy │
│ to answer this question" │
│ │
│ ↓ │
│ ACTION (tool call) │
│ rag_search("refund policy") │
│ │
│ ↓ │
│ OBSERVATION (tool result) │
│ "Refunds accepted within 30 days..." │
│ │
│ ↓ │
│ THOUGHT (reasoning again) │
│ "I now have the refund policy. │
│ I can answer the question." │
│ │
│ ↓ │
│ FINAL ANSWER │
│ "The refund window is 30 days." │
│ │
└──────────────────────────────────────────┘
↑___________________________|
(loop until done)
Each cycle consists of:
- Thought — LLM reasons about current state and what to do
- Action — LLM calls a tool (or decides to stop)
- Observation — Tool result is returned to LLM
- Repeat — Until LLM determines it has enough to answer
Why This Is Powerful
The LLM gets to see its own tool results and reason about them before deciding the next step. This enables:
- Multi-hop retrieval (retrieve → read → retrieve again with refined query)
- Dynamic tool selection (choose the right tool based on the question)
- Self-correction (if one tool gives insufficient info, try another)
- Early termination (if the first retrieval is enough, don't call more tools)
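The loop described above can be sketched without any framework. This is a toy under stated assumptions: `scripted_llm` replaces a real model with a fixed policy, `search` is a hypothetical one-entry knowledge base, and `max_steps` is an added safety cap that is not part of the pattern itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Step:
    thought: str
    action: Optional[str] = None   # tool name, or None for a final answer
    action_input: str = ""
    answer: str = ""

def search(query: str) -> str:
    kb = {"refund policy": "Refunds accepted within 30 days."}
    return kb.get(query, "No results.")

TOOLS = {"search": search}

def scripted_llm(question: str, observations: list[str]) -> Step:
    # Stand-in for the model: act first, answer once an observation exists.
    if not observations:
        return Step("I need the refund policy.", "search", "refund policy")
    return Step("I have enough to answer.", answer=f"Based on: {observations[-1]}")

def react(question: str, max_steps: int = 5) -> tuple[str, list[str]]:
    observations: list[str] = []
    trace: list[str] = []
    for _ in range(max_steps):
        step = scripted_llm(question, observations)
        trace.append(f"THOUGHT: {step.thought}")
        if step.action is None:                      # model chose to stop
            return step.answer, trace
        trace.append(f"ACTION: {step.action}({step.action_input!r})")
        obs = TOOLS[step.action](step.action_input)  # run the tool
        observations.append(obs)                     # feed the result back
        trace.append(f"OBSERVATION: {obs}")
    return "Stopped: step limit reached.", trace

answer, trace = react("What is the refund window?")
```

Printing `trace` line by line reproduces the Thought, Action, Observation, Thought sequence from the diagram.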
1.4 ReAct in Your Project — Line by Line
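Your agent/agent.py implements the ReAct loop using native tool calling: the Thought step is implicit in the model's choice between a tool call and a text answer, rather than written out as text. Let's trace through it: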
# agent/agent.py
# ── STEP 1: Define the THOUGHT capability ──────────────────────────────────
# The LLM (reasoner) is bound with tools it CAN call
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
llm_with_tools = llm.bind_tools(tools)
# bind_tools() tells the LLM: "You may call these functions.
# When you need information, emit a tool_call instead of a text answer."
# ── STEP 2: The THOUGHT node ───────────────────────────────────────────────
def call_llm(state: AgentState) -> AgentState:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}
# LLM reads ALL messages (conversation history + tool results so far)
# and decides: produce an answer, OR request a tool call
# ── STEP 3: The ACTION + OBSERVATION node ─────────────────────────────────
def call_tools(state: AgentState) -> AgentState:
    last_message = state["messages"][-1]   # the AIMessage with tool_calls
    results = []
    for tool_call in last_message.tool_calls:
        tool_fn = tool_map[tool_call["name"]]
        output = tool_fn.invoke(tool_call["args"])   # ACTION
        results.append(ToolMessage(                  # OBSERVATION
            content=str(output),
            tool_call_id=tool_call["id"]
        ))
    return {"messages": results}
# Executes the tool, wraps result in ToolMessage → fed back to LLM
# ── STEP 4: The DECISION (loop or stop?) ──────────────────────────────────
def should_call_tools(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "call_tools"   # → loop back, execute tools
    return END                # → done, return final answer
# This IS the ReAct loop controller.
# If LLM asked for tools → execute them and loop back to call_llm
# If LLM gave a text answer → END (no more tool calls)
The complete ReAct execution trace for: "What is Agent Factory?"
Turn 1 — call_llm:
Input: [HumanMessage("What is Agent Factory?")]
LLM: "I should search the knowledge base for this."
Output: AIMessage(tool_calls=[rag_search("Agent Factory")])
Turn 1 — should_call_tools:
→ sees tool_calls → returns "call_tools"
Turn 1 — call_tools:
→ executes rag_search("Agent Factory")
→ Output: ToolMessage("Agent Factory is a platform that...")
Turn 2 — call_llm:
Input: [HumanMessage, AIMessage(tool_call), ToolMessage(result)]
LLM: "I now have the information. I can answer."
Output: AIMessage("Agent Factory is PepsiCo's platform for...")
Turn 2 — should_call_tools:
→ no tool_calls in last message → returns END
Final answer: "Agent Factory is PepsiCo's platform for..."
1.5 Why LangGraph Instead of Plain LangChain Agents?
This is a must-know interview question.
LangChain AgentExecutor (Old Way)
# Old LangChain agent — black box
from langchain.agents import AgentExecutor, create_openai_tools_agent
agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)
result = executor.invoke({"input": "What is Agent Factory?"})
Problems:
- The loop runs entirely inside AgentExecutor, so intermediate steps are hard to inspect or control
- No ability to pause mid-execution (no human-in-the-loop)
- Cannot inject custom logic between steps (e.g., guardrails after each tool call)
- Hard to add custom state beyond just the messages
- Debugging is painful — it's a black box
LangGraph (New Way — What Your Project Uses)
# LangGraph — explicit, visible, controllable
graph = StateGraph(AgentState)
graph.add_node("call_llm", call_llm)
graph.add_node("call_tools", call_tools)
graph.set_entry_point("call_llm")
graph.add_conditional_edges("call_llm", should_call_tools)
graph.add_edge("call_tools", "call_llm")
agent = graph.compile()
Advantages:
| LangChain AgentExecutor | LangGraph |
|---|---|
| Implicit loop (hidden) | Explicit graph (visible) |
| Fixed flow | Configurable nodes + edges |
| No mid-execution control | Pause/resume with interrupt_before |
| Hard to add custom state | Full TypedDict state, any fields |
| Debugging: guess | Debugging: trace each node |
| Single agent only | Multi-agent natively |
| No persistence | State can be checkpointed |
Interview one-liner: "LangGraph makes the agent loop explicit — every node, every edge, every state transition is visible and controllable. AgentExecutor hides all of that."
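To show what "explicit" buys you, here is a toy graph runner in plain Python. It is emphatically not LangGraph's implementation: `ToyGraph`, its methods, and the plain dict-merge state update are assumptions made for illustration, and it supports pausing but not resuming.

```python
from typing import Callable, Optional

END = "__end__"

class ToyGraph:
    """Toy stand-in for an explicit agent graph (not LangGraph itself)."""

    def __init__(self) -> None:
        self.nodes: dict[str, Callable[[dict], dict]] = {}
        self.routers: dict[str, Callable[[dict], str]] = {}
        self.entry: Optional[str] = None

    def add_node(self, name: str, fn: Callable[[dict], dict]) -> None:
        self.nodes[name] = fn

    def add_edge(self, src: str, dst: str) -> None:
        self.routers[src] = lambda state: dst      # unconditional edge

    def add_conditional_edges(self, src: str, router: Callable[[dict], str]) -> None:
        self.routers[src] = router                 # router returns next node name

    def run(self, state: dict, interrupt_before: frozenset = frozenset()):
        current, visited = self.entry, []
        while current != END:
            if current in interrupt_before:
                return state, visited, current     # paused: caller can inspect state
            state = {**state, **self.nodes[current](state)}
            visited.append(current)
            current = self.routers[current](state)
        return state, visited, None

g = ToyGraph()
g.add_node("call_llm", lambda s: {"wants_tool": s.get("obs") is None})
g.add_node("call_tools", lambda s: {"obs": "tool result"})
g.entry = "call_llm"
g.add_conditional_edges("call_llm", lambda s: "call_tools" if s["wants_tool"] else END)
g.add_edge("call_tools", "call_llm")

final, visited, paused_at = g.run({})
_, visited_paused, paused_node = g.run({}, interrupt_before=frozenset({"call_tools"}))
```

Because every hop goes through `run`, the caller can log each node, stop before a sensitive one, or swap a router without touching the nodes.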
1.6 Chains vs Agents vs Agentic RAG — Decision Framework
Use this to answer "when would you USE an agent vs a chain?"
DECISION TREE:
Is the task always the same steps in the same order?
YES → Use a chain (LangChain LCEL, simple pipeline)
NO ↓
Does the solution require choosing between multiple tools?
NO → Use a chain with one tool
YES ↓
Does the solution require seeing a tool result before deciding the next step?
NO → Use a chain (all steps predetermined)
YES ↓
Does the solution require iteration (loop until done)?
NO → Use a chain with multiple fixed tools
YES ↓
USE AN AGENT
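The decision tree above can be written as a small function, which is a handy way to rehearse it. The function name and parameter names are hypothetical; the return strings mirror the tree's leaves.

```python
def choose_architecture(
    fixed_steps: bool,
    multiple_tools: bool,
    depends_on_tool_results: bool,
    needs_iteration: bool,
) -> str:
    """Walk the decision tree top to bottom and return the first leaf that applies."""
    if fixed_steps:
        return "chain"
    if not multiple_tools:
        return "chain with one tool"
    if not depends_on_tool_results:
        return "chain (all steps predetermined)"
    if not needs_iteration:
        return "chain with multiple fixed tools"
    return "agent"

# Simple Q&A over one document: always retrieve, then answer.
simple_qa = choose_architecture(True, False, False, False)
# SQL + docs analysis: tool choice, result-dependent steps, iteration.
data_analysis = choose_architecture(False, True, True, True)
```

Only the path where every answer points away from a chain ends at an agent, which matches the "use the simplest thing that works" framing.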
Concrete examples:
| Use Case | Chain or Agent? | Why |
|---|---|---|
| Simple Q&A from one document | Chain | Fixed path: retrieve → answer |
| Customer chatbot with memory | Agent | Must reason about multi-turn, tool choice |
| "Compare policies from 3 docs" | Agent | Must retrieve 3× iteratively |
| ETL pipeline | Chain | Fixed transformation steps |
| Code debugging assistant | Agent | Iterates: read error → fix → run → check result |
| Data analysis with SQL + docs | Agent | Dynamic tool choice: SQL or RAG or both |
| Your project's multi-agent system | Agent + Supervisor | Multiple specialized capabilities |
1.7 The Three Levels of LLM Application Architecture
Senior architects categorize LLM applications into three levels:
Level 1 — LLM-Powered Functions
Direct API calls. No framework needed.
response = openai.chat.completions.create(model="gpt-4o-mini", messages=[...])
Use when: Single prompt, simple transformation, no tools needed.
Level 2 — Chains
Fixed multi-step pipelines. LangChain LCEL or similar.
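from langchain_core.runnables import RunnablePassthrough
# The retriever receives the question string; the dict fills the prompt's
# "context" and "question" variables (assumes the prompt defines both).
chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm | output_parser
result = chain.invoke("...")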
Use when: Steps are known upfront, no branching, no loops.
Level 3 — Agents (LangGraph)
LLM controls the flow. Dynamic, iterative, multi-tool.
graph = StateGraph(AgentState)
# nodes + edges + conditional routing
agent = graph.compile()
result = agent.invoke({"messages": [HumanMessage("...")]})
Use when: Complex reasoning, tool choice, iteration, or multi-agent needed.
Your project implements Level 3 — and that's exactly what makes it interview-worthy.
1.8 What "Agentic" Really Means
The word "agentic" is overused. Here is the precise definition for senior interviews:
An agentic system has:
| Property | Meaning | Your Project |
|---|---|---|
| Autonomy | LLM decides what to do without explicit instructions per step | ✓ LLM chooses when to call rag_search |
| Tool use | Can take actions beyond text generation | ✓ rag_search, sql_query tools |
| Perception | Reads its own action results | ✓ ToolMessages fed back to LLM |
| Memory | Retains information across turns | ✓ Session history in PostgreSQL |
| Goal-directed | Pursues a goal across multiple steps | ✓ Loops until question is answered |
| Adaptability | Changes approach based on what it learns | ✓ Decides next tool based on current state |
A system that just calls GPT-4 with a fixed prompt is NOT agentic — it's a Level 1 function. A system where GPT-4 decides what to do next based on tool results IS agentic.
1.9 The Message Accumulation Pattern
Every ReAct agent shares one core pattern: messages accumulate.
Initial state:
messages: [HumanMessage("What is Agent Factory?")]
After call_llm (Turn 1):
messages: [
HumanMessage("What is Agent Factory?"),
AIMessage(tool_calls=[rag_search("Agent Factory")]) ← appended
]
After call_tools:
messages: [
HumanMessage("What is Agent Factory?"),
AIMessage(tool_calls=[rag_search("Agent Factory")]),
ToolMessage("Agent Factory is a platform...") ← appended
]
After call_llm (Turn 2):
messages: [
HumanMessage("What is Agent Factory?"),
AIMessage(tool_calls=[rag_search("Agent Factory")]),
ToolMessage("Agent Factory is a platform..."),
AIMessage("Agent Factory is PepsiCo's platform...") ← appended
]
Why this matters:
- The LLM in Turn 2 sees the FULL history — including its own tool call and the result
- This is what enables reasoning ABOUT tool results
- The add_messages reducer in AgentState handles this append behavior (Chapter 3)
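The reducer idea can be sketched with the standard library alone. This is a simplified illustration, not LangGraph's code: `append_messages`, `ToyState`, and `apply_update` are hypothetical names, and messages are plain strings. The real add_messages additionally replaces an existing message when an incoming one carries the same ID.

```python
from typing import Annotated, TypedDict, get_type_hints

def append_messages(existing: list, new: list) -> list:
    """Toy reducer: return a new list with the update appended."""
    return existing + new

class ToyState(TypedDict):
    messages: Annotated[list, append_messages]

def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into state, using a field's reducer if it has one."""
    hints = get_type_hints(ToyState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        # With a reducer: combine old and new. Without one: overwrite.
        merged[key] = reducer(state.get(key, []), value) if reducer else value
    return merged

state = {"messages": ["human: What is Agent Factory?"]}
state = apply_update(state, {"messages": ["ai: tool_call rag_search"]})
state = apply_update(state, {"messages": ["tool: Agent Factory is a platform..."]})
```

This is why each node can return only its new messages: the merge step, not the node, is responsible for preserving history.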
1.10 Code Reference — Complete agent.py Annotated
import os
from typing import Annotated, TypedDict
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage, BaseMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from rag.retrieve import retrieve, build_prompt
from .memory import load_history, save_history
from sqlalchemy.orm import Session
from .guardrails import check_input
from .logger import log_guardrail, log_agent_start, log_agent_end, log_tool_call
# ── TOOL DEFINITION ────────────────────────────────────────────────────────
@tool
def rag_search(query: str) -> str:
    """Search the knowledge base for relevant information about the query."""
    # The docstring IS the tool description the LLM reads to decide when to use it
    chunks = retrieve(query, top_k=3)
    if not chunks:
        return "No relevant information found in the knowledge base."
    return build_prompt(query, chunks)
# ── STATE DEFINITION ───────────────────────────────────────────────────────
class AgentState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
# Annotated[..., add_messages] means:
# "When this field is updated, APPEND new messages, don't replace the list"
# This is the reducer that makes message accumulation automatic
# ── LLM SETUP ──────────────────────────────────────────────────────────────
tools = [rag_search]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
llm_with_tools = llm.bind_tools(tools)
# bind_tools registers tool schemas with the model
# The model receives JSON schema of each tool and can emit structured tool_calls
tool_map = {t.name: t for t in tools} # name → callable lookup
# ── GRAPH NODES ────────────────────────────────────────────────────────────
def call_llm(state: AgentState) -> AgentState:
    response = llm_with_tools.invoke(state["messages"])
    # response is either:
    #   AIMessage(content="Final answer...")         → done
    #   AIMessage(content="", tool_calls=[...])      → wants to call tools
    return {"messages": [response]}
def call_tools(state: AgentState) -> AgentState:
    last_message = state["messages"][-1]   # AIMessage with tool_calls
    results = []
    for tool_call in last_message.tool_calls:
        log_tool_call("agent", tool_call["name"], tool_call["args"])
        tool_fn = tool_map[tool_call["name"]]
        output = tool_fn.invoke(tool_call["args"])
        results.append(ToolMessage(
            content=str(output),
            tool_call_id=tool_call["id"]   # must match the AIMessage tool_call id
        ))
    return {"messages": results}
# ── CONDITIONAL EDGE ───────────────────────────────────────────────────────
def should_call_tools(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "call_tools"   # → route to call_tools node
    return END                # → exit graph, return final state
# ── GRAPH ASSEMBLY ─────────────────────────────────────────────────────────
graph = StateGraph(AgentState)
graph.add_node("call_llm", call_llm)
graph.add_node("call_tools", call_tools)
graph.set_entry_point("call_llm") # first node to execute
graph.add_conditional_edges(   # from call_llm, branch:
    "call_llm",
    should_call_tools          # routing function
    # Return "call_tools" → go to call_tools node
    # Return END          → exit graph
)
graph.add_edge("call_tools", "call_llm") # always go back to LLM after tools
agent = graph.compile() # compile into executable
# ── ENTRY POINT ────────────────────────────────────────────────────────────
def run_agent(question: str, session_id: str, db: Session) -> str:
    # 1. Input guardrail: check BEFORE anything
    allowed = check_input(question)
    log_guardrail(session_id, question, allowed)
    if not allowed:
        return "I can only answer questions relevant to the knowledge base."

    # 2. Load conversation history from PostgreSQL
    start_time = log_agent_start(session_id, question)
    history = load_history(session_id, db)

    # 3. Add the new question to history
    history.append(HumanMessage(content=question))

    # 4. Run the LangGraph agent (ReAct loop)
    final_state = agent.invoke({"messages": history})

    # 5. Extract the final answer (last message)
    answer = final_state["messages"][-1].content

    # 6. Persist session history
    save_history(session_id, question, answer, db)
    log_agent_end(session_id, start_time)
    return answer
1.11 Interview Q&A
Q: What is the ReAct pattern and why does it matter?
ReAct (Reasoning + Acting) is the foundational pattern for LLM agents. The LLM alternates between reasoning about the current state (Thought), taking an action via a tool call (Act), and reading the result (Observe). This loop continues until the LLM determines it has enough information to produce a final answer. It matters because it enables dynamic, multi-step problem solving that static chains cannot do — the LLM can retrieve information, read the result, decide it needs more, retrieve again with a refined query, and so on.
Q: Why did you use LangGraph instead of LangChain's AgentExecutor?
AgentExecutor runs the loop as a black box: intermediate steps are hard to inspect or control, you can't pause execution, you can't inject custom logic between steps, and adding custom state fields is awkward. LangGraph makes the agent an explicit directed graph: every node, every edge, every conditional branch is visible and controllable. This means I can add guardrails between steps, implement human-in-the-loop interrupts, add checkpointing for fault tolerance, or route to different subgraphs based on state. For a production system at PepsiCo's scale, that control and visibility is non-negotiable.
Q: When would you use a chain instead of an agent?
When the steps are known upfront and fixed — a chain is simpler, faster, cheaper, and more predictable. The overhead of an agent (iterative LLM calls, tool execution loop) is only justified when you need: dynamic tool selection based on the question, iterative retrieval (retrieve → reason → retrieve again), or the ability to recover from tool failures with a different strategy. A simple Q&A over a single document collection is a chain. A system that might query documents, SQL, or external APIs depending on the question is an agent.
Q: How does your agent know when to stop?
The should_call_tools function checks the last message in the state. If the LLM's response contains tool_calls, the graph routes to the call_tools node and then loops back to call_llm. If the response is a plain-text AIMessage with no tool calls, the function returns END and execution stops. The LLM stops requesting tools once it judges it has sufficient information: tool-calling models are trained to emit tool calls only when needed and to emit a final text answer once the question is answerable. That judgment is not guaranteed, so LangGraph's configurable recursion_limit acts as a backstop against runaway loops.
Q: What is bind_tools() and what does it actually do?
llm.bind_tools(tools) registers the JSON schema of each tool with the LLM. At call time, OpenAI receives the tool schemas alongside the messages. The model is then capable of producing structured tool_calls in its response (a JSON object with the function name and arguments) instead of plain text. The LLM never directly executes Python functions; it emits a structured request, and the call_tools node in the graph dispatches to the actual Python function and feeds the result back as a ToolMessage.
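A rough sketch of the two halves of that answer: turning a Python function into a tool schema, and dispatching a structured request. `to_tool_schema` and `dispatch` are hypothetical helpers written for illustration, not LangChain's implementation; the outer `{"type": "function", "function": {...}}` shape follows OpenAI's tool format, and the type mapping is deliberately minimal.

```python
import inspect

def rag_search(query: str) -> str:
    """Search the knowledge base for relevant information about the query."""
    return f"results for {query}"

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_tool_schema(fn) -> dict:
    """Build an OpenAI-style tool schema from a function's signature and docstring."""
    params = inspect.signature(fn).parameters
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": {
                    name: {"type": PY_TO_JSON.get(p.annotation, "string")}
                    for name, p in params.items()
                },
                "required": [
                    name for name, p in params.items()
                    if p.default is inspect.Parameter.empty
                ],
            },
        },
    }

def dispatch(tool_call: dict, registry: dict) -> str:
    # The model only *requests* the call; this code performs it.
    return registry[tool_call["name"]](**tool_call["args"])

schema = to_tool_schema(rag_search)
result = dispatch({"name": "rag_search", "args": {"query": "refunds"}},
                  {"rag_search": rag_search})
```

The schema is all the model ever sees of the tool, which is why the docstring and parameter names matter so much for tool selection.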
Q: What happens if a tool call fails mid-agent execution?
In the current implementation, an exception in tool_fn.invoke() would propagate up and crash the agent invocation. In production, I'd wrap each tool call in a try/except and return an error ToolMessage (e.g., "Tool failed: timeout"). The LLM then reads this failure as an observation and can reason about it: retry with a different query, call a different tool, or return a graceful answer explaining that it couldn't retrieve the information. LangGraph also supports retry policies attached to individual nodes when they are added to the graph.
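A minimal sketch of that try/except wrapper, kept framework-free. `safe_tool_call` and `flaky_search` are hypothetical, and a plain dict stands in for ToolMessage(content=..., tool_call_id=...); in the real node the same pattern would wrap tool_fn.invoke().

```python
from typing import Callable

def safe_tool_call(tool_fn: Callable[..., str], args: dict, tool_call_id: str) -> dict:
    """Run a tool and always return an observation, even on failure."""
    try:
        content = str(tool_fn(**args))
    except Exception as exc:  # broad on purpose: any failure becomes an observation
        content = f"Tool failed: {type(exc).__name__}: {exc}"
    # Stand-in for a ToolMessage; the id must match the requesting tool call.
    return {"role": "tool", "content": content, "tool_call_id": tool_call_id}

def flaky_search(query: str) -> str:
    # Simulates a vector store timeout.
    raise TimeoutError("vector store timed out")

def working_search(query: str) -> str:
    return f"results for {query}"

failed = safe_tool_call(flaky_search, {"query": "refunds"}, "call_1")
succeeded = safe_tool_call(working_search, {"query": "refunds"}, "call_2")
```

Returning the error as content keeps the message sequence valid (every tool call gets a matching result) and lets the model decide how to recover.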
1.12 Key One-Liners to Memorize
"A chain is a developer-defined path. An agent is an LLM-defined path."
"ReAct = Thought → Action → Observation → repeat until done."
"LangGraph makes the agent loop explicit. AgentExecutor hides it."
"bind_tools() doesn't let the LLM call Python. It lets the LLM REQUEST a call —
the framework executes it."
"Messages accumulate across the loop — every tool call and result becomes context
for the next reasoning step."
"Use a chain when steps are known. Use an agent when the LLM needs to decide the steps."
1.13 Mental Model Summary
CHAIN (fixed flow):
User Input → [Step 1] → [Step 2] → [Step 3] → Answer
Developer decides every step at design time.
AGENT (dynamic flow):
User Input
│
▼
[LLM decides: do I need a tool?]
│
├── YES → [Execute Tool] → [LLM reads result] → loop
│
└── NO → [Final Answer]
LANGGRAPH AGENT (your project):
StateGraph with explicit nodes (call_llm, call_tools)
and conditional routing (should_call_tools)
↓
Full visibility, full control, production-ready
↓
run_agent():
guardrails → load history → ReAct loop → save history → return
Next: Chapter 2 — LangGraph Fundamentals: StateGraph, Nodes & Edges