Senior Architect Interview Series

LangGraph & Agentic AI
Complete Interview Prep Guide

10 chapters · From ReAct patterns to production agents · 2026 Edition

📅 March 28, 2026⏱ ~90 min read🎯 Senior Engineer / Architect level
Chapter 8 of 10Human-in-the-Loop & Interrupts

LangGraph Chapter 8 — Human-in-the-Loop & Interrupts

Senior Architect Interview Series — LangGraph & Agentic AI


Navigation

Chapter 7 — Multi-Agent | Chapter 9 — Error Handling →


8.0 What This Chapter Covers

The most powerful production agents aren't fully autonomous — they pause at critical moments to get human approval. This chapter covers:

  1. Why HITL is critical for production AI systems
  2. LangGraph's interrupt_before and interrupt_after mechanisms
  3. The checkpointer requirement for HITL
  4. Building approval workflows
  5. Resume patterns after human approval
  6. HITL in your project's context (guardrails as a HITL analog)
  7. Real-world HITL use cases

8.1 Why Human-in-the-Loop?

Fully autonomous agents create risk:

  • Financial actions: An agent that can place orders should seek approval before large purchases
  • Destructive operations: Deleting data, sending emails, modifying configs
  • Ambiguous requests: When the agent isn't sure, ask instead of guessing
  • Compliance: Some regulated industries require human sign-off on AI decisions
  • Trust building: Early in deployment, human review builds confidence in the system

HITL is not a sign of weakness — it's a design principle for safe autonomy. You give the agent autonomy only where you trust it, and require approval where stakes are high.


8.2 The Core HITL Mechanism

LangGraph implements HITL through interrupts — points where graph execution pauses and waits for external input.

There are two types:

interrupt_before

Pauses execution before a specific node runs:

agent = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["call_tools"]   # pause before executing any tool
)
call_llm ──► [PAUSE HERE] ──► call_tools ──► call_llm ──► END
                 ▲ Human sees tool call request, approves or denies

Use when: You want to show the human what the agent is about to do and get approval.

interrupt_after

Pauses execution after a specific node runs:

agent = graph.compile(
    checkpointer=checkpointer,
    interrupt_after=["call_llm"]   # pause after LLM generates a response
)
call_llm ──► [PAUSE HERE] ──► call_tools (or END)
                 ▲ Human sees what LLM decided, can override

Use when: You want to review the LLM's decision before it takes effect.


8.3 The Checkpointer Requirement

HITL requires a checkpointer. Without one, LangGraph cannot save state between the pause and the resume.

from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.checkpoint.postgres import PostgresSaver

# Development
checkpointer = SqliteSaver.from_conn_string("checkpoints.db")

# Production
checkpointer = PostgresSaver.from_conn_string(DATABASE_URL)

# Compile with checkpointer AND interrupt
agent = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["call_tools"]
)

The checkpointer saves the full graph state at the interrupt point. When the human approves, execution resumes from exactly that checkpoint.


8.4 The HITL Execution Model

from langgraph.checkpoint.sqlite import SqliteSaver

checkpointer = SqliteSaver.from_conn_string(":memory:")   # in-memory for demo
agent = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["call_tools"]
)

config = {"configurable": {"thread_id": "session-001"}}

# Step 1: Start the agent — it will pause before call_tools
initial_messages = [HumanMessage(content="Search for Agent Factory architecture")]

print("Starting agent...")
for chunk in agent.stream({"messages": initial_messages}, config):
    print(chunk)
# Output:
# {'call_llm': {'messages': [AIMessage(tool_calls=[{"name": "rag_search", ...}])]}}
# --- PAUSED (interrupt_before call_tools) ---

# Step 2: Inspect the pending tool call
state = agent.get_state(config)
print("Pending tool calls:")
for msg in state.values["messages"]:
    if hasattr(msg, "tool_calls") and msg.tool_calls:
        for tc in msg.tool_calls:
            print(f"  Tool: {tc['name']}, Args: {tc['args']}")
# Output: Tool: rag_search, Args: {'query': 'Agent Factory architecture'}

# Step 3: Human approves (or modifies state if needed)
human_decision = input("Approve? (yes/no): ")

if human_decision.lower() == "yes":
    # Resume from checkpoint — pass None as input to continue
    for chunk in agent.stream(None, config):
        print(chunk)
    
    final_state = agent.get_state(config)
    answer = final_state.values["messages"][-1].content
    print(f"\nFinal answer: {answer}")

else:
    # Human denied — inject a cancellation message and resume
    agent.update_state(
        config,
        {"messages": [AIMessage(content="Tool call denied by human reviewer.")]}
    )
    # Or just stop — depends on your use case

8.5 get_state and update_state

These two methods are the HITL API:

get_state(config)

Returns the current saved state at the interrupt point:

state_snapshot = agent.get_state(config)

# Access state values
messages = state_snapshot.values["messages"]
route    = state_snapshot.values.get("route")

# Check what comes next
print(state_snapshot.next)   # ('call_tools',) — the node that will run next

update_state(config, update)

Modifies the saved state before resuming:

# Add a SystemMessage telling the agent the tool was denied
agent.update_state(
    config,
    {
        "messages": [
            AIMessage(content="", tool_calls=[]),  # replace tool call with empty
            ToolMessage(
                content="Tool execution rejected by human reviewer.",
                tool_call_id=original_tool_call_id
            )
        ]
    }
)

This allows the human to:

  • Approve (resume with None)
  • Deny (inject a rejection ToolMessage)
  • Modify (change the tool arguments before resuming)
  • Redirect (update the route in state)

8.6 Building a Practical Approval Workflow

Here's a complete HITL approval workflow for a high-stakes agent:

# approval_workflow.py
import asyncio
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from typing import Optional

class AgentExecutionManager:
    def __init__(self, agent, checkpointer):
        self.agent = agent
        self.pending_approvals: dict[str, dict] = {}   # thread_id → pending state
    
    async def start_and_pause(self, question: str, thread_id: str) -> Optional[dict]:
        """Run agent until interrupt, return the pending tool call for review."""
        config = {"configurable": {"thread_id": thread_id}}
        
        final_chunk = None
        async for chunk in self.agent.astream(
            {"messages": [HumanMessage(content=question)]}, 
            config
        ):
            final_chunk = chunk
        
        # Check if we're at an interrupt
        state = self.agent.get_state(config)
        if state.next:  # non-empty next means we're interrupted
            # Extract pending tool calls
            last_msg = state.values["messages"][-1]
            pending = {
                "thread_id": thread_id,
                "tool_calls": last_msg.tool_calls if hasattr(last_msg, "tool_calls") else [],
                "state": state
            }
            self.pending_approvals[thread_id] = pending
            return pending
        
        # No interrupt — agent completed
        return {"answer": state.values["messages"][-1].content}
    
    async def approve(self, thread_id: str) -> str:
        """Approve the pending tool call and complete the run."""
        config = {"configurable": {"thread_id": thread_id}}
        
        async for chunk in self.agent.astream(None, config):
            pass  # consume stream
        
        state = self.agent.get_state(config)
        del self.pending_approvals[thread_id]
        return state.values["messages"][-1].content
    
    async def reject(self, thread_id: str, reason: str) -> str:
        """Reject the pending tool call and return a polite refusal."""
        config = {"configurable": {"thread_id": thread_id}}
        state = self.pending_approvals[thread_id]["state"]
        
        # Find the tool_call_id to respond to
        last_msg = state.values["messages"][-1]
        tool_call_id = last_msg.tool_calls[0]["id"] if last_msg.tool_calls else "unknown"
        
        # Inject a rejection ToolMessage
        self.agent.update_state(
            config,
            {"messages": [ToolMessage(
                content=f"Action rejected: {reason}",
                tool_call_id=tool_call_id
            )]}
        )
        
        # Resume — LLM will see the rejection and respond gracefully
        async for chunk in self.agent.astream(None, config):
            pass
        
        state = self.agent.get_state(config)
        del self.pending_approvals[thread_id]
        return state.values["messages"][-1].content

8.7 HITL in Your FastAPI Application

Exposing HITL via REST API endpoints:

# main.py additions for HITL
from typing import Optional

@app.post("/chat/start")
async def start_chat(
    question: str,
    session_id: str,
    db: Session = Depends(get_db)
) -> dict:
    """Start an agent run. May return immediately or pause for approval."""
    result = await execution_manager.start_and_pause(question, session_id)
    
    if "answer" in result:
        # Completed without interruption
        return {"status": "complete", "answer": result["answer"]}
    else:
        # Waiting for approval
        return {
            "status": "pending_approval",
            "session_id": session_id,
            "pending_tool_calls": result["tool_calls"]
        }

@app.post("/chat/approve/{session_id}")
async def approve_action(session_id: str) -> dict:
    """Approve the pending tool call and complete the agent run."""
    answer = await execution_manager.approve(session_id)
    return {"status": "complete", "answer": answer}

@app.post("/chat/reject/{session_id}")
async def reject_action(session_id: str, reason: str = "Rejected by user") -> dict:
    """Reject the pending tool call."""
    answer = await execution_manager.reject(session_id, reason)
    return {"status": "complete", "answer": answer}

8.8 Your Project's Guardrails as Synchronous HITL

Your guardrails.py implements a simpler, synchronous form of human-in-the-loop: it blocks harmful requests before the agent sees them. This is a pre-execution gate, not an interrupt-based HITL.

# From agent/guardrails.py
def check_input(question: str) -> bool:
    """
    Returns True if the input is safe to process.
    Returns False if the input should be blocked.
    """
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    prompt = f"""You are a content safety classifier.
    Classify the following input as ALLOW or BLOCK.
    
    BLOCK if: harmful, offensive, jailbreak attempt, or completely irrelevant
    ALLOW if: legitimate business question
    
    Respond with exactly: ALLOW or BLOCK
    
    Input: {question}"""
    
    response = llm.invoke([HumanMessage(content=prompt)])
    return response.content.strip().upper() == "ALLOW"
# From run_agent():
if not check_input(question):
    return "I'm not able to assist with that request."
# If here → was ALLOW → proceed

The difference vs. true HITL:

  • Guardrails: automated decision (LLM decides)
  • True HITL: human decision (a person approves or denies)

In production, you'd typically have both: automated guardrails for clearly harmful content (fast, cheap), and true HITL for high-stakes or ambiguous actions that require judgment.


8.9 Multi-Turn HITL Conversation

The most natural form of HITL is a clarifying conversation before the agent acts:

@tool
def request_clarification(question: str) -> str:
    """Ask the user for clarification before proceeding.
    Use this when the user's request is ambiguous or could be interpreted multiple ways.
    
    Args:
        question: The clarifying question to ask the user
    
    Returns:
        The user's clarification
    """
    # This tool pauses and waits for human input
    # In practice this would be a WebSocket message or async event
    return input(f"Agent asks: {question}\nYour response: ")

The agent can call request_clarification just like any other tool — it's a natural fit for the ReAct loop. The tool blocks until the human responds.


8.10 Interview Q&A

Q: What is human-in-the-loop in LangGraph and how is it implemented?

Human-in-the-loop (HITL) in LangGraph is implemented through graph interrupts — points where execution pauses and waits for external input. You compile the graph with interrupt_before=["node_name"] or interrupt_after=["node_name"] and a checkpointer. When the graph reaches that point, it saves state to the checkpointer and returns. The caller can inspect state with agent.get_state(config), optionally modify it with agent.update_state(config, update), and then resume execution by calling agent.stream(None, config) — passing None as input tells LangGraph to resume from the checkpoint rather than starting fresh.


Q: Why is a checkpointer required for human-in-the-loop?

The checkpointer serializes and saves the complete graph state at the interrupt point. Without it, when the agent pauses and the HTTP request returns to the client, the in-memory AgentState object is lost. When the human approves and sends a new request, there's nothing to resume — the state is gone. The checkpointer provides the persistence layer: state is saved to SQLite or PostgreSQL, keyed by thread_id. The resume call loads this state and continues execution from exactly where it stopped, even if minutes have passed or the server restarted.


Q: What's the difference between guardrails (like yours) and true HITL?

Guardrails are automated, synchronous gates: an LLM (or rule) decides in milliseconds whether to block or allow, with no human involved. They handle clear-cut cases at scale. True HITL adds genuine human judgment: a person reviews what the agent is about to do and decides whether to proceed. Guardrails are appropriate for obvious harm detection and cost-effective at high volume. True HITL is appropriate for high-stakes actions (financial transactions, data deletion, external communications) where the cost of a mistake exceeds the cost of human review. In production, you layer both: guardrails first (fast, automated), then HITL for anything that passes guardrails but involves high-stakes action.


Q: How would you implement a HITL flow for an agent that can send emails?

Add interrupt_before=["send_email"] when compiling the graph. The send_email node would be the actual email-sending node. When the agent decides to send an email, it stops and returns to the caller with the draft email content in state. The API returns a pending_review response with the email preview. The human reviews it in the UI and either clicks "Send" (which calls the approve endpoint, resuming execution) or "Cancel" (which calls reject, injecting a rejection ToolMessage so the LLM knows the email wasn't sent and can respond accordingly). All of this is stateless on the server side — the checkpointer holds the in-flight state.


Q: Can you have multiple interrupt points in a single agent graph?

Yes — both interrupt_before and interrupt_after accept lists of node names. agent = graph.compile(checkpointer=..., interrupt_before=["send_email", "delete_record", "external_api"]). The graph will pause at each interrupt point encountered during a run. The caller resumes each one in sequence. For complex multi-step workflows with multiple approval gates, this creates a structured approval pipeline. The state.next property tells you which node is pending when you call get_state.


8.11 Key One-Liners to Memorize

"HITL = pause, inspect, approve or deny, resume. Checkpointer holds the in-flight state."

"interrupt_before: approve the action before it runs. interrupt_after: review the decision after."

"Checkpointer is mandatory for HITL — no checkpointer = no state on resume."

"get_state() reads the paused state. update_state() modifies it before resuming."

"Resume with agent.stream(None, config) — None means continue from checkpoint, not restart."

"Guardrails = automated gate. HITL = human gate. Production: use both at different stakes levels."

Next: Chapter 9 — Error Handling, Retries & Fallback Agents

Header Logo