# langgraph-python

Low-level orchestration for stateful AI agents

```shell
$ npx docs2skills add langgraph-python
```

## What this skill does
LangGraph is a low-level orchestration framework for building stateful, long-running AI agents that can persist through failures and resume execution. Unlike high-level agent frameworks that abstract away control flow, LangGraph gives you direct control over agent orchestration through graph-based workflow definitions.
The framework provides durable execution, meaning agents can survive server restarts, network failures, or crashes by maintaining state and resuming from the last checkpoint. It enables human-in-the-loop workflows where humans can inspect, modify, or approve agent actions at any point in the execution flow.
LangGraph is inspired by Google's Pregel and Apache Beam, built specifically for agent use cases. It integrates seamlessly with LangChain components but can be used standalone with any LLM or tool integration. Companies like Klarna, Replit, and Elastic use it for production agent systems that require reliability and human oversight.
## Prerequisites

- Python 3.9+
- Understanding of state machines and graph structures
- Basic familiarity with LLMs and tool-calling concepts
- Optional: LangChain components for pre-built integrations
- Optional: LangSmith account for debugging and monitoring
## Quick start

```shell
pip install langgraph
```

```python
from langgraph.graph import StateGraph, MessagesState, START, END

def mock_llm(state: MessagesState):
    return {"messages": [{"role": "ai", "content": "hello world"}]}

graph = StateGraph(MessagesState)
graph.add_node(mock_llm)
graph.add_edge(START, "mock_llm")
graph.add_edge("mock_llm", END)
graph = graph.compile()

result = graph.invoke({"messages": [{"role": "user", "content": "hi!"}]})
# MessagesState coerces dicts into message objects, so use attribute access:
print(result["messages"][-1].content)  # "hello world"
```
## Core concepts

**State Management:** LangGraph centers on state objects that flow between nodes. `MessagesState` is the most common, tracking conversation history, but you can define custom state schemas for any data structure.

**Graph Architecture:** Workflows are defined as directed graphs in which nodes are functions that receive and update state, and edges determine execution flow. The graph compiles into an executable workflow engine.

**Checkpointing:** Built-in persistence automatically saves state after each node executes. When a failure occurs, the graph resumes from the last successful checkpoint rather than restarting.

**Interrupts:** Human-in-the-loop functionality allows pausing execution at any node, letting humans inspect state, make modifications, or provide approvals before continuing.
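A custom state schema is typically a `TypedDict` whose fields can carry a reducer via `Annotated`; fields with a reducer are *combined* on update, fields without one are *overwritten*. The sketch below mimics that merge rule in plain Python (the `AgentState` fields and `merge_update` helper are illustrative, not LangGraph APIs):

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints

class AgentState(TypedDict):
    # Annotated with a reducer: node updates are combined (list concat here)
    notes: Annotated[list, add]
    # No reducer: each node update overwrites the previous value
    confidence: float

def merge_update(state: dict, update: dict) -> dict:
    """Mimic how a graph folds a node's partial return into the state."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        meta = getattr(hints.get(key), "__metadata__", None)
        if meta:                      # a reducer was declared for this field
            merged[key] = meta[0](state[key], value)
        else:                         # plain field: last write wins
            merged[key] = value
    return merged

state = {"notes": ["step 1"], "confidence": 0.2}
state = merge_update(state, {"notes": ["step 2"], "confidence": 0.9})
print(state)  # {'notes': ['step 1', 'step 2'], 'confidence': 0.9}
```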
## Key API surface

| Function | Purpose |
|---|---|
| `StateGraph(schema)` | Create a graph with a state schema |
| `add_node(name, function)` | Add a processing node |
| `add_edge(start_key, end_key)` | Add an unconditional edge |
| `add_conditional_edges(source, condition, mapping)` | Add conditional routing |
| `compile(checkpointer=None, interrupt_before=[], interrupt_after=[])` | Create an executable graph |
| `invoke(input, config=None)` | Run the graph once |
| `stream(input, config=None)` | Stream execution steps |
| `get_state(config)` | Retrieve the current state |
| `update_state(config, values)` | Modify state manually |
## Common patterns

**Multi-step reasoning workflow:**

```python
def researcher(state):
    # Research step
    return {"messages": [...], "research_data": data}

def analyzer(state):
    # Analysis step
    return {"messages": [...], "analysis": results}

graph.add_node("research", researcher)
graph.add_node("analyze", analyzer)
graph.add_edge("research", "analyze")
```

**Conditional branching:**

```python
def route_decision(state):
    if state["confidence"] > 0.8:
        return "finalize"
    return "human_review"

graph.add_conditional_edges(
    "analyzer",
    route_decision,
    {"finalize": "end_node", "human_review": "review_node"},
)
```

**Human-in-the-loop approval:**

```python
graph = graph.compile(interrupt_before=["final_action"])
# Execution pauses before final_action for human review
```
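The interrupt mechanics can be sketched without LangGraph: execution stops before the flagged node, the current state is handed to a human, and the (possibly edited) state is fed back in to resume. All names below are illustrative, using a plain generator rather than the real runtime:

```python
def run_with_interrupt(nodes, state, interrupt_before):
    """Run (name, fn) nodes in order, pausing before any flagged node."""
    for name, fn in nodes:
        if name in interrupt_before:
            # Hand control (and the current state) to the caller; resume
            # with whatever state they .send() back in, edits included.
            state = (yield ("paused", name, state)) or state
        state = {**state, **fn(state)}
    yield ("done", None, state)

nodes = [
    ("draft", lambda s: {"text": "draft reply"}),
    ("final_action", lambda s: {"sent": True}),
]
runner = run_with_interrupt(nodes, {"sent": False}, {"final_action"})
status, name, state = next(runner)           # pauses before "final_action"
edited = {**state, "text": "edited reply"}   # human reviews and edits
status, name, state = runner.send(edited)    # resume with the approved state
print(state)  # {'sent': True, 'text': 'edited reply'}
```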
## Configuration

```python
# Checkpointing for persistence (the SQLite saver ships in the
# langgraph-checkpoint-sqlite package)
from langgraph.checkpoint.sqlite import SqliteSaver

checkpointer = SqliteSaver.from_conn_string("checkpoints.db")

# Compile with configuration
graph = graph.compile(
    checkpointer=checkpointer,          # Enable persistence
    interrupt_before=["human_step"],    # Pause before these nodes
    interrupt_after=["critical_step"],  # Pause after these nodes
)

# Thread configuration for state isolation
config = {"configurable": {"thread_id": "user123"}}
```
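Under the hood, `thread_id` simply keys checkpoints so that each conversation resumes from its own saved state. A toy store illustrating that isolation (this is not the real checkpointer interface):

```python
class MemoryCheckpointer:
    """Toy checkpoint store keyed by thread_id (illustrative only)."""

    def __init__(self):
        self._checkpoints = {}

    def put(self, config, state):
        self._checkpoints[config["configurable"]["thread_id"]] = dict(state)

    def get(self, config):
        return self._checkpoints.get(config["configurable"]["thread_id"], {})

saver = MemoryCheckpointer()
saver.put({"configurable": {"thread_id": "user123"}}, {"messages": ["hi"]})
saver.put({"configurable": {"thread_id": "user456"}}, {"messages": ["hola"]})
# Each thread resumes from its own state:
print(saver.get({"configurable": {"thread_id": "user123"}}))  # {'messages': ['hi']}
```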
## Best practices

**Design idempotent nodes:** Each node function should be safe to retry, since checkpointing may re-execute nodes after failures.

**Use thread IDs consistently:** Always pass the same `thread_id` in the config for related conversations to maintain state continuity.

**Structure state efficiently:** Keep state objects focused and avoid storing large data that doesn't need to persist across checkpoints.

**Handle partial failures gracefully:** Wrap node bodies in try/except blocks and use state to track error conditions rather than raising exceptions.

**Leverage streaming for long operations:** Use `stream()` instead of `invoke()` for long-running workflows to provide intermediate feedback.
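A node that follows the retry-safety and error-tracking advice above might look like this sketch; `unreliable_fetch` and the state field names are hypothetical stand-ins:

```python
def unreliable_fetch(url):
    # Stand-in for a flaky external call (hypothetical helper).
    raise TimeoutError(f"timed out fetching {url}")

def fetch_node(state):
    """Idempotent, error-tracking node."""
    if state.get("fetched"):
        # Work already done: a re-execution after checkpoint replay is a no-op.
        return {}
    try:
        data = unreliable_fetch(state["url"])
        return {"fetched": True, "data": data}
    except Exception as exc:
        # Record the failure in state instead of raising, so a conditional
        # edge can route the workflow to a retry or human-review branch.
        return {"error": str(exc)}

print(fetch_node({"fetched": True}))  # {} (retry is a safe no-op)
```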
## Gotchas and common mistakes

**State mutations aren't automatic:** Nodes must return state updates explicitly; modifying the input state object in place won't persist changes.

**Checkpointer required for persistence:** Without a checkpointer, state is lost between invocations, and in-memory checkpointers don't survive process restarts.

**Thread ID uniqueness matters:** Reusing a `thread_id` across different conversations mixes their state. Use a unique identifier per conversation thread.

**Interrupt configuration is compile-time:** Interrupt points are specified when calling `compile()`, not at runtime during `invoke()`.

**Conditional edges need exhaustive mappings:** Every possible return value of a condition function must map to a valid node name, or you'll get a runtime error.

**Large state objects impact performance:** Checkpointing serializes the entire state. Keep heavy data in external storage and reference it by ID.

**Node names must be strings:** When you name a node explicitly, use a string; function objects aren't valid node identifiers in edges or interrupt lists.

**START and END are reserved:** Don't use `"START"` or `"END"` as custom node names; they're reserved for the graph's entry and exit points.
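The first gotcha above, in code form. This is a generic sketch of the right and wrong node shapes, not LangGraph-specific API:

```python
def bad_node(state):
    # Mutates the input in place and returns no update: the graph sees an
    # empty delta, so the appended message is not recorded as a state change.
    state["messages"].append({"role": "ai", "content": "reply"})
    return {}

def good_node(state):
    # Return only the delta; the graph's reducer merges it into state.
    return {"messages": [{"role": "ai", "content": "reply"}]}
```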