# langgraph-python

Low-level orchestration for stateful AI agents

```shell
$ npx docs2skills add langgraph-python
```

## What this skill does
LangGraph is a low-level orchestration framework for building stateful, long-running AI agents that can persist through failures and resume execution. Unlike high-level agent frameworks that abstract away control flow, LangGraph gives you direct control over agent orchestration through graph-based workflow definitions.
The framework provides durable execution, meaning agents can survive server restarts, network failures, or crashes by maintaining state and resuming from the last checkpoint. It enables human-in-the-loop workflows where humans can inspect, modify, or approve agent actions at any point in the execution flow.
LangGraph is inspired by Google's Pregel and Apache Beam, built specifically for agent use cases. It integrates seamlessly with LangChain components but can be used standalone with any LLM or tool integration. Companies like Klarna, Replit, and Elastic use it for production agent systems that require reliability and human oversight.
## Prerequisites

- Python 3.9+
- Understanding of state machines and graph structures
- Basic familiarity with LLMs and tool-calling concepts
- Optional: LangChain components for pre-built integrations
- Optional: LangSmith account for debugging and monitoring
## Quick start

```shell
pip install langgraph
```

```python
from langgraph.graph import StateGraph, MessagesState, START, END

def mock_llm(state: MessagesState):
    return {"messages": [{"role": "ai", "content": "hello world"}]}

graph = StateGraph(MessagesState)
graph.add_node(mock_llm)
graph.add_edge(START, "mock_llm")
graph.add_edge("mock_llm", END)
graph = graph.compile()

result = graph.invoke({"messages": [{"role": "user", "content": "hi!"}]})
# MessagesState coerces dicts into message objects, so use attribute access:
print(result["messages"][-1].content)  # "hello world"
```
## Core concepts

**State Management:** LangGraph centers on state objects that flow between nodes. `MessagesState` is the most common, tracking conversation history, but you can define custom state schemas for any data structure.

**Graph Architecture:** Workflows are defined as directed graphs in which nodes are functions that receive and update state, and edges determine execution flow. The graph compiles into an executable workflow engine.

**Checkpointing:** Built-in persistence automatically saves state after each node executes. When a failure occurs, the graph resumes from the last successful checkpoint rather than restarting.

**Interrupts:** Human-in-the-loop functionality allows pausing execution at any node, letting humans inspect state, make modifications, or provide approvals before continuing.
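A custom state schema is typically a `TypedDict` whose fields can carry a reducer via `Annotated`; fields with a reducer are *combined* on update, fields without one are *overwritten*. The sketch below mimics that merge rule in plain Python (the `AgentState` fields and `merge_update` helper are illustrative, not LangGraph APIs):

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints

class AgentState(TypedDict):
    # Annotated with a reducer: node updates are combined (list concat here)
    notes: Annotated[list, add]
    # No reducer: each node update overwrites the previous value
    confidence: float

def merge_update(state: dict, update: dict) -> dict:
    """Mimic how a graph folds a node's partial return into the state."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        meta = getattr(hints.get(key), "__metadata__", None)
        if meta:                      # a reducer was declared for this field
            merged[key] = meta[0](state[key], value)
        else:                         # plain field: last write wins
            merged[key] = value
    return merged

state = {"notes": ["step 1"], "confidence": 0.2}
state = merge_update(state, {"notes": ["step 2"], "confidence": 0.9})
print(state)  # {'notes': ['step 1', 'step 2'], 'confidence': 0.9}
```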
## Key API surface

| Function | Purpose |
|---|---|
| `StateGraph(schema)` | Create a graph with a state schema |
| `add_node(name, function)` | Add a processing node |
| `add_edge(start_key, end_key)` | Add an unconditional edge |
| `add_conditional_edges(source, condition, mapping)` | Add conditional routing |
| `compile(checkpointer=None, interrupt_before=[], interrupt_after=[])` | Create an executable graph |
| `invoke(input, config=None)` | Run the graph once |
| `stream(input, config=None)` | Stream execution steps |
| `get_state(config)` | Retrieve the current state |
| `update_state(config, values)` | Modify state manually |
## Common patterns

**Multi-step reasoning workflow:**

```python
def researcher(state):
    # Research step
    return {"messages": [...], "research_data": data}

def analyzer(state):
    # Analysis step
    return {"messages": [...], "analysis": results}

graph.add_node("research", researcher)
graph.add_node("analyze", analyzer)
graph.add_edge("research", "analyze")
```

**Conditional branching:**

```python
def route_decision(state):
    if state["confidence"] > 0.8:
        return "finalize"
    return "human_review"

graph.add_conditional_edges(
    "analyzer",
    route_decision,
    {"finalize": "end_node", "human_review": "review_node"},
)
```

**Human-in-the-loop approval:**

```python
graph = graph.compile(interrupt_before=["final_action"])
# Execution pauses before final_action for human review
```
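The interrupt mechanics can be sketched without LangGraph: execution stops before the flagged node, the current state is handed to a human, and the (possibly edited) state is fed back in to resume. All names below are illustrative, using a plain generator rather than the real runtime:

```python
def run_with_interrupt(nodes, state, interrupt_before):
    """Run (name, fn) nodes in order, pausing before any flagged node."""
    for name, fn in nodes:
        if name in interrupt_before:
            # Hand control (and the current state) to the caller; resume
            # with whatever state they .send() back in, edits included.
            state = (yield ("paused", name, state)) or state
        state = {**state, **fn(state)}
    yield ("done", None, state)

nodes = [
    ("draft", lambda s: {"text": "draft reply"}),
    ("final_action", lambda s: {"sent": True}),
]
runner = run_with_interrupt(nodes, {"sent": False}, {"final_action"})
status, name, state = next(runner)           # pauses before "final_action"
edited = {**state, "text": "edited reply"}   # human reviews and edits
status, name, state = runner.send(edited)    # resume with the approved state
print(state)  # {'sent': True, 'text': 'edited reply'}
```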
## Configuration

```python
# Checkpointing for persistence (the SQLite saver ships in the
# langgraph-checkpoint-sqlite package)
from langgraph.checkpoint.sqlite import SqliteSaver

checkpointer = SqliteSaver.from_conn_string("checkpoints.db")

# Compile with configuration
graph = graph.compile(
    checkpointer=checkpointer,          # Enable persistence
    interrupt_before=["human_step"],    # Pause before these nodes
    interrupt_after=["critical_step"],  # Pause after these nodes
)

# Thread configuration for state isolation
config = {"configurable": {"thread_id": "user123"}}
```
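Under the hood, `thread_id` simply keys checkpoints so that each conversation resumes from its own saved state. A toy store illustrating that isolation (this is not the real checkpointer interface):

```python
class MemoryCheckpointer:
    """Toy checkpoint store keyed by thread_id (illustrative only)."""

    def __init__(self):
        self._checkpoints = {}

    def put(self, config, state):
        self._checkpoints[config["configurable"]["thread_id"]] = dict(state)

    def get(self, config):
        return self._checkpoints.get(config["configurable"]["thread_id"], {})

saver = MemoryCheckpointer()
saver.put({"configurable": {"thread_id": "user123"}}, {"messages": ["hi"]})
saver.put({"configurable": {"thread_id": "user456"}}, {"messages": ["hola"]})
# Each thread resumes from its own state:
print(saver.get({"configurable": {"thread_id": "user123"}}))  # {'messages': ['hi']}
```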
## Best practices

**Design idempotent nodes:** Each node function should be safe to retry, since checkpointing may re-execute nodes after failures.

**Use thread IDs consistently:** Always pass the same `thread_id` in the config for related conversations to maintain state continuity.

**Structure state efficiently:** Keep state objects focused and avoid storing large data that doesn't need to persist across checkpoints.

**Handle partial failures gracefully:** Wrap node bodies in try/except blocks and use state to track error conditions rather than raising exceptions.

**Leverage streaming for long operations:** Use `stream()` instead of `invoke()` for long-running workflows to provide intermediate feedback.
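A node that follows the retry-safety and error-tracking advice above might look like this sketch; `unreliable_fetch` and the state field names are hypothetical stand-ins:

```python
def unreliable_fetch(url):
    # Stand-in for a flaky external call (hypothetical helper).
    raise TimeoutError(f"timed out fetching {url}")

def fetch_node(state):
    """Idempotent, error-tracking node."""
    if state.get("fetched"):
        # Work already done: a re-execution after checkpoint replay is a no-op.
        return {}
    try:
        data = unreliable_fetch(state["url"])
        return {"fetched": True, "data": data}
    except Exception as exc:
        # Record the failure in state instead of raising, so a conditional
        # edge can route the workflow to a retry or human-review branch.
        return {"error": str(exc)}

print(fetch_node({"fetched": True}))  # {} (retry is a safe no-op)
```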
## Gotchas and common mistakes

**State mutations aren't automatic:** Nodes must return state updates explicitly; modifying the input state object in place won't persist changes.

**Checkpointer required for persistence:** Without a checkpointer, state is lost between invocations, and in-memory checkpointers don't survive process restarts.

**Thread ID uniqueness matters:** Reusing a `thread_id` across different conversations mixes their state. Use a unique identifier per conversation thread.

**Interrupt configuration is compile-time:** Interrupt points are specified when calling `compile()`, not at runtime during `invoke()`.

**Conditional edges need exhaustive mappings:** Every possible return value of a condition function must map to a valid node name, or you'll get a runtime error.

**Large state objects impact performance:** Checkpointing serializes the entire state. Keep heavy data in external storage and reference it by ID.

**Node names must be strings:** When you name a node explicitly, use a string; function objects aren't valid node identifiers in edges or interrupt lists.

**START and END are reserved:** Don't use `"START"` or `"END"` as custom node names; they're reserved for the graph's entry and exit points.
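The first gotcha above, in code form. This is a generic sketch of the right and wrong node shapes, not LangGraph-specific API:

```python
def bad_node(state):
    # Mutates the input in place and returns no update: the graph sees an
    # empty delta, so the appended message is not recorded as a state change.
    state["messages"].append({"role": "ai", "content": "reply"})
    return {}

def good_node(state):
    # Return only the delta; the graph's reducer merges it into state.
    return {"messages": [{"role": "ai", "content": "reply"}]}
```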