Build stateful multi-agent workflows in Java, with graphs, retries, and persistence.
No orchestration code. No glue logic. Just define your agents and run.
Explore a real-world B2B use case built with Spring Agent Flow:
https://huggingface.co/spaces/datallmhub/multi-agent-customer-ops
- Multi-agent orchestration (Triage → Lookup → Policy → Writer)
- Hybrid AI + deterministic business logic
- Typed shared state across agents
- End-to-end decision traceability
You can run the demo locally using the full source code:
https://github.com/datallmhub/multi-agent-customer-ops
ExecutorAgent researcher = ExecutorAgent.builder()
.chatClient(chatClient)
.systemPrompt("Find key facts.")
.build();
ExecutorAgent writer = ExecutorAgent.builder()
.chatClient(chatClient)
.systemPrompt("Write a clear report.")
.build();
CoordinatorAgent coordinator = CoordinatorAgent.builder()
.executors(Map.of("research", researcher, "writing", writer))
.routingStrategy(RoutingStrategy.llmDriven(chatClient))
.build();
AgentResult result = coordinator.execute(
AgentContext.of("Compare Claude 4 and GPT-5"));
System.out.println(result.text());

Output:
=== Multi-Agent Coordination ===
Request: Compare Claude 4 and GPT-5
[router] Routing to: research
[research] Gathering facts...
[router] Routing to: writing
[writing] Generating report...
Result:
Claude 4 excels in reasoning and long-context tasks.
GPT-5 shows stronger tool integration and instruction following.
This is a multi-step, stateful workflow with routing, coordination, and resilience, without writing orchestration code.
If this saves you time, consider starring the repo.
Real-world AI systems are not one LLM call.
They are:
- multi-step
- stateful
- failure-prone
- long-running
Spring AI gives you primitives. spring-agent-flow gives you a runtime.
A coordinator routes tasks across agents, executing a graph with shared state, retries, and checkpoints.
Dynamic routing, minimal setup. A CoordinatorAgent routes to ExecutorAgents, so you focus on the agents, not the plumbing.
CoordinatorAgent coordinator = CoordinatorAgent.builder()
.executors(Map.of(
"research", researchExecutor,
"analysis", analysisExecutor,
"writing", writingExecutor
))
.routingStrategy(RoutingStrategy.llmDriven(chatClient))
.build();
AgentResult result = coordinator.execute(AgentContext.of("..."));

Explicit flows, loops, conditions, full control.
AgentGraph graph = AgentGraph.builder()
.addNode("research", researcher)
.addNode("analyze", analyzer)
.addNode("write", writer)
.addEdge("research", "analyze")
.addEdge(Edge.conditional("analyze",
ctx -> ctx.get(CONFIDENCE).doubleValue() < 0.7,
"research")) // loop back
.addEdge("analyze", "write") // fallback: forward
.errorPolicy(ErrorPolicy.RETRY_ONCE)
.build();
AgentResult result = graph.invoke(AgentContext.of("..."));

Use it if:
- your agent needs multiple LLM calls
- your workflow has branches or loops
- failures (retry, resume, rate limits) matter
- multiple agents must coordinate
Avoid it if:
- you just call `ChatClient` once
| Approach | Limitation |
|---|---|
| Spring AI alone | Low-level primitives only: you write the orchestration |
| Manual `while` loops | Don't scale, retries are hard, state becomes fragile |
| LangChain-style flows | Limited execution control, Python-first |
spring-agent-flow provides:
- explicit execution graphs
- built-in resilience (retry + circuit breaker)
- durable, typed state
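
To make "explicit execution graphs" and "typed state" concrete, here is a minimal plain-Java sketch of a graph runner with a conditional loop-back edge and a typed state key. It is not the spring-agent-flow implementation: `GraphSketch`, `Key`, `State`, `Edge`, and `run` are hypothetical names introduced for illustration, and the rule that the first matching edge wins is an assumption about how such a runtime might resolve competing edges.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Illustrative sketch only; all names here are hypothetical, not library API. */
final class GraphSketch {

    /** Typed key: the Class token lets get() return T without a cast at the call site. */
    record Key<T>(String name, Class<T> type) {}

    /** Immutable state bag; with() returns a modified copy. */
    static final class State {
        private final Map<String, Object> values;
        State(Map<String, Object> values) { this.values = Map.copyOf(values); }
        static State empty() { return new State(Map.of()); }
        <T> State with(Key<T> key, T value) {
            Map<String, Object> copy = new HashMap<>(values);
            copy.put(key.name(), value);
            return new State(copy);
        }
        <T> T get(Key<T> key) { return key.type().cast(values.get(key.name())); }
    }

    /** Edges are checked in list order; the first one whose predicate holds is taken (assumption). */
    record Edge(String from, String to, Predicate<State> when) {}

    static final Key<Double> CONFIDENCE = new Key<>("confidence", Double.class);

    /** Walks the graph from start until no edge matches or maxSteps is reached. */
    static List<String> run(Map<String, UnaryOperator<State>> nodes, List<Edge> edges,
                            String start, int maxSteps) {
        List<String> visited = new ArrayList<>();
        State state = State.empty();
        String current = start;
        for (int step = 0; current != null && step < maxSteps; step++) {
            visited.add(current);
            state = nodes.get(current).apply(state);
            String next = null;
            for (Edge e : edges) {
                if (e.from().equals(current) && e.when().test(state)) {
                    next = e.to();
                    break;
                }
            }
            current = next;
        }
        return visited;
    }

    /** Research raises confidence by 0.4 per pass; analyze loops back while it is below 0.7. */
    static List<String> demo() {
        Map<String, UnaryOperator<State>> nodes = new HashMap<>();
        nodes.put("research", s -> {
            Double c = s.get(CONFIDENCE);
            return s.with(CONFIDENCE, (c == null ? 0.0 : c) + 0.4);
        });
        nodes.put("analyze", s -> s);
        nodes.put("write", s -> s);
        List<Edge> edges = List.of(
                new Edge("research", "analyze", s -> true),
                new Edge("analyze", "research", s -> s.get(CONFIDENCE) < 0.7), // loop back
                new Edge("analyze", "write", s -> true));                      // fallback: forward
        return run(nodes, edges, "research", 20);
    }
}
```

In this sketch `GraphSketch.demo()` visits `research` twice before reaching `write`, because the first pass leaves confidence at 0.4. The `maxSteps` bound is there so a predicate that never flips cannot loop forever.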
| Spring AI | spring-agent-flow |
|---|---|
| Primitives (`ChatClient`, tools) | Structured runtime (`AgentGraph`, `CoordinatorAgent`) |
| Manual orchestration | Graph-based execution |
| No durable state | Typed shared state + checkpoints |
| Retry logic in user code | Built-in retry + circuit breaker |
| No resume | Interrupt + resume support |
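
The "checkpoints" and "resume" rows can be illustrated with a small plain-Java sketch. It assumes a checkpoint is simply the index of the next step to run, saved after each step completes. `CheckpointSketch` and `InMemoryStore` are hypothetical names, and the in-memory map only stands in for a durable backend; the library's actual checkpoint contract may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the library's checkpoint API. */
final class CheckpointSketch {

    /** Stand-in for a durable store: run id -> index of the next step to execute. */
    static final class InMemoryStore {
        private final Map<String, Integer> nextStepByRun = new HashMap<>();
        void save(String runId, int nextStep) { nextStepByRun.put(runId, nextStep); }
        int load(String runId) { return nextStepByRun.getOrDefault(runId, 0); }
    }

    /** Executes steps starting at the last checkpoint; crashAt simulates a failure before that step. */
    static void run(String runId, List<String> steps, InMemoryStore store,
                    int crashAt, List<String> log) {
        for (int i = store.load(runId); i < steps.size(); i++) {
            if (i == crashAt) {
                throw new IllegalStateException("simulated crash before " + steps.get(i));
            }
            log.add(steps.get(i));      // "execute" the step
            store.save(runId, i + 1);   // checkpoint only after the step succeeded
        }
    }

    /** First attempt crashes before step 1; the second attempt resumes there. */
    static List<String> demo() {
        List<String> steps = List.of("research", "analyze", "write");
        InMemoryStore store = new InMemoryStore();
        List<String> log = new ArrayList<>();
        try {
            run("run-1", steps, store, 1, log);
        } catch (IllegalStateException crash) {
            // In a real deployment the process would die here; only the store survives.
        }
        run("run-1", steps, store, -1, log); // -1 means no simulated crash
        return log;
    }
}
```

Saving the checkpoint after the step, rather than before, means a crash mid-step causes that step to be re-run on resume, so steps need to tolerate being executed more than once.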
git clone https://github.com/datallmhub/spring-agent-flow.git
cd spring-agent-flow
mvn install -DskipTests -q
mvn -pl spring-agent-flow-samples exec:java

Runs a real multi-agent workflow with routing, coordination, and state, fully simulated.
The project ships with ready-to-run examples; no LLM is required.
| Example | What it shows | Run |
|---|---|---|
| `MultiAgentCoordination` | Multi-agent routing with `CoordinatorAgent` | default |
| `MinimalPipeline` | Simple 2-step workflow using `AgentGraph` | `-Dexec.mainClass="...MinimalPipeline"` |
| `AdvancedGraphDemo` | Loops, conditions, state, listeners | `-Dexec.mainClass="...AdvancedGraphDemo"` |

Start with `MultiAgentCoordination`: it exercises routing, coordination, and shared state in one run.
- No orchestration code required
- Stateful agent workflows
- Built-in retries & circuit breakers
- Graph-based execution
- Durable checkpoints (JDBC / Redis)
- Native Spring AI integration
- Streaming support
- Micrometer metrics
Layered architecture showing coordination, execution, resilience, and persistence on top of Spring AI.
Requirements: Java 17+, Spring Boot 3.x, Spring AI 1.0+
Distributed via JitPack.
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
<dependency>
<groupId>com.github.datallmhub.spring-agent-flow</groupId>
<artifactId>spring-agent-flow-starter</artifactId>
<version>v0.5.0</version>
</dependency>

repositories {
maven { url 'https://jitpack.io' }
}
dependencies {
implementation 'com.github.datallmhub.spring-agent-flow:spring-agent-flow-starter:v0.5.0'
}

| Module | Use case |
|---|---|
| `spring-agent-flow-starter` | Spring Boot auto-config, properties, Micrometer listener |
| `spring-agent-flow-core` | Minimal API (`Agent`, `AgentContext`, `StateKey`, `AgentResult`) |
| `spring-agent-flow-graph` | `AgentGraph`, `RetryPolicy`, `CircuitBreakerPolicy` SPI, checkpoint contract |
| `spring-agent-flow-squad` | `CoordinatorAgent`, `ExecutorAgent`, `ReActAgent`, `ParallelAgent`, `RoutingStrategy` |
| `spring-agent-flow-checkpoint` | `JdbcCheckpointStore`, `RedisCheckpointStore`, Jackson codec |
| `spring-agent-flow-resilience4j` | `CircuitBreakerPolicy` adapter backed by Resilience4j |
| `spring-agent-flow-cli-agents` | `CliAgentNode`: runs Claude Code / Codex / Gemini CLI agents as graph nodes |
| `spring-agent-flow-test` | `MockAgent`, `TestGraph` for unit-testing graphs |

Minimal application.yml:
spring:
  ai:
    agents:
      enabled: true
      default-error-policy: RETRY_ONCE
      observability:
        metrics: true

graph.invokeStream(AgentContext.of("hello"))
.subscribe(event -> {
switch (event) {
case AgentEvent.Token t -> System.out.print(t.chunk());
case AgentEvent.NodeTransition x -> System.out.println("\n--> " + x.to());
case AgentEvent.Completed c -> System.out.println("\n[done]");
default -> {}
}
});

// Declare keys with types: compile-time safety
StateKey<Double> CONFIDENCE = StateKey.of("confidence", Double.class);
StateKey<String> SUMMARY = StateKey.of("summary", String.class);
// Use them anywhere
AgentContext ctx = context.with(CONFIDENCE, 0.85);
double score = ctx.get(CONFIDENCE); // no cast needed

AgentGraph.builder()
.errorPolicy(ErrorPolicy.FAIL_FAST) // or RETRY_ONCE / SKIP_NODE
.retryPolicy(RetryPolicy.exponential(3, Duration.ofMillis(200)))
.addNode("llm", flakyAgent,
RetryPolicy.exponential(5, Duration.ofMillis(500)), // per-node override
new Resilience4jCircuitBreakerPolicy(registry)) // per-node breaker
.build();

See resilient-typed-executor.md and circuit-breaker.md.
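
For readers unfamiliar with exponential backoff, the following plain-Java sketch shows the general technique. It assumes that the two arguments in a call like `RetryPolicy.exponential(3, Duration.ofMillis(200))` mean "maximum attempts" and "base delay", and that the delay doubles after each failed attempt; the source does not confirm either reading. `RetrySketch`, `delayFor`, and `callWithRetry` are hypothetical names, not library API.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Illustrative sketch only; the library's RetryPolicy may behave differently. */
final class RetrySketch {

    /** Delay before the retry that follows the given 1-based attempt: base * 2^(attempt - 1). */
    static Duration delayFor(int attempt, Duration base) {
        return base.multipliedBy(1L << (attempt - 1));
    }

    /** Calls task up to maxAttempts times, sleeping with doubling delays between failures. */
    static <T> T callWithRetry(Supplier<T> task, int maxAttempts, Duration base) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delayFor(attempt, base)); // no sleep after the final attempt
                }
            }
        }
        throw last; // every attempt failed: surface the most recent error
    }

    private static void sleep(Duration d) {
        try {
            Thread.sleep(d.toMillis());
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted during backoff", ie);
        }
    }
}
```

With a 200 ms base, the waits after the first, second, and third failures would be 200 ms, 400 ms, and 800 ms. Production retry policies usually add random jitter and a maximum delay, which this sketch leaves out.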
| Metric | Tags | Description |
|---|---|---|
| `agents.execution.count` | agent, graph, status | Per-node execution count |
| `agents.execution.duration` | agent, graph | Per-node execution time |
| `agents.graph.transitions` | graph, from, to | Node-to-node transitions |
| `agents.execution.errors` | agent, graph, cause | Error count by type |

MockAgent mock = MockAgent.builder()
.thenReturn("First response")
.thenReturn("Second response")
.build();
TestGraph.Trace trace = TestGraph.trace(
AgentGraph.builder()
.addNode("a", mock)
.addNode("b", MockAgent.returning("done"))
.addEdge("a", "b"));
AgentResult result = trace.invoke(AgentContext.of("test"));
assertThat(trace.visitedInOrder("a", "b")).isTrue();
assertThat(result.text()).isEqualTo("done");

- ReAct loop: self-correcting agent with observation/action cycles
- Supervisor pattern: coordinator re-routes until done
- Parallel executors: fan-out/fan-in
- Subgraphs: plug a graph in as a node
- Human-in-the-loop: interrupt, wait for human input, resume
- Durable runs: JDBC or Redis checkpoint store, resume after crash
- Resilient typed executor: tool audit + typed output + retry
- Circuit breaker: trip upstream calls with Resilience4j
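
As background for the circuit breaker item, here is a deliberately simplified plain-Java sketch of the pattern: after a threshold of consecutive failures the breaker opens and rejects further calls without touching the upstream. `BreakerSketch` is a hypothetical class, not the library's `CircuitBreakerPolicy` or the Resilience4j API, and it omits the half-open state and wait timer that Resilience4j provides.

```java
import java.util.function.Supplier;

/** Illustrative sketch only; a real breaker also needs a half-open state and a reset timer. */
final class BreakerSketch {

    enum State { CLOSED, OPEN }

    private final int threshold;
    private int consecutiveFailures;
    private State state = State.CLOSED;

    BreakerSketch(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
    }

    State state() { return state; }

    /** Runs the upstream call unless the breaker is open; counts consecutive failures. */
    <T> T call(Supplier<T> upstream) {
        if (state == State.OPEN) {
            throw new IllegalStateException("circuit open: call rejected");
        }
        try {
            T result = upstream.get();
            consecutiveFailures = 0; // any success resets the count
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= threshold) {
                state = State.OPEN;
            }
            throw e;
        }
    }

    /** Three attempts against a failing upstream with threshold 2; returns how often upstream ran. */
    static int demo() {
        BreakerSketch breaker = new BreakerSketch(2);
        int[] upstreamCalls = {0};
        for (int i = 0; i < 3; i++) {
            try {
                breaker.call(() -> {
                    upstreamCalls[0]++;
                    throw new IllegalStateException("upstream down");
                });
            } catch (IllegalStateException expected) {
                // first two: upstream failure; third: rejected by the open breaker
            }
        }
        return upstreamCalls[0];
    }
}
```

In `demo()` the upstream is invoked only twice even though three calls are attempted, which is the point of the pattern: once the breaker is open, a struggling dependency stops receiving traffic.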
| Version | Focus |
|---|---|
| 0.5 (current) | Subgraphs, parallel fan-out, cancellation, typed output, RetryPolicy, CircuitBreakerPolicy, JDBC/Redis checkpoint store |
| 1.0 | API stabilization, documentation, community feedback |
| 1.1 | Crew roles (CrewAI-inspired), auto-config for checkpoint backends |
| 2.0 | OpenTelemetry tracing, MCP integration, Agent-as-Tool |
This project is independent and not affiliated with spring-ai-community/agent-client.
That project focuses on CLI agent integrations (Claude Code, Codex, Gemini).
spring-agent-flow focuses on something different: a graph-based runtime for stateful, multi-step agent workflows on top of Spring AI.
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
This project is released under the Apache 2.0 License.
- LangGraph: graph-based orchestration
- CrewAI: role-based agent teams
- AWS Strands: agent patterns for Java
- Spring AI: the foundation we build on

