3. Agent Design Patterns
Building effective agents requires proven patterns that structure how agents think, act, and collaborate. This section covers the most important design patterns, from single-agent architectures to sophisticated multi-agent systems.
What Are Agentic Systems?
At its core, an agentic system is a computational entity designed to:
- Perceive its environment (both digital and potentially physical)
- Reason and make informed decisions based on those perceptions and predefined or learned goals
- Act autonomously to achieve those goals
Unlike traditional software that follows rigid, step-by-step instructions, agents exhibit flexibility and initiative.
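The perceive, reason, act cycle above can be sketched as a small loop. This is only an illustration in plain Java, not part of any agent framework: `ThermostatAgent`, its 0.5 degree tolerance, and the action names are hypothetical and chosen just to make the three phases visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal agent; names and thresholds are illustrative only.
final class ThermostatAgent {
    enum Action { HEAT, COOL, IDLE }

    private final double targetCelsius;                 // the agent's goal
    private final List<Action> history = new ArrayList<>();

    ThermostatAgent(double targetCelsius) {
        this.targetCelsius = targetCelsius;
    }

    // Reason: choose an action from the perceived reading and the goal.
    Action decide(double perceivedCelsius) {
        if (perceivedCelsius < targetCelsius - 0.5) return Action.HEAT;
        if (perceivedCelsius > targetCelsius + 0.5) return Action.COOL;
        return Action.IDLE;
    }

    // One perceive -> reason -> act cycle; "acting" here only records the action.
    Action step(double perceivedCelsius) {
        Action action = decide(perceivedCelsius);
        history.add(action);
        return action;
    }

    List<Action> history() {
        return List.copyOf(history);
    }
}
```

An LLM-based agent follows the same shape, with the `decide` step delegated to a model call instead of fixed thresholds.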
Traditional Software vs. Agentic Systems
Example: Customer Inquiry Management
| Traditional System | Agentic System |
|---|---|
| Follows fixed script | Perceives query nuances |
| Linear path | Accesses knowledge bases dynamically |
| Cannot adapt | Interacts with other systems (order management) |
| Passive responses | Asks clarifying questions proactively |
| Reactive | Anticipates future needs |
Core Characteristics of Agentic Systems
The "Canvas" Metaphor
Agentic systems operate on the canvas of your application's infrastructure, utilizing available services and data.
Complexity Challenges
Effectively realizing these characteristics introduces significant complexity:
| Challenge | Question to Address |
|---|---|
| State Management | How does the agent maintain state across multiple steps? |
| Tool Selection | How does it decide when and how to use a tool? |
| Agent Communication | How is communication between different agents managed? |
| Resilience | How do you handle unexpected outcomes or errors? |
| Goal Achievement | How does the agent know when it has succeeded? |
Why Patterns Matter in Agent Development
This complexity is precisely why agentic design patterns are indispensable.
What Are Design Patterns?
Design patterns are not rigid rules. Rather, they are battle-tested templates or blueprints that offer proven approaches to standard design and implementation challenges in the agentic domain.
Key Benefits
| Benefit | Impact on Your Agents |
|---|---|
| Proven Solutions | Avoid reinventing fundamental approaches |
| Common Language | Clearer communication with your team |
| Structure & Clarity | Easier to understand and maintain |
| Reliability | Battle-tested error handling and state management |
| Development Speed | Focus on unique aspects, not foundational mechanics |
| Maintainability | Established patterns others can recognize |
The Pattern Advantage
Without Patterns:
Every agent = Custom implementation
├── Different state management approaches
├── Inconsistent error handling
├── Unique communication protocols
└── Hard to maintain and scale
With Patterns:
All agents = Consistent foundation
├── Standardized patterns for common problems
├── Predictable behavior
├── Easy to extend and modify
└── Scalable architecture
This Chapter's Patterns
This chapter covers 10 fundamental design patterns that represent the core building blocks for constructing sophisticated agents:
Single-Agent Patterns:
1. Prompt Chaining (Pipeline)
2. ReAct (Reasoning + Acting)
3. Plan-and-Solve
4. Reflection
5. Self-Consistency
Multi-Agent Patterns:
6. Supervisor
7. Hierarchical
8. Sequential
9. Debate
Coordination Patterns:
10. Query Router
Why Multi-Agent Systems?
Single-agent systems have limitations when dealing with complex, multifaceted problems:
Single-Agent Limitations:
- Cognitive Overload: One agent trying to handle all aspects of a complex task
- Lack of Specialization: General-purpose agents may lack deep domain expertise
- No Collaboration: Can't leverage multiple perspectives or approaches
- Sequential Bottleneck: Tasks must wait for previous ones to complete
- Single Point of Failure: If the agent fails, the entire system fails
Multi-Agent Advantages:
| Benefit | Single-Agent | Multi-Agent |
|---|---|---|
| Specialization | ❌ General knowledge | ✅ Deep domain expertise |
| Parallelization | ❌ Sequential only | ✅ Concurrent execution |
| Quality | ⚠️ Variable | ✅ Often higher, through review and multiple perspectives |
| Reliability | ❌ Single failure point | ✅ Can tolerate individual agent failures |
| Scalability | ⚠️ Limited | ✅ Scales by adding agents |
| Complexity | ✅ Simple | ⚠️ Harder to coordinate |
Multi-Agent Architecture Patterns
Multi-agent systems can be organized in several ways:
1. Flat Coordination
All worker agents are peers, coordinated by a central supervisor.
2. Hierarchical Organization
Agents are organized in levels, with managers at each level.
3. Sequential Pipeline
Each agent hands its output to the next agent in a pipeline.
4. Debate/Deliberation
Agents discuss and vote on decisions.
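The voting step of a debate can be sketched as a strict-majority count. This is a minimal sketch under assumptions: each agent's position has already been reduced to a single string, and `DebateVote` is a hypothetical helper, not an API from any framework.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper that tallies one vote per agent.
final class DebateVote {

    // Returns the option backed by a strict majority of votes,
    // or empty when no option has more than half of them.
    static Optional<String> majority(List<String> votes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String vote : votes) {
            counts.merge(vote, 1, Integer::sum);
        }
        return counts.entrySet().stream()
                .filter(e -> e.getValue() * 2 > votes.size())
                .map(Map.Entry::getKey)
                .findFirst();
    }
}
```

Returning an empty result on a tie leaves the caller free to run another debate round or escalate to a supervisor.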
Key Multi-Agent Concepts
1. Agent Roles
Specialized agents have specific responsibilities:
- Researcher: Information gathering and analysis
- Writer: Content creation and drafting
- Coder: Programming and technical implementation
- Reviewer: Quality assurance and validation
- Planner: Task decomposition and scheduling
- Critic: Evaluation and feedback
2. Communication Patterns
How agents exchange information:
- Direct Messaging: Point-to-point communication
- Broadcast: One-to-many announcements
- Shared Memory: Common knowledge base
- Message Queues: Asynchronous communication
- Blackboard: Shared workspace
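As one illustration of these options, the blackboard style can be sketched as a thread-safe shared map that agents write to and read from. `Blackboard` and the key names are hypothetical; a production system would likely add versioning, change notifications, or persistence.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared workspace: agents post results under a key
// and read what other agents have posted.
final class Blackboard {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    // Publish or overwrite an entry.
    void post(String key, String value) {
        entries.put(key, value);
    }

    // Read an entry; empty if no agent has posted under this key yet.
    Optional<String> read(String key) {
        return Optional.ofNullable(entries.get(key));
    }
}
```

A researcher agent might call `post("research.notes", ...)` while a writer agent polls `read("research.notes")` before drafting.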
3. Coordination Mechanisms
How agents work together:
- Centralized: Supervisor makes all decisions
- Decentralized: Agents negotiate among themselves
- Hierarchical: Chain of command with multiple levels
- Peer-to-Peer: Flat organization with voting/consensus
4. Synchronization Strategies
How agents coordinate their actions:
- Sequential: One agent at a time
- Parallel: Independent agents work simultaneously
- Pipeline: Each agent does its part then passes to next
- Adaptive: Dynamic allocation based on workload
When to Use Multi-Agent Systems
Use Multi-Agent Systems When:
- ✅ Task requires multiple distinct skills (research + writing + coding)
- ✅ Subtasks can be parallelized for performance
- ✅ Quality benefits from multiple perspectives
- ✅ System needs fault tolerance and redundancy
- ✅ Task is too complex for one agent to handle well
- ✅ You need specialized domain expertise
Stick with Single-Agent When:
- ❌ Task is simple and straightforward
- ❌ Coordination overhead isn't justified
- ❌ Budget constraints favor minimal LLM calls
- ❌ Task requires tight, immediate integration between steps
- ❌ Speed is more important than quality
These patterns provide a toolkit for building agents that can:
- Process complex multi-step tasks
- Coordinate with other agents
- Maintain context across interactions
- Handle errors gracefully
- Scale from simple to complex workflows
Pattern Selection Quick Reference
Pattern Complexity Guide
| Pattern | Complexity | Best For | Learning Curve |
|---|---|---|---|
| Prompt Chaining | ⭐ | Multi-step workflows | Low |
| ReAct | ⭐ | Tool-using agents | Low |
| Sequential | ⭐ | Pipelines | Low |
| Reflection | ⭐⭐ | Quality improvement | Medium |
| Plan-and-Solve | ⭐⭐ | Well-defined goals | Medium |
| Query Router | ⭐⭐ | Query classification | Medium |
| Self-Consistency | ⭐⭐ | Reducing randomness | Medium |
| Supervisor | ⭐⭐⭐ | Complex workflows | High |
| Debate | ⭐⭐⭐ | Decision making | High |
| Hierarchical | ⭐⭐⭐⭐ | Large systems | Very High |
Now let's dive into the patterns, starting with foundational single-agent approaches.
3.1 Single-Agent Patterns
Pattern 1: Prompt Chaining (Pipeline Pattern)
By deconstructing complex problems into a sequence of simpler, more manageable sub-tasks, prompt chaining provides a robust framework for guiding large language models. This "divide-and-conquer" strategy significantly enhances the reliability and control of the output by focusing the model on one specific operation at a time.
What is Prompt Chaining?
Prompt chaining, sometimes called the Pipeline pattern, handles intricate tasks by splitting them across several calls to a large language model (LLM) instead of expecting the model to solve a complex problem in a single, monolithic step.
The Core Idea:
- Break down the original, daunting problem into a sequence of smaller, more manageable sub-problems
- Each sub-problem is addressed individually through a specifically designed prompt
- The output generated from one prompt is strategically fed as input into the subsequent prompt in the chain
- This establishes a dependency chain where context and results of previous operations guide subsequent processing
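The core idea above amounts to folding an input through an ordered list of steps. The sketch below assumes each step is a plain `Function<String, String>` standing in for "format a prompt, call the model, return its text"; `PromptChain` is a hypothetical name, not a library class.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: runs steps in order, feeding each output to the next step.
final class PromptChain {

    static String run(String input, List<Function<String, String>> steps) {
        String current = input;
        for (Function<String, String> step : steps) {
            // The previous output becomes the next step's input.
            current = step.apply(current);
        }
        return current;
    }
}
```

With three stand-in steps that wrap their input as `summary(...)`, `trends(...)`, and `email(...)`, running the chain on `report` yields `email(trends(summary(report)))`, which makes the dependency order explicit.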
Why Use It? (Problems with Single Prompts)
For multifaceted tasks, using a single, complex prompt for an LLM can be inefficient and unreliable:
| Issue | Description | Example |
|---|---|---|
| Instruction Neglect | Model overlooks parts of the prompt | "Summarize AND extract data AND draft email" - model may only summarize |
| Contextual Drift | Model loses track of initial context | Long prompts cause the model to forget early instructions |
| Error Propagation | Early errors amplify through the response | Wrong analysis in step 1 affects all subsequent steps |
| Context Window Limits | Instructions plus source material may not fit in the model's context window | Can't fit all requirements in one prompt |
| Increased Hallucination | Juggling many instructions at once tends to produce more fabricated or incorrect details | Complex multi-step requests generate incorrect information |
Example Failure Scenario:
Query: "Analyze this market research report, summarize findings,
identify trends with data points, and draft an email to the marketing team."
Likely Result: Model summarizes well but fails to extract specific
data or drafts a poor email because the cognitive load is too high.
Enhanced Reliability Through Sequential Decomposition
Prompt chaining addresses these challenges by breaking the complex task into a focused, sequential workflow:
Example Chained Approach:
Step 1: Summarization
Prompt: "Summarize the key findings of the following market research report: [text]"
Focus: Summarization only
Step 2: Trend Identification
Prompt: "Using the summary, identify the top three emerging trends and
extract the specific data points that support each trend: [output from step 1]"
Focus: Data extraction
Step 3: Email Composition
Prompt: "Draft a concise email to the marketing team that outlines
the following trends and their supporting data: [output from step 2]"
Focus: Communication
Key Mechanisms
1. Role Assignment at Each Stage
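Assign a distinct role to each stage so the model stays focused on one job at a time, for example an analyst role for summarization and trend identification and a writer role for the email.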
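One way to express per-stage roles is to pair each stage's system prompt with its own user-prompt template. In this sketch, `Stage`, `RoleChain`, and the `model` callback are hypothetical; `model` stands in for an LLM call that takes a system prompt and a user prompt.

```java
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical stage definition: a role (system prompt) plus a user-prompt template.
record Stage(String role, String template) {

    // Fills the template's single %s placeholder with the previous stage's output.
    String userPrompt(String previousOutput) {
        return template.formatted(previousOutput);
    }
}

final class RoleChain {

    // `model` is a stand-in for an LLM call: (systemPrompt, userPrompt) -> response text.
    static String run(String input,
                      List<Stage> stages,
                      BiFunction<String, String, String> model) {
        String current = input;
        for (Stage stage : stages) {
            current = model.apply(stage.role(), stage.userPrompt(current));
        }
        return current;
    }
}
```

Keeping the role next to the template makes it easy to review each stage in isolation and to swap one stage's persona without touching the others.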
2. Structured Output
The reliability of a prompt chain depends heavily on the integrity of the data passed between steps, so each step should be told to return a structured format such as JSON or XML.
```java
// Example: structured output for the trend-identification step
public record TrendData(
    String trendName,
    String supportingData
) {}

// Example values the step might return
TrendData[] trends = {
    new TrendData(
        "AI-Powered Personalization",
        "73% of consumers prefer brands that use personal information for relevant shopping"
    ),
    new TrendData(
        "Sustainable Brands",
        "ESG product sales grew 28% vs 20% for products without ESG claims"
    )
};
```
This structured format ensures that the data is machine-readable and can be precisely parsed and inserted into the next prompt without ambiguity.
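To show what "precisely parsed" can look like, the sketch below parses a model response into `TrendData` records and rejects malformed output. It uses XML only because the JDK ships an XML parser; the same idea applies to JSON with a library such as Jackson. The `<trends>`/`<trend name="...">` layout and the `TrendParser` class are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

record TrendData(String trendName, String supportingData) {}

// Hypothetical parser for <trends><trend name="...">data</trend></trends>.
final class TrendParser {

    // Throws IllegalArgumentException when the model's output is not well-formed XML,
    // so the chain can retry instead of passing bad data to the next prompt.
    static List<TrendData> parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Model output is untrusted input: refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("trend");
            List<TrendData> trends = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                trends.add(new TrendData(e.getAttribute("name"), e.getTextContent().trim()));
            }
            return trends;
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed trend XML", e);
        }
    }
}
```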
Practical Applications & Use Cases
1. Information Processing Workflows
Prompt 1: Extract text content from a document
↓
Prompt 2: Summarize the extracted text
↓
Prompt 3: Extract specific entities (names, dates, locations)
↓
Prompt 4: Use entities to search knowledge base
↓
Prompt 5: Generate final report
Applications: Automated content analysis, AI research assistants, complex report generation
2. Complex Query Answering
Question: "What were the main causes of the 1929 stock market crash, and how did government policy respond?"
Prompt 1: Identify core sub-questions (causes, government response)
↓
Prompt 2: Research causes of the crash
↓
Prompt 3: Research government policy response
↓
Prompt 4: Synthesize information into coherent answer
3. Data Extraction and Transformation
Prompt 1: Extract fields from invoice (name, address, amount)
↓
Processing: Validate all required fields present
↓
Prompt 2 (Conditional): If missing/malformed, retry with specific focus
↓
Processing: Validate results again
↓
Output: Structured, validated data
Applications: OCR processing, form data extraction, invoice processing
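The validate-then-retry flow above can be sketched as a bounded loop that narrows the prompt to the fields still missing. Everything here is illustrative: `ExtractWithRetry`, the prompt wording, and the `extract` callback (a stand-in for an LLM call whose response has already been parsed into a map) are assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical extraction loop with validation between attempts.
final class ExtractWithRetry {
    static final List<String> REQUIRED = List.of("name", "address", "amount");

    // Returns the required fields that are absent or blank.
    static List<String> missing(Map<String, String> fields) {
        return REQUIRED.stream()
                .filter(k -> fields.get(k) == null || fields.get(k).isBlank())
                .toList();
    }

    // `extract` stands in for an LLM call: prompt in, parsed fields out.
    static Map<String, String> run(String invoice,
                                   Function<String, Map<String, String>> extract,
                                   int maxAttempts) {
        Map<String, String> merged = new HashMap<>();
        String prompt = "Extract name, address, amount from: " + invoice;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            merged.putAll(extract.apply(prompt));
            List<String> gaps = missing(merged);
            if (gaps.isEmpty()) {
                return merged;
            }
            // Retry with a narrower prompt that names only the missing fields.
            prompt = "Extract ONLY " + gaps + " from: " + invoice;
        }
        throw new IllegalStateException(
                "Required fields still missing after " + maxAttempts + " attempts");
    }
}
```

Capping the attempts keeps cost predictable, and merging results across attempts means a successful partial extraction is never thrown away.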
4. Content Generation Workflows
Prompt 1: Generate 5 topic ideas
↓
Processing: User selects best idea
↓
Prompt 2: Generate detailed outline
↓
Prompt 3-N: Write each section (with context from previous sections)
↓
Final Prompt: Review and refine for coherence and tone
Applications: Creative writing, technical documentation, blog generation
5. Conversational Agents with State
Prompt 1: Process user utterance, identify intent and entities
↓
Processing: Update conversation state
↓
Prompt 2: Based on state, generate response and identify next needed info
↓
Repeat for subsequent turns...
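The "update conversation state" step between the two prompts can be sketched as simple slot filling. The slot names and the `ConversationState` class are hypothetical; the entities passed to `update` are assumed to come from the first prompt's structured output.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical slot-filling state carried across conversation turns.
final class ConversationState {
    private static final List<String> REQUIRED_SLOTS = List.of("destination", "date");
    private final Map<String, String> slots = new HashMap<>();

    // Merge entities extracted from the latest user utterance.
    void update(Map<String, String> extractedEntities) {
        slots.putAll(extractedEntities);
    }

    // The next piece of information to ask for; empty once every slot is filled.
    Optional<String> nextMissingSlot() {
        return REQUIRED_SLOTS.stream()
                .filter(slot -> !slots.containsKey(slot))
                .findFirst();
    }
}
```

The second prompt can then be built from `nextMissingSlot()`, asking a clarifying question while a slot is missing and confirming the request once none remain.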
6. Code Generation and Refinement
Prompt 1: Generate pseudocode/outline
↓
Prompt 2: Write initial code draft
↓
Prompt 3: Identify errors and improvements
↓
Prompt 4: Refine code based on issues
↓
Prompt 5: Add documentation and tests
Implementation: Spring AI Example
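```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class PromptChainingService {

    private static final Logger log = LoggerFactory.getLogger(PromptChainingService.class);
    private static final ObjectMapper MAPPER = new ObjectMapper();

    private final ChatClient chatClient;

    // Assumes a ChatClient bean is defined (for example, built from ChatClient.Builder)
    public PromptChainingService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    /**
     * Chain: Extract → Transform to JSON → Validate (with one repair attempt)
     */
    public String processTechnicalSpecs(String inputText) {
        // Step 1: Extract information
        String extracted = extractSpecs(inputText);
        log.info("Step 1 - Extracted: {}", extracted);

        // Step 2: Transform to JSON
        String json = transformToJson(extracted);
        log.info("Step 2 - JSON: {}", json);

        // Step 3: Validate; on failure, ask the model to repair the invalid JSON once
        if (!validateJson(json)) {
            json = refineJson(json);
            if (!validateJson(json)) {
                throw new IllegalStateException(
                        "Model did not return valid JSON after one repair attempt");
            }
        }
        return json;
    }

    private String extractSpecs(String text) {
        return chatClient.prompt()
                .system("You are a technical specification extractor.")
                // Template variables are bound with param() on the user spec
                .user(u -> u.text("Extract the technical specifications from: {text}")
                        .param("text", text))
                .call()
                .content();
    }

    private String transformToJson(String specs) {
        return chatClient.prompt()
                .system("You are a data formatter. Always return valid JSON.")
                .user(u -> u.text("""
                        Transform these specifications into a JSON object with
                        'cpu', 'memory', and 'storage' as keys:
                        {specs}
                        Return ONLY the JSON object, no additional text.
                        """)
                        .param("specs", specs))
                .call()
                .content();
    }

    private boolean validateJson(String json) {
        try {
            JsonNode node = MAPPER.readTree(json);
            // Require a JSON object, not just any parseable value
            return node != null && node.isObject();
        } catch (Exception e) {
            return false;
        }
    }

    private String refineJson(String invalidJson) {
        return chatClient.prompt()
                .system("You are a JSON expert. Fix invalid JSON.")
                .user(u -> u.text("""
                        The following output was not valid JSON. Please fix it:
                        {json}
                        Return ONLY valid JSON.
                        """)
                        .param("json", invalidJson))
                .call()
                .content();
    }
}
```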
Advanced Pattern: Parallel + Sequential
Complex operations often combine parallel processing for independent tasks with prompt chaining for dependent steps:
Example Implementation:
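```java
import jakarta.annotation.PreDestroy;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class ParallelSequentialService {

    // Minimal holder for one article's extract (fields are illustrative)
    public record ArticleInfo(String url, String keyPoints) {}

    private final ChatClient chatClient;
    // Bounded pool so a long URL list cannot trigger unlimited concurrent LLM calls
    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    // Assumes a ChatClient bean is defined (for example, built from ChatClient.Builder)
    public ParallelSequentialService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String generateComprehensiveReport(List<String> articleUrls) {
        // Parallel phase: extract info from all articles concurrently
        List<CompletableFuture<ArticleInfo>> futures = articleUrls.stream()
                .map(url -> CompletableFuture.supplyAsync(
                        () -> extractArticleInfo(url), executor))
                .toList();

        // Wait for all parallel extractions (join rethrows any failure)
        List<ArticleInfo> infos = futures.stream()
                .map(CompletableFuture::join)
                .toList();

        // Sequential phase: chain of dependent operations
        String collated = collateData(infos);
        String draft = synthesizeDraft(collated);
        return reviewAndRefine(draft);
    }

    // Placeholder extraction step: fetching the article body is out of scope here,
    // so the prompt assumes the model (or a tool) can access the URL's content.
    private ArticleInfo extractArticleInfo(String url) {
        String keyPoints = chatClient.prompt()
                .user(u -> u.text("Extract the key points from the article at: {url}")
                        .param("url", url))
                .call()
                .content();
        return new ArticleInfo(url, keyPoints);
    }

    private String collateData(List<ArticleInfo> infos) {
        // Step 1 in sequential chain
        return chatClient.prompt()
                .user(u -> u.text("Collate these article extracts into organized notes: {infos}")
                        .param("infos", infos.toString()))
                .call()
                .content();
    }

    private String synthesizeDraft(String collated) {
        // Step 2: uses output from step 1
        return chatClient.prompt()
                .user(u -> u.text("Write a comprehensive report based on: {collated}")
                        .param("collated", collated))
                .call()
                .content();
    }

    private String reviewAndRefine(String draft) {
        // Step 3: uses output from step 2
        return chatClient.prompt()
                .user(u -> u.text("Review and improve this report for clarity and accuracy: {draft}")
                        .param("draft", draft))
                .call()
                .content();
    }

    @PreDestroy
    void shutdown() {
        executor.shutdown();
    }
}
```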
Limitations
| Limitation | Description | Mitigation |
|---|---|---|
| Latency | Multiple sequential LLM calls = slower | Parallelize independent steps where possible |
| Cost | Each step consumes tokens | Use smaller models for intermediate steps |
| Error Accumulation | Errors in early steps affect later steps | Add validation and retry logic between steps |
| Complexity | More moving parts to manage | Use frameworks (LangChain, LangGraph) for orchestration |
| State Management | Passing state between steps can be complex | Use structured formats and define clear contracts |
Relationship to Context Engineering
Prompt chaining is a foundational technique that enables Context Engineering, the systematic discipline of designing and delivering a complete informational environment to AI models.
Context Engineering Components:
- System Prompt: Foundational instructions (e.g., "You are a technical writer")
- Retrieved Documents: Fetched from knowledge base
- Tool Outputs: Results from API calls or database queries
- Implicit Data: User identity, interaction history, environmental state
Prompt chaining enables the iterative refinement of this context, creating a feedback loop where each step enriches the informational environment for the next.
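One way to picture this feedback loop is an immutable context object that each chain step copies and enriches before the next model call. The `ModelContext` record, its field names, and the rendering format below are hypothetical and only mirror the components listed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container for the context components listed above.
record ModelContext(String systemPrompt,
                    List<String> retrievedDocuments,
                    List<String> toolOutputs,
                    String implicitData) {

    // Returns a copy with one more tool output, leaving the original unchanged,
    // so each chain step can enrich the context for the next step.
    ModelContext withToolOutput(String output) {
        List<String> updated = new ArrayList<>(toolOutputs);
        updated.add(output);
        return new ModelContext(systemPrompt, retrievedDocuments,
                List.copyOf(updated), implicitData);
    }

    // Renders all components into one prompt string with labeled sections.
    String render() {
        return String.join("\n",
                "SYSTEM: " + systemPrompt,
                "DOCUMENTS: " + String.join(" | ", retrievedDocuments),
                "TOOL OUTPUTS: " + String.join(" | ", toolOutputs),
                "IMPLICIT: " + implicitData);
    }
}
```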
When to Use Prompt Chaining
| Scenario | Use Chaining? | Reason |
|---|---|---|
| Simple Q&A | ❌ No | Single prompt sufficient |
| Multi-step reasoning | ✅ Yes | Each step needs dedicated focus |
| External tool integration | ✅ Yes | Need to process tool outputs |
| Content generation pipeline | ✅ Yes | Natural progression (outline → draft → refine) |
| Data extraction | ✅ Yes | May need validation and retry |
| Real-time requirements | ⚠️ Maybe | Consider latency impact |
Best Practices
- Design Backwards: Start with the final output format and work backwards
- Validate Between Steps: Check outputs before passing to next prompt
- Use Structured Formats: JSON/XML for machine-readable intermediate outputs
- Assign Clear Roles: Different system prompts for each stage
- Handle Failures Gracefully: Implement retry logic for individual steps
- Monitor Token Usage: Chain length can quickly increase costs
- Log Intermediate Outputs: Essential for debugging and optimization
Pattern 2: ReAct Agent
The foundational pattern for tool-using agents.
How It Works
Implementation Steps
- Thought: Agent reasons about what to do
- Action: Agent executes a tool
- Observation: Agent observes the result
- Iterate: Repeat until goal is achieved
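The four steps above can be sketched as a bounded loop. This is a minimal sketch under assumptions, not a framework API: `ReActLoop`, `Step`, and the `policy` callback are hypothetical, with `policy` standing in for the LLM that reads the transcript so far and decides the next step. A non-final step is assumed to name a tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ReActLoop {

    // One model decision: call a tool with an input, or finish with an answer.
    record Step(String thought, String tool, String input, String finalAnswer) {
        boolean isFinal() {
            return finalAnswer != null;
        }
    }

    // `policy` stands in for the LLM; `tools` maps tool names to implementations.
    static String run(String goal,
                      Function<List<String>, Step> policy,
                      Map<String, Function<String, String>> tools,
                      int maxSteps) {
        List<String> transcript = new ArrayList<>();
        transcript.add("Goal: " + goal);
        for (int i = 0; i < maxSteps; i++) {
            // Thought: the model reasons over everything seen so far.
            Step step = policy.apply(List.copyOf(transcript));
            transcript.add("Thought: " + step.thought());
            if (step.isFinal()) {
                return step.finalAnswer();
            }
            // Action: execute the chosen tool (unknown tools become an observation).
            Function<String, String> tool = tools.get(step.tool());
            String observation = (tool == null)
                    ? "Unknown tool: " + step.tool()
                    : tool.apply(step.input());
            transcript.add("Action: " + step.tool() + "(" + step.input() + ")");
            // Observation: the result is fed back for the next iteration.
            transcript.add("Observation: " + observation);
        }
        throw new IllegalStateException("No answer within " + maxSteps + " steps");
    }
}
```

The `maxSteps` bound is a design choice that keeps a confused model from looping indefinitely and consuming tokens.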