Chapter 5: Middleware (Middleware Pattern)
Goal of this chapter: understand the Middleware pattern and implement Tool error handling and ChatModel retry.
Why Middleware
In Chapter 4 we added Tool capabilities to the Agent, enabling it to access the file system. However, in real-world scenarios, Tool errors and ChatModel errors are common occurrences, such as:
- Tool errors: file not found, invalid arguments, insufficient permissions, etc.
- ChatModel errors: API rate limiting (429), network timeouts, service unavailable, etc.
Problem 1: Tool Errors Interrupt the Entire Flow
When a Tool execution fails, the error propagates directly to the Agent, causing the entire conversation to break:
[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user must start over
Problem 2: Model Calls May Fail Due to Rate Limiting
When the model API returns a 429 (Too Many Requests) error, the entire conversation also breaks:
Error: rate limit exceeded (429)
// Conversation interrupted
Desired Behavior
These error messages often should not directly terminate the Agent flow. Instead, we want to pass the error information to the model so it can self-correct and proceed to the next round. For example:
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory...
[tool call] glob(pattern: "*")
The Role of Middleware
The Middleware pattern can extend the behavior of Tools and ChatModels, making it ideal for solving this problem:
- Middleware is an interceptor for the Agent: inserts custom logic before and after calls
- Middleware can handle errors: converts errors into a format the model can understand
- Middleware can implement retries: automatically retries failed operations
- Middleware is composable: multiple Middlewares can be chained together
Simple analogy:
- Agent = “business logic”
- Middleware = “AOP aspect” (cross-cutting concerns like logging, retry, error handling)
Code Location
- Entry code: cmd/ch05/main.go
Prerequisites
Same as Chapter 1: you need a configured ChatModel (OpenAI or Ark). Also, like Chapter 4, you need to set PROJECT_ROOT:
export PROJECT_ROOT=/path/to/eino # Eino core library root directory
Running
In the examples/quickstart/chatwitheino directory, run:
# Set project root directory
export PROJECT_ROOT=/path/to/your/project
go run ./cmd/ch05
Output example:
you> List files in the current directory
[assistant] Let me list the files for you...
[tool call] list_files(directory: ".")
you> Read a non-existent file
[assistant] Attempting to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...
Key Concepts
Middleware Interface
ChatModelAgentMiddleware is the middleware interface for the Agent:
type ChatModelAgentMiddleware interface {
// BeforeAgent is called before each agent run, allowing modification of
// the agent's instruction and tools configuration.
BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)
// BeforeModelRewriteState is called before each model invocation.
// The returned state is persisted to the agent's internal state and passed to the model.
BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)
// AfterModelRewriteState is called after each model invocation.
// The input state includes the model's response as the last message.
AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)
// WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
// This method is only called for tools that implement InvokableTool.
WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)
// WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
// This method is only called for tools that implement StreamableTool.
WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)
// WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
// This method is only called for tools that implement EnhancedInvokableTool.
WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)
// WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
// This method is only called for tools that implement EnhancedStreamableTool.
WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)
// WrapModel wraps a chat model with custom behavior.
// This method is called at request time when the model is about to be invoked.
WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}
Design Philosophy:
- Decorator pattern: each Middleware wraps the original call, and can modify input, output, or errors
- Onion model: requests pass through Middlewares from outside in, responses return from inside out
- Composable: multiple Middlewares execute in order
Middleware Execution Order
Handlers (i.e., Middlewares) are wrapped in array order, forming an onion model:
Handlers: []adk.ChatModelAgentMiddleware{
&middlewareA{}, // Outermost: wrapped first, intercepts requests first, but WrapModel takes effect last
&middlewareB{}, // Middle layer
&middlewareC{}, // Innermost: wrapped last
}
Execution order for Tool calls:
Request → A.Wrap → B.Wrap → C.Wrap → Actual Tool execution → C returns → B returns → A returns → Response
Practical advice: Place safeToolMiddleware (error capture) at the innermost layer (end of the array) to ensure interrupt errors thrown by other Middlewares can propagate outward correctly.
SafeToolMiddleware
SafeToolMiddleware converts Tool errors into strings so the model can understand and handle them:
type safeToolMiddleware struct {
*adk.BaseChatModelAgentMiddleware
}
func (m *safeToolMiddleware) WrapInvokableToolCall(
_ context.Context,
endpoint adk.InvokableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
result, err := endpoint(ctx, args, opts...)
if err != nil {
if _, ok := compose.IsInterruptRerunError(err); ok {
return "", err
}
// Convert error to string instead of returning an error
return fmt.Sprintf("[tool error] %v", err), nil
}
return result, nil
}, nil
}
Effect:
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Please check the file path...
// Conversation continues, model can adjust strategy based on the error
ModelRetryConfig
ModelRetryConfig configures automatic retry for ChatModel:
type ModelRetryConfig struct {
MaxRetries int // Maximum number of retries
IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable
}
Usage (using DeepAgent as an example):
agent, err := deep.New(ctx, &deep.Config{
// ...
ModelRetryConfig: &adk.ModelRetryConfig{
MaxRetries: 5,
IsRetryAble: func(_ context.Context, err error) bool {
// 429 rate limit errors are retryable
return strings.Contains(err.Error(), "429") ||
strings.Contains(err.Error(), "Too Many Requests") ||
strings.Contains(err.Error(), "qpm limit")
},
},
})
Retry strategy:
- Exponential backoff: retry intervals increase with each attempt
- Configurable conditions: use
IsRetryAbleto determine which errors are retryable - Automatic recovery: no user intervention needed
Middleware Implementation
1. Implementing SafeToolMiddleware
type safeToolMiddleware struct {
*adk.BaseChatModelAgentMiddleware
}
func (m *safeToolMiddleware) WrapInvokableToolCall(
_ context.Context,
endpoint adk.InvokableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
result, err := endpoint(ctx, args, opts...)
if err != nil {
// Interrupt errors are not converted; they need to propagate
if _, ok := compose.IsInterruptRerunError(err); ok {
return "", err
}
// Other errors are converted to strings
return fmt.Sprintf("[tool error] %v", err), nil
}
return result, nil
}, nil
}
2. Implementing Streaming Tool Error Handling
func (m *safeToolMiddleware) WrapStreamableToolCall(
_ context.Context,
endpoint adk.StreamableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
sr, err := endpoint(ctx, args, opts...)
if err != nil {
if _, ok := compose.IsInterruptRerunError(err); ok {
return nil, err
}
// Return a single-frame stream containing the error message
return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
}
// Wrap the stream to capture errors within the stream
return safeWrapReader(sr), nil
}, nil
}
3. Configuring the Agent to Use Middleware
This chapter continues using the DeepAgent introduced in Chapter 4, registering Middleware in its Handlers field:
agent, err := deep.New(ctx, &deep.Config{
Name: "Ch05MiddlewareAgent",
Description: "ChatWithDoc agent with safe tool middleware and retry.",
ChatModel: cm,
Instruction: agentInstruction,
Backend: backend,
StreamingShell: backend,
MaxIteration: 50,
Handlers: []adk.ChatModelAgentMiddleware{
&safeToolMiddleware{}, // Converts Tool errors to strings
},
ModelRetryConfig: &adk.ModelRetryConfig{
MaxRetries: 5,
IsRetryAble: func(_ context.Context, err error) bool {
return strings.Contains(err.Error(), "429") ||
strings.Contains(err.Error(), "Too Many Requests") ||
strings.Contains(err.Error(), "qpm limit")
},
},
})
Note: The Handlers field (in configuration) and “Middleware” (the concept discussed in documentation) are the same thing — Handlers is the configuration field name, while ChatModelAgentMiddleware is the interface name.
**Key code snippet (**Note: this is a simplified snippet that cannot run directly. See** [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go) for the complete code):
```go
// SafeToolMiddleware captures Tool errors and converts them to strings
type safeToolMiddleware struct {
*adk.BaseChatModelAgentMiddleware
}
func (m *safeToolMiddleware) WrapInvokableToolCall(
_ context.Context,
endpoint adk.InvokableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
result, err := endpoint(ctx, args, opts...)
if err != nil {
if _, ok := compose.IsInterruptRerunError(err); ok {
return "", err
}
return fmt.Sprintf("[tool error] %v", err), nil
}
return result, nil
}, nil
}
// Configure DeepAgent (same as Chapter 4, with added Handlers and ModelRetryConfig)
agent, _ := deep.New(ctx, &deep.Config{
ChatModel: cm,
Backend: backend,
StreamingShell: backend,
MaxIteration: 50,
Handlers: []adk.ChatModelAgentMiddleware{
&safeToolMiddleware{},
},
ModelRetryConfig: &adk.ModelRetryConfig{
MaxRetries: 5,
IsRetryAble: func(_ context.Context, err error) bool {
return strings.Contains(err.Error(), "429")
},
},
})
Middleware Execution Flow
┌─────────────────────────────────────────┐
│ User: Read a non-existent file │
└─────────────────────────────────────────┘
↓
┌──────────────────────┐
│ Agent analyzes │
│ intent │
│ Decides to call │
│ read_file │
└──────────────────────┘
↓
┌──────────────────────┐
│ SafeToolMiddleware │
│ Intercepts Tool │
│ call │
└──────────────────────┘
↓
┌──────────────────────┐
│ Executes read_file │
│ Returns error │
└──────────────────────┘
↓
┌──────────────────────┐
│ SafeToolMiddleware │
│ Converts error to │
│ string │
└──────────────────────┘
↓
┌──────────────────────┐
│ Returns Tool Result │
│ "[tool error] ..." │
└──────────────────────┘
↓
┌──────────────────────┐
│ Agent generates │
│ reply │
│ "Sorry, file not │
│ found..." │
└──────────────────────┘
Chapter Summary
- Middleware: Agent interceptors that insert custom logic before and after calls
- SafeToolMiddleware: converts Tool errors to strings so the model can understand and handle them
- ModelRetryConfig: configures automatic retry for ChatModel, handling temporary errors like rate limiting
- Decorator pattern: Middleware wraps the original call and can modify input, output, or errors
- Onion model: requests pass through Middlewares from outside in, responses return from inside out
Further Reading
Eino Built-in Middlewares:
| Middleware | Description |
| reduction | Tool output reduction — automatically truncates and offloads to the file system when tool output is too long, preventing context overflow |
| summarization | Automatic conversation history summarization — automatically generates summaries to compress history when token count exceeds a threshold |
| skill | Skill loading middleware — enables the Agent to dynamically load and execute predefined skills |
Middleware chain example:
import (
"github.com/cloudwego/eino/adk/middlewares/reduction"
"github.com/cloudwego/eino/adk/middlewares/summarization"
"github.com/cloudwego/eino/adk/middlewares/skill"
)
// Create reduction middleware: manage tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
Backend: filesystemBackend, // Storage backend
MaxLengthForTrunc: 50000, // Max length for single tool output
MaxTokensForClear: 30000, // Token threshold to trigger cleanup
})
// Create summarization middleware: automatically compress conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
Model: chatModel, // Model used to generate summaries
Trigger: &summarization.TriggerCondition{
ContextTokens: 190000, // Token threshold to trigger summarization
},
})
// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
Handlers: []adk.ChatModelAgentMiddleware{ // Note: config field is Handlers, conceptually equivalent to Middlewares
summarizationMW, // Outermost: conversation history summarization
reductionMW, // Middle layer: tool output reduction
},
})