Summarization
💡 This middleware was introduced in v0.8.0. Package path:
github.com/cloudwego/eino/adk/middlewares/summarization
Overview
The Summarization middleware automatically invokes a summarization model to compress conversation history when the conversation token count exceeds a threshold, keeping long conversations coherent within the model’s context window. The middleware is mounted on the BeforeModelRewriteState hook and checks the trigger conditions before each model call. When triggered, it executes: counting → summary generation (with retry/failover) → post-processing → state replacement.
Generic System
All core types and functions in this package provide both Typed generic versions (M adk.MessageType) and non-generic aliases (fixed to *schema.Message).
| Generic Version | Non-generic Alias (= Typed[*schema.Message]) |
| --- | --- |
| TypedConfig[M] | Config |
| NewTyped[M](ctx, *TypedConfig[M]) | New(ctx, *Config) |
| TypedTokenCounterFunc[M] | TokenCounterFunc |
| TypedGenModelInputFunc[M] | GenModelInputFunc |
| TypedGetFailoverModelFunc[M] | GetFailoverModelFunc |
| TypedFinalizeFunc[M] | FinalizeFunc |
| TypedCallbackFunc[M] | CallbackFunc |
| TypedUserMessageFilterFunc[M] | UserMessageFilterFunc |
| TypedPreserveUserMessages[M] | PreserveUserMessages |
| TypedRetryConfig[M] | RetryConfig |
| TypedFailoverConfig[M] | FailoverConfig |
| TypedFailoverContext[M] | FailoverContext |
| TypedFinalizerBuilder[M] | FinalizerBuilder |
Unless otherwise noted, type signatures in the following documentation use the generic form M. When using non-generic aliases, M = *schema.Message.
Constructor
// Generic version: supports *schema.Message and *schema.AgenticMessage
func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error)
// Non-generic version: equivalent to NewTyped[*schema.Message]
func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error)
TypedConfig[M] Configuration
| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| Model | model.BaseModel[M] | Yes | - | Model used for generating summaries |
| ModelOptions | []model.Option | No | - | Options passed to the summarization model |
| TokenCounter | TypedTokenCounterFunc[M] | No | Uses total_tokens from the most recent assistant message as the baseline and estimates the messages after it at ~4 chars/token | Custom token counting function |
| Trigger | *TriggerCondition | No | ContextTokens=160,000 | Conditions for triggering summarization |
| UserInstruction | string | No | Built-in prompt | Custom user-level summarization instruction; overrides the default instruction |
| TranscriptFilePath | string | No | - | Path to the full conversation transcript. It is appended to the summary to remind the model where the original context is stored. Only effective when Finalize is not set |
| GenModelInput | TypedGenModelInputFunc[M] | No | sysInstruction → contextMsgs → userInstruction | Full control over building the summarization model input |
| Finalize | TypedFinalizeFunc[M] | No | Built-in post-processing | Custom summary post-processing. When set, the middleware no longer performs any default post-processing |
| Callback | TypedCallbackFunc[M] | No | - | Called after Finalize with the before and after states (adk.TypedChatModelAgentState[M], passed by value); both are read-only |
| EmitInternalEvents | bool | No | false | Whether to emit internal events at key points |
| PreserveUserMessages | *TypedPreserveUserMessages[M] | No | Enabled: true | Preserve original user messages in the summary. Only effective when Finalize is not set |
| Retry | *TypedRetryConfig[M] | No | nil (no retry) | Retry strategy for the primary model's summary generation |
| Failover | *TypedFailoverConfig[M] | No | nil | Failover strategy when the primary model fails |
💡 Finalize override semantics: Once a custom Finalize is set, the middleware skips all default post-processing, so neither PreserveUserMessages nor TranscriptFilePath takes effect. To reuse the default post-processing logic within a custom Finalize, use the DefaultFinalizer function.
Sub-configuration Structs
TriggerCondition
Summarization is triggered when any one condition is met.
type TriggerCondition struct {
    ContextTokens   int // Trigger when token count exceeds this threshold
    ContextMessages int // Trigger when message count exceeds this threshold
}
TypedPreserveUserMessages[M]
When enabled, replaces the <all_user_messages>...</all_user_messages> section in the summary with the most recent original user messages.
type TypedPreserveUserMessages[M adk.MessageType] struct {
    Enabled   bool
    MaxTokens int                           // Maximum tokens for preserved user messages; defaults to TriggerCondition.ContextTokens / 3
    Filter    TypedUserMessageFilterFunc[M] // Filter function; return false to exclude a message from preservation
}
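To illustrate how a budget such as MaxTokens can bound the preserved user messages, here is a minimal Go sketch. It is not the middleware's implementation: the userMsg type, the selectRecent function, and the newest-first selection rule are assumptions made for illustration.

```go
package main

// userMsg is a hypothetical stand-in for a user message with a known token cost.
type userMsg struct {
    Text   string
    Tokens int
}

// selectRecent walks the user messages from newest to oldest, keeps those the
// filter accepts while they still fit into maxTokens, and returns them in
// chronological order.
func selectRecent(msgs []userMsg, maxTokens int, keep func(userMsg) bool) []userMsg {
    var picked []userMsg
    used := 0
    for i := len(msgs) - 1; i >= 0; i-- {
        m := msgs[i]
        if keep != nil && !keep(m) {
            continue // filtered out; does not consume budget
        }
        if used+m.Tokens > maxTokens {
            break // assumption: stop at the first message that does not fit
        }
        used += m.Tokens
        picked = append(picked, m)
    }
    // Reverse so the result reads oldest to newest.
    for l, r := 0, len(picked)-1; l < r; l, r = l+1, r-1 {
        picked[l], picked[r] = picked[r], picked[l]
    }
    return picked
}
```

With the default trigger of ContextTokens = 160,000, the documented default budget works out to 160,000 / 3, roughly 53,333 tokens.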
TypedRetryConfig[M]
type TypedRetryConfig[M adk.MessageType] struct {
    MaxRetries  *int                                                                    // Default 3
    ShouldRetry func(ctx context.Context, resp M, err error) bool                       // Default: retry when err != nil
    BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // Default: exponential backoff + jitter
}
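The default BackoffFunc is described as exponential backoff with jitter. The sketch below shows one common way to implement that pattern with the standard library only. The base delay, the cap, the jitter range, and the reading of MaxRetries as "retries after the first attempt" are assumptions; backoff, noDelay, withRetry, and flaky are hypothetical names, not part of the package.

```go
package main

import (
    "errors"
    "math/rand"
    "time"
)

// backoff returns base * 2^(attempt-1), capped at limit, plus up to 50% jitter.
// The constants are illustrative, not the middleware's actual defaults.
func backoff(attempt int) time.Duration {
    const base, limit = 500 * time.Millisecond, 10 * time.Second
    d := base
    for i := 1; i < attempt && d < limit; i++ {
        d *= 2
    }
    if d > limit {
        d = limit
    }
    return d + time.Duration(rand.Int63n(int64(d)/2+1))
}

// noDelay disables waiting, which keeps tests fast.
func noDelay(int) time.Duration { return 0 }

// withRetry runs fn once, then retries up to maxRetries times while it fails,
// sleeping delay(attempt) before each retry.
func withRetry(maxRetries int, delay func(attempt int) time.Duration, fn func() error) error {
    var err error
    for attempt := 0; attempt <= maxRetries; attempt++ {
        if attempt > 0 {
            time.Sleep(delay(attempt))
        }
        if err = fn(); err == nil {
            return nil
        }
    }
    return err
}

// flaky simulates a call that fails the first n times; it also returns a call counter.
func flaky(n int) (fn func() error, calls *int) {
    calls = new(int)
    fn = func() error {
        *calls++
        if *calls <= n {
            return errors.New("transient failure")
        }
        return nil
    }
    return fn, calls
}
```

Calling withRetry(3, backoff, fn) would wait roughly 0.5 s, 1 s, and 2 s (each plus jitter) before the three retries.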
TypedFailoverConfig[M]
type TypedFailoverConfig[M adk.MessageType] struct {
    MaxRetries       *int                                                                    // Default 3
    ShouldFailover   func(ctx context.Context, resp M, err error) bool                       // Default: failover when err != nil
    BackoffFunc      func(ctx context.Context, attempt int, resp M, err error) time.Duration
    GetFailoverModel TypedGetFailoverModelFunc[M] // Returns (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error)
}
TypedFailoverContext[M]
Context passed to the GetFailoverModel callback.
type TypedFailoverContext[M adk.MessageType] struct {
    Attempt           int // Current failover attempt number, starting from 1
    SystemInstruction M   // System instruction (set internally by the middleware, not configurable)
    UserInstruction   M   // User instruction
    OriginalMessages  []M // Original full conversation
    LastModelResponse M   // Model response from the last attempt
    LastErr           error
}
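Because Attempt starts at 1, a GetFailoverModel callback can use it to rotate through several backup models. The sketch below shows that selection logic and the assembly of a non-empty input in isolation, using plain strings in place of models and messages. pickBackup and buildInput are hypothetical helpers, and the input order (system instruction, conversation, user instruction) is taken from the default listed for GenModelInput.

```go
package main

import "errors"

// pickBackup maps a 1-based failover attempt to a backup model name and cycles
// through the list when there are more attempts than backups.
func pickBackup(backups []string, attempt int) (string, error) {
    if len(backups) == 0 {
        return "", errors.New("no failover models configured")
    }
    if attempt < 1 {
        return "", errors.New("attempt numbers start at 1")
    }
    return backups[(attempt-1)%len(backups)], nil
}

// buildInput assembles the model input in the order
// system instruction, conversation, user instruction.
func buildInput(sys string, original []string, user string) []string {
    input := make([]string, 0, len(original)+2)
    input = append(input, sys)
    input = append(input, original...)
    return append(input, user)
}
```

In a real callback the same index arithmetic would apply to a slice of model.BaseModel[M] values, with the three parts of the input read from the failover context.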
TypedTokenCounterInput[M]
type TypedTokenCounterInput[M adk.MessageType] struct {
    Messages []M
    Tools    []*schema.ToolInfo
}
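The default counting strategy from the configuration table (a baseline taken from the most recent assistant message, plus ~4 chars/token for the messages after it) can be sketched as follows. This is an approximation under stated assumptions, not the package's code: msg is a hypothetical message type, characters are counted as runes, the estimate rounds up, and tool definitions are ignored.

```go
package main

import "unicode/utf8"

// msg is a hypothetical stand-in for a chat message; it is not an eino type.
type msg struct {
    Role        string // "user", "assistant", "tool", ...
    Content     string
    TotalTokens int // usage reported by the model; 0 when unknown
}

// estimateTokens approximates a text's token count at ~4 characters per token,
// rounding up so that a short non-empty text counts as at least one token.
func estimateTokens(text string) int {
    return (utf8.RuneCountInString(text) + 3) / 4
}

// countTokens takes the reported usage of the most recent assistant message as
// the baseline and estimates only the messages that follow it. Without such a
// message, every message is estimated.
func countTokens(msgs []msg) int {
    total, start := 0, 0
    for i := len(msgs) - 1; i >= 0; i-- {
        if msgs[i].Role == "assistant" && msgs[i].TotalTokens > 0 {
            total = msgs[i].TotalTokens
            start = i + 1
            break
        }
    }
    for _, m := range msgs[start:] {
        total += estimateTokens(m.Content)
    }
    return total
}
```

A production TokenCounter would typically replace estimateTokens with the tokenizer that matches the model in use.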
Function Type Signature Quick Reference
type TypedTokenCounterFunc[M] func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error)
type TypedGenModelInputFunc[M] func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error)
type TypedGetFailoverModelFunc[M] func(ctx context.Context, failoverCtx *TypedFailoverContext[M]) (model.BaseModel[M], []M, error)
type TypedFinalizeFunc[M] func(ctx context.Context, originalMessages []M, summary M) ([]M, error)
type TypedCallbackFunc[M] func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error
type TypedUserMessageFilterFunc[M] func(ctx context.Context, msg M) (bool, error)
DefaultFinalizer
DefaultFinalizer is a standalone factory function that returns a TypedFinalizeFunc[M] consistent with the middleware’s default post-processing logic. Use it when you need to reuse the default logic (preserving user messages, appending transcript path, etc.) within a custom Finalize.
func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error)
DefaultFinalizerConfig[M]
type DefaultFinalizerConfig[M adk.MessageType] struct {
    PreserveUserMessages *TypedPreserveUserMessages[M] // Default Enabled=true, MaxTokens=30000
    TranscriptFilePath   string
}
Example: Execute default post-processing first within a custom Finalize, then add a system message:
defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{
    TranscriptFilePath: "/path/to/transcript.txt",
})
if err != nil {
    // handle error
}
cfg := &summarization.Config{
    Model: yourModel,
    Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        msgs, err := defaultFinalize(ctx, originalMessages, summary)
        if err != nil {
            return nil, err
        }
        // Add a system message before the summary
        return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil
    },
}
FinalizerBuilder
TypedFinalizerBuilder[M] provides a chainable API for building a TypedFinalizeFunc[M]. It links any number of handlers (Handler) with an optional custom finalizer (Custom).
func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M]
func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message]
func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M]
func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M]
func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error)
Execution order: handlers transform the summary sequentially in registration order, then Custom determines the final output message list. If Custom is not set, []M{summary} is returned.
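The execution order can be sketched with a simplified builder that uses strings as messages. This only illustrates the chaining semantics described here. The builder type and its Handle method are hypothetical and do not mirror the package's internals.

```go
package main

// finalizeFunc and handler are simplified stand-ins that use strings as messages.
type finalizeFunc func(original []string, summary string) ([]string, error)
type handler func(original []string, summary string) (string, error)

// builder collects handlers and an optional custom finalizer.
type builder struct {
    handlers []handler
    custom   finalizeFunc
}

func (b *builder) Handle(h handler) *builder       { b.handlers = append(b.handlers, h); return b }
func (b *builder) Custom(fn finalizeFunc) *builder { b.custom = fn; return b }

// Build returns a finalizer that runs the handlers in registration order and
// then lets the custom finalizer, if any, decide the final message list.
func (b *builder) Build() finalizeFunc {
    handlers, custom := b.handlers, b.custom // snapshot the current configuration
    return func(original []string, summary string) ([]string, error) {
        var err error
        for _, h := range handlers {
            if summary, err = h(original, summary); err != nil {
                return nil, err
            }
        }
        if custom == nil {
            return []string{summary}, nil // no Custom: the summary alone is returned
        }
        return custom(original, summary)
    }
}
```

Each handler sees the summary produced by the previous one, so registration order matters when handlers append or wrap content.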
PreserveSkills
Preserves Skill content loaded by the Skill middleware after summary compression, ensuring the agent retains skill knowledge after context window compression.
type PreserveSkillsConfig struct {
    SkillToolName     string // Skill tool name, must match the Skill middleware. Default "skill"
    MaxSkills         *int   // Maximum number of skills to preserve. Default 5; 0 disables
    MaxTokensPerSkill *int   // Maximum tokens per skill, truncated if exceeded. Default 5000
    SkillsTokenBudget *int   // Total token budget for all skills. Default 25000
}
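As a rough illustration of how the three limits could interact, the sketch below applies a skill count cap, a per-skill truncation, and a total budget to a list of loaded skills. The source does not specify the selection order or the token estimate, so the newest-first order, the ~4 chars/token estimate, and the skill type and preserveSkills function are all assumptions.

```go
package main

// skill is a hypothetical record of one loaded skill.
type skill struct {
    Name    string
    Content string
}

// preserveSkills keeps at most maxSkills of the most recently loaded skills,
// truncates each one to maxTokensPerSkill, and stops once totalBudget would be
// exceeded. Tokens are estimated at ~4 characters per token. The result is
// ordered newest first.
func preserveSkills(loaded []skill, maxSkills, maxTokensPerSkill, totalBudget int) []skill {
    var kept []skill
    used := 0
    for i := len(loaded) - 1; i >= 0 && len(kept) < maxSkills; i-- {
        s := loaded[i]
        r := []rune(s.Content)
        if len(r) > maxTokensPerSkill*4 {
            r = r[:maxTokensPerSkill*4] // truncate to the per-skill cap
        }
        tokens := (len(r) + 3) / 4
        if used+tokens > totalBudget {
            break
        }
        used += tokens
        kept = append(kept, skill{Name: s.Name, Content: string(r)})
    }
    return kept
}
```

With the documented defaults (5 skills, 5000 tokens each, 25000 in total), the total budget is exactly large enough for five fully sized skills.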
Example:
finalizer, err := summarization.NewFinalizer().
    PreserveSkills(&summarization.PreserveSkillsConfig{}).
    Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil
    }).
    Build()
if err != nil {
    // handle error
}
cfg := &summarization.Config{
    Model:    yourModel,
    Finalize: finalizer,
}
Summarize Method
TypedMiddleware[M] exposes a Summarize method that allows manual execution of a single summarization outside of the middleware’s automatic triggering:
func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error)
This method executes the complete summarization flow (generation → post-processing → Callback → events) but does not check trigger conditions. It returns the message list that replaces the original conversation.
How It Works
Trigger condition check: First checks ContextMessages (message count), then calculates the token count via TokenCounter and compares against ContextTokens. Either condition being met triggers summarization.
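A minimal sketch of this check order, assuming that a zero threshold means "not set" (the source does not state how zero values are treated). triggerCondition and shouldSummarize are illustrative names. The token counter is passed as a function so that it only runs when the message count has not already triggered summarization.

```go
package main

// triggerCondition mirrors the two documented fields of TriggerCondition.
type triggerCondition struct {
    ContextTokens   int
    ContextMessages int
}

// shouldSummarize checks the message count first and calls countTokens only
// when that check does not already trigger.
// Assumption: a zero threshold means "not set" and is skipped.
func shouldSummarize(trig triggerCondition, msgCount int, countTokens func() (int, error)) (bool, error) {
    if trig.ContextMessages > 0 && msgCount > trig.ContextMessages {
        return true, nil
    }
    if trig.ContextTokens > 0 {
        tokens, err := countTokens()
        if err != nil {
            return false, err
        }
        return tokens > trig.ContextTokens, nil
    }
    return false, nil
}
```

Checking the cheaper condition first avoids a token count on calls where the message count alone decides the outcome.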
Default post-processing (when Finalize is not set):
- Replaces the <all_user_messages>...</all_user_messages> section in the summary with the most recent original user messages (controlled by PreserveUserMessages)
- Appends the TranscriptFilePath hint
- Adds the summary preamble and continuation instructions
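The first post-processing step, swapping the model-written <all_user_messages> section for the original user messages, can be sketched with the standard regexp package. Whether the middleware keeps the surrounding tags or how it joins the messages is not stated in the source, so both choices below are assumptions, and restoreUserMessages is a hypothetical name.

```go
package main

import (
    "regexp"
    "strings"
)

// userSection matches the <all_user_messages> block, including multi-line content.
var userSection = regexp.MustCompile(`(?s)<all_user_messages>.*?</all_user_messages>`)

// restoreUserMessages swaps the model-written section for the original texts.
// Assumptions: the surrounding tags are kept, messages are joined by newlines,
// and a summary without the section is returned unchanged.
func restoreUserMessages(summary string, originals []string) string {
    block := "<all_user_messages>\n" + strings.Join(originals, "\n") + "\n</all_user_messages>"
    replaced := false
    return userSection.ReplaceAllStringFunc(summary, func(match string) string {
        if replaced {
            return match // only the first section is rewritten
        }
        replaced = true
        return block
    })
}
```

ReplaceAllStringFunc is used instead of ReplaceAllString so that a "$" in a user message is inserted literally rather than interpreted as a capture group reference.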
Internal Events
When EmitInternalEvents = true, the middleware emits events via adk.TypedSendEvent:
| Event Type | Trigger Timing | Carried Data |
| --- | --- | --- |
| ActionTypeBeforeSummarize | After trigger conditions are met, before calling the model | TypedBeforeSummarizeAction[M]{Messages}: original message list |
| ActionTypeGenerateSummary | After each model generation attempt (including retries/failover) | TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()} |
| ActionTypeAfterSummarize | After summarization is complete and Finalize has run | TypedAfterSummarizeAction[M]{Messages}: final message list |
Events are wrapped in TypedCustomizedAction[M] and placed in the adk.AgentAction.CustomizedAction field. GenerateSummaryPhase has two values: GenerateSummaryPhasePrimary (primary model/retry) and GenerateSummaryPhaseFailover (failover).
Usage Examples
Minimal Configuration
mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
})
if err != nil {
    // handle error
}
agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model:       yourChatModel,
    Middlewares: []adk.ChatModelAgentMiddleware{mw},
})
if err != nil {
    // handle error
}
Custom Trigger Conditions + Retry + Failover
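// ptrOf is a user-defined helper, e.g. func ptrOf[T any](v T) *T { return &v }
mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    Trigger: &summarization.TriggerCondition{
        ContextTokens:   100000,
        ContextMessages: 80,
    },
    TranscriptFilePath: "/path/to/transcript.txt",
    Retry: &summarization.RetryConfig{
        MaxRetries: ptrOf(2),
    },
    Failover: &summarization.FailoverConfig{
        MaxRetries: ptrOf(3),
        GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) {
            // Return an explicit, non-empty input (see Notes), built in the default
            // order: system instruction, conversation, user instruction.
            input := append([]*schema.Message{fctx.SystemInstruction}, fctx.OriginalMessages...)
            input = append(input, fctx.UserInstruction)
            return backupModel, input, nil
        },
    },
})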
FinalizerBuilder + PreserveSkills + DefaultFinalizer
// ptrOf is a user-defined helper, e.g. func ptrOf[T any](v T) *T { return &v }
defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](
    &summarization.DefaultFinalizerConfig[*schema.Message]{
        TranscriptFilePath: "/path/to/transcript.txt",
    },
)
if err != nil {
    // handle error
}
finalizer, err := summarization.NewFinalizer().
    PreserveSkills(&summarization.PreserveSkillsConfig{
        MaxSkills: ptrOf(3),
    }).
    Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        msgs, err := defaultFinalize(ctx, origMsgs, summary)
        if err != nil {
            return nil, err
        }
        return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil
    }).
    Build()
if err != nil {
    // handle error
}
cfg := &summarization.Config{
    Model:    yourModel,
    Finalize: finalizer,
}
Notes
- Set TranscriptFilePath: It is strongly recommended to provide a conversation transcript file path so the model can refer back to details from the original transcript after summarization.
- Adjust trigger thresholds: Trigger.ContextTokens should be set to 80-90% of the model’s context window. The default value of 160,000 is suitable for models with a 200k window.
- Custom TokenCounter: For production environments, it is recommended to implement a counter that precisely matches the model’s tokenizer. The default estimator uses ResponseMeta.Usage.TotalTokens from the most recent assistant message as a baseline and estimates incremental messages at ~4 chars/token.
- Finalize override: Once Finalize is set, PreserveUserMessages and TranscriptFilePath no longer take effect automatically. To reuse them, use DefaultFinalizer or FinalizerBuilder.
- GetFailoverModel constraints: The callback must return a non-nil model and a non-empty input message list.
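The threshold recommendation is simple arithmetic, shown here as a small helper. triggerThreshold is a hypothetical function, and integer percentages are used to avoid floating-point rounding.

```go
package main

// triggerThreshold returns the token threshold for a context window and a fill
// percentage, e.g. 80 to trigger once 80% of the window is used.
func triggerThreshold(contextWindow, percent int) int {
    return contextWindow * percent / 100
}
```

For a 200,000-token window, triggerThreshold(200000, 80) gives 160,000, which matches the default ContextTokens value.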
