Reduction

adk/middlewares/reduction

💡 This middleware was introduced in v0.8.0.

Overview

The reduction middleware manages the token count occupied by tool outputs in Agent conversations, operating in two phases:

  1. Truncation: Triggered immediately when a tool call returns. When a single output exceeds MaxLengthForTrunc, the full content is stored in the Backend and the message is replaced with a truncated summary.
  2. Clear: Triggered before model calls (BeforeModelRewriteState). When the total tokens exceed MaxTokensForClear, it traverses the message history and offloads old tool arguments and results to the Backend.

Architecture

              Tool call returns result
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│     WrapInvokableToolCall / WrapStreamableToolCall          │
│     WrapEnhancedInvokableToolCall / WrapEnhancedStreamable  │
│                                                             │
│  Truncation (can be skipped via SkipTruncation)             │
│    Result length > MaxLengthForTrunc?                       │
│      Yes → Truncate content, store full content in Backend  │
│      No → Return as-is                                      │
└─────────────────────────────────────────────────────────────┘
                          │
                          ▼
                    Result added to Messages
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                  BeforeModelRewriteState                    │
│                                                             │
│  Clear (can be skipped via SkipClear)                       │
│    Total tokens > MaxTokensForClear?                        │
│      Yes → ClearMessageRewriter preprocessing               │
│         → Store old tool results in Backend, replace with   │
│           file paths                                        │
│         → ClearAtLeastTokens minimum release check          │
│         → ClearPostProcess callback                         │
│      No → Do nothing                                        │
└─────────────────────────────────────────────────────────────┘
                          │
                          ▼
                     Call Model

Generic System

This middleware follows the ADK standard generic pattern, supporting both *schema.Message and *schema.AgenticMessage:

// Generic config, M constrained to adk.MessageType
type TypedConfig[M adk.MessageType] struct { ... }

// Backward-compatible alias
type Config = TypedConfig[*schema.Message]

The constructor also comes in both generic and non-generic forms:

func NewTyped[M adk.MessageType](ctx context.Context, config *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error)
func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error)

Configuration

TypedConfig[M] Main Configuration

| Field | Type | Description |
| --- | --- | --- |
| Backend | Backend | Storage backend. Required when SkipTruncation is false; can be nil when only doing Clear without offload. |
| SkipTruncation | bool | Skip the truncation phase. |
| SkipClear | bool | Skip the clear phase. |
| ReadFileToolName | string | Name of the tool used to read offloaded content. Default "read_file". |
| RootDir | string | Root directory for saving content. Default "/tmp". Truncated content is saved to {RootDir}/trunc/{tool_call_id}, cleared content to {RootDir}/clear/{tool_call_id}. |
| GenTruncOffloadFilePath | func(ctx, *ToolDetail) (string, error) | Custom truncation file path generation. When set, RootDir does not apply to truncation. Useful when tool_call_id is not unique. |
| GenClearOffloadFilePath | func(ctx, *ToolDetail) (string, error) | Custom clear file path generation. When set, RootDir does not apply to clear. |
| MaxLengthForTrunc | int | Maximum character length to trigger truncation. Default 50000. |
| TruncExcludeTools | []string | List of tool names exempt from truncation. |
| TokenCounter | func(ctx, []M, []*schema.ToolInfo) (int64, error) | Token counting function. The default estimates using char_count/4. Replacing it with tiktoken-go/tokenizer is recommended. |
| MaxTokensForClear | int64 | Token threshold to trigger clearing. Default 160000. |
| ClearRetentionSuffixLimit | int | Retain the most recent N rounds of assistant messages without clearing. Default 1. |
| ClearAtLeastTokens | int64 | Minimum tokens that must be released by clearing. If not met, clearing is not performed (to avoid unnecessarily breaking prompt cache). Default 0. |
| ClearExcludeTools | []string | List of tool names exempt from clearing. |
| ClearMessageRewriter | func(ctx, M, []M) ([]M, error) | Message rewriting callback before clearing. Parameters are toolCallMsg and its corresponding toolResponseMsgs. Can be used to rewrite write_file/edit_file calls into system-reminders. Return nil to remove that message group. |
| ClearPostProcess | func(ctx, *adk.TypedChatModelAgentState[M]) context.Context | Callback after clearing is complete; can save state or send notifications. Returns a potentially updated context. |
| ToolConfig | map[string]*ToolReductionConfig | Per-tool configuration; takes priority over global settings. |
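
The char_count/4 estimate mentioned for the default TokenCounter can be sketched as follows. This is a minimal illustration only: it works on plain strings rather than messages, and it assumes "characters" means Unicode code points, which may differ from what the middleware's default actually counts. The function name estimateTokens is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens approximates a token count as characters/4.
// Assumption: characters are counted as runes; the real default
// counter may count bytes or include tool schemas.
func estimateTokens(texts []string) int64 {
	var chars int64
	for _, t := range texts {
		chars += int64(utf8.RuneCountInString(t))
	}
	return chars / 4
}

func main() {
	// Hypothetical conversation content.
	history := []string{
		"list the files in /data",
		"a.txt\nb.txt\nc.txt\nd.txt",
	}
	fmt.Println(estimateTokens(history))
}
```

An estimate like this is cheap but can drift far from a model's real tokenizer, especially for non-English text, which is presumably why a tokenizer-backed counter is recommended.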

ToolReductionConfig Per-tool Configuration

type ToolReductionConfig struct {
    Backend        Backend
    SkipTruncation bool
    TruncHandler   func(ctx context.Context, detail *ToolDetail) (*TruncResult, error)
    SkipClear      bool
    ClearHandler   func(ctx context.Context, detail *ToolDetail) (*ClearResult, error)
}
  • When TruncHandler / ClearHandler is nil and not skipped, the global default handler is used.
  • Backend is the independent storage backend for this tool, overriding the global Backend.
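
The fallback rule above can be pictured with a small lookup. This is a sketch under simplifying assumptions, not the middleware's code: toolConfig is a stand-in for ToolReductionConfig with a single Skip flag and a single Handler, and resolve is a hypothetical helper name.

```go
package main

import "fmt"

// toolConfig is a simplified stand-in for ToolReductionConfig;
// Handler stands in for TruncHandler / ClearHandler.
type toolConfig struct {
	Skip    bool
	Handler func(output string) string
}

// resolve returns the handler to run for a tool, or nil when the phase is
// skipped. Per-tool settings win; an entry without a handler falls back to
// the global default.
func resolve(perTool map[string]*toolConfig, name string, def func(string) string) func(string) string {
	cfg, ok := perTool[name]
	if !ok || cfg == nil {
		return def
	}
	if cfg.Skip {
		return nil
	}
	if cfg.Handler != nil {
		return cfg.Handler
	}
	return def
}

func main() {
	def := func(s string) string { return "default:" + s }
	perTool := map[string]*toolConfig{
		"grep":      {Handler: func(s string) string { return "custom:" + s }},
		"read_file": {Skip: true},
		"ls":        {},
	}
	for _, name := range []string{"grep", "ls", "other"} {
		fmt.Println(name, resolve(perTool, name, def)("x"))
	}
	fmt.Println("read_file skipped:", resolve(perTool, "read_file", def) == nil)
}
```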

ToolDetail Tool Details

type ToolDetail struct {
    ToolContext       *adk.ToolContext
    ToolArgument      *schema.ToolArgument
    ToolResult        *schema.ToolResult                    // Non-streaming
    StreamToolResult  *schema.StreamReader[*schema.ToolResult] // Streaming
}

TruncResult Truncation Result

type TruncResult struct {
    NeedTrunc        bool
    ToolResult       *schema.ToolResult                    // Required when NeedTrunc && non-streaming
    StreamToolResult *schema.StreamReader[*schema.ToolResult] // Required when NeedTrunc && streaming
    NeedOffload      bool
    OffloadFilePath  string  // Required when NeedOffload
    OffloadContent   string  // Required when NeedOffload
}

ClearResult Clear Result

type ClearResult struct {
    NeedClear       bool
    ToolArgument    *schema.ToolArgument  // Required when NeedClear
    ToolResult      *schema.ToolResult    // Required when NeedClear
    NeedOffload     bool
    OffloadFilePath string  // Required when NeedOffload
    OffloadContent  string  // Required when NeedOffload
}

Backend Interface

// Defined in reduction/internal, exported via type alias
type Backend interface {
    Write(context.Context, *filesystem.WriteRequest) error
}

filesystem.WriteRequest contains two fields: FilePath string and Content string.
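
Because the interface has a single method, a backend for tests can be very small. The sketch below keeps content in memory. To stay self-contained it declares a local WriteRequest with the two documented fields instead of importing the real filesystem package; memoryBackend, store, and read are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// WriteRequest is a local stand-in mirroring the two documented fields of
// filesystem.WriteRequest, so the sketch compiles without the eino module.
type WriteRequest struct {
	FilePath string
	Content  string
}

// memoryBackend keeps offloaded content in a map keyed by file path.
type memoryBackend struct {
	mu    sync.Mutex
	files map[string]string
}

func newMemoryBackend() *memoryBackend {
	return &memoryBackend{files: make(map[string]string)}
}

// Write has the same shape as the Backend interface's single method.
func (b *memoryBackend) Write(_ context.Context, req *WriteRequest) error {
	if req == nil || req.FilePath == "" {
		return errors.New("write request needs a file path")
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	b.files[req.FilePath] = req.Content
	return nil
}

// store is a hypothetical convenience wrapper around Write.
func store(b *memoryBackend, path, content string) error {
	return b.Write(context.Background(), &WriteRequest{FilePath: path, Content: content})
}

// read is a hypothetical helper for inspection; it is not part of Backend.
func (b *memoryBackend) read(path string) (string, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	content, ok := b.files[path]
	return content, ok
}

func main() {
	backend := newMemoryBackend()
	// The path follows the documented {RootDir}/trunc/{tool_call_id} layout.
	if err := store(backend, "/tmp/trunc/call_1", "full tool output"); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	content, ok := backend.read("/tmp/trunc/call_1")
	fmt.Println(content, ok)
}
```

With the real package, the same method would take *filesystem.WriteRequest, and the agent would also need a read tool (see ReadFileToolName) that can serve the stored paths.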


Creating the Middleware

Basic Usage

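import (
    "github.com/cloudwego/eino/adk"
    "github.com/cloudwego/eino/adk/middlewares/reduction"
)

// myBackend is any value that implements reduction.Backend.
middleware, err := reduction.New(ctx, &reduction.Config{
    Backend: myBackend,
})
if err != nil {
    return err
}

agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model:       chatModel,
    Middlewares: []adk.ChatModelAgentMiddleware{middleware},
})
if err != nil {
    return err
}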

Generic Usage (AgenticMessage)

middleware, err := reduction.NewTyped[*schema.AgenticMessage](ctx, &reduction.TypedConfig[*schema.AgenticMessage]{
    Backend: myBackend,
    TokenCounter: myAgenticTokenCounter,
})

agent, err := adk.NewTypedChatModelAgent(ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{
    Model:       chatModel,
    Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{middleware},
})

Custom Configuration

middleware, err := reduction.New(ctx, &reduction.Config{
    Backend:                   myBackend,
    RootDir:                   "/data/agent",
    MaxLengthForTrunc:         30000,
    MaxTokensForClear:         100000,
    ClearRetentionSuffixLimit: 2,
    ClearAtLeastTokens:        10000,
    TruncExcludeTools:         []string{"search_tool"},
    ClearExcludeTools:         []string{"read_file"},
    ClearMessageRewriter: func(ctx context.Context, toolCallMsg *schema.Message, toolResponseMsgs []*schema.Message) ([]*schema.Message, error) {
        // Illustrative only: this replaces every tool-call group with a system-reminder.
        // A real rewriter would inspect toolCallMsg first and rewrite only write_file / edit_file calls.
        return []*schema.Message{schema.UserMessage("<system-reminder>file written</system-reminder>")}, nil
    },
    ClearPostProcess: func(ctx context.Context, state *adk.ChatModelAgentState) context.Context {
        log.Printf("Clear completed, messages: %d", len(state.Messages))
        return ctx
    },
    ToolConfig: map[string]*reduction.ToolReductionConfig{
        "grep":      {Backend: grepBackend},
        "read_file": {SkipClear: true},
    },
})

Truncation Only

middleware, err := reduction.New(ctx, &reduction.Config{
    Backend:   myBackend,
    SkipClear: true,
})

Clear Only

middleware, err := reduction.New(ctx, &reduction.Config{
    SkipTruncation: true,
    MaxTokensForClear: 100000,
    // When Backend is nil, clearing still replaces content with placeholders but does not perform offload
})

How It Works

Truncation

Handled in WrapInvokableToolCall / WrapStreamableToolCall / WrapEnhancedInvokableToolCall / WrapEnhancedStreamableToolCall:

  1. Tool returns its result
  2. Check TruncExcludeTools; skip if matched
  3. Look up ToolConfig → global defaultConfig to find the TruncHandler
  4. TruncHandler decision: read the full output and check if the total length of all text parts exceeds MaxLengthForTrunc
  5. If exceeded: keep the first and last MaxLengthForTrunc/(textParts*2) characters as a preview, store the full content in the Backend
  6. Return a truncation notice informing the agent of the file path for the full content
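
The preview rule in step 5 can be sketched as below. This is an illustration of the arithmetic, not the default TruncHandler itself: it counts characters as runes, uses "..." as a placeholder marker between head and tail, and the name truncateParts is hypothetical.

```go
package main

import "fmt"

// truncateParts applies the documented rule: when the combined length of all
// text parts exceeds maxLen, each part keeps its first and last
// maxLen/(len(parts)*2) characters. Rune counting and the "..." marker are
// assumptions made for this sketch.
func truncateParts(parts []string, maxLen int) ([]string, bool) {
	total := 0
	for _, p := range parts {
		total += len([]rune(p))
	}
	if len(parts) == 0 || total <= maxLen {
		return parts, false
	}
	keep := maxLen / (len(parts) * 2)
	out := make([]string, len(parts))
	for i, p := range parts {
		r := []rune(p)
		if len(r) <= keep*2 {
			out[i] = p // short parts pass through unchanged
			continue
		}
		out[i] = string(r[:keep]) + "..." + string(r[len(r)-keep:])
	}
	return out, true
}

func main() {
	out, truncated := truncateParts([]string{"0123456789abcdefghij"}, 8)
	fmt.Println(out, truncated)
}
```

With one text part and a limit of 8, each side keeps 8/(1*2) = 4 characters, so the 20-character input becomes a 4-character head, the marker, and a 4-character tail.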

💡 For streaming tools, the default TruncHandler waits for the complete stream to finish before deciding whether to truncate. If you need strict incremental streaming behavior, provide a custom TruncHandler for that tool.

Clear

Handled in BeforeModelRewriteState:

  1. Calculate total tokens using TokenCounter
  2. Skip if below MaxTokensForClear
  3. Determine the clear range: from the first unprocessed assistant message up to, but not including, the most recent ClearRetentionSuffixLimit rounds
  4. If ClearMessageRewriter is configured, execute rewriting preprocessing on messages within the range
  5. Traverse tool call messages within the range, skipping ClearExcludeTools
  6. Call ClearHandler for each tool call, replacing arguments and results
  7. If ClearAtLeastTokens is set: operate on a copy first, compare token difference before and after clearing; if the target is not met, abandon this clearing
  8. Once the target is met, execute the actual offload writes and update state.Messages
  9. Call ClearPostProcess
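
Steps 7 and 8 amount to a dry run followed by a conditional commit. The sketch below shows that pattern on plain strings. It is a simplified model: the token count is a byte-based len/4 estimate, the placeholder stands in for the file-path notice, and clearIfWorthwhile is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// countTokens uses a len/4 estimate (bytes, for simplicity).
func countTokens(msgs []string) int64 {
	var n int64
	for _, m := range msgs {
		n += int64(len(m))
	}
	return n / 4
}

// clearIfWorthwhile clears the messages at clearIdx on a copy, then commits
// the copy only when at least minFreed tokens were released. Otherwise the
// original slice is returned untouched.
func clearIfWorthwhile(msgs []string, clearIdx []int, placeholder string, minFreed int64) ([]string, bool) {
	draft := append([]string(nil), msgs...) // work on a copy
	for _, i := range clearIdx {
		if i < 0 || i >= len(draft) {
			continue
		}
		draft[i] = placeholder
	}
	freed := countTokens(msgs) - countTokens(draft)
	if freed < minFreed {
		return msgs, false // too little gain; leave history as it was
	}
	return draft, true
}

func main() {
	msgs := []string{"user question", strings.Repeat("x", 400), "assistant answer"}

	out, ok := clearIfWorthwhile(msgs, []int{1}, "[offloaded]", 50)
	fmt.Println(ok, len(out[1]))

	out, ok = clearIfWorthwhile(msgs, []int{1}, "[offloaded]", 200)
	fmt.Println(ok, len(out[1]))
}
```

Doing the comparison on a copy means no offload writes happen and state is not modified when the minimum is not reached, which matches the ordering described in steps 7 and 8.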

Multi-language Support

Truncation and clearing prompt text supports automatic switching between Chinese and English:

adk.SetLanguage(adk.LanguageChinese)  // Chinese
adk.SetLanguage(adk.LanguageEnglish)  // English (default)

Notes

  • When SkipTruncation is false, Backend must be set
  • The default TokenCounter estimates using char_count/4; it is recommended to replace with github.com/tiktoken-go/tokenizer
  • Already-processed messages are marked via the Extra field with _reduction_mw_processed and will not be processed again
  • Configuration in ToolConfig takes priority over global settings; if only SkipTruncation: false is set in ToolConfig without providing a TruncHandler, it falls back to the default handler
  • GenTruncOffloadFilePath / GenClearOffloadFilePath are useful when tool_call_id is not unique (e.g., retries) to prevent file overwrites
  • ClearMessageRewriter runs after the clear range is determined and before per-tool clearing; it is suitable for compressing write/edit type calls into short hints
  • ClearAtLeastTokens set to 0 means clearing is performed whenever the threshold is exceeded; values greater than 0 avoid minimal clearing that would break prompt cache
  • Legacy APIs (NewClearToolResult, NewToolResultMiddleware) are deprecated; migration to New / NewTyped is recommended
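
For the non-unique tool_call_id case noted above, one possible approach is to append a counter to the default path layout. The sketch below shows only the path logic. A real GenTruncOffloadFilePath callback receives a *ToolDetail, and this document does not show which field carries the call ID, so the sketch takes the ID as a plain string. pathGenerator and next are hypothetical names.

```go
package main

import (
	"fmt"
	"path"
	"sync/atomic"
)

// pathGenerator builds offload paths that stay unique even if a tool call
// ID repeats, for example across retries.
type pathGenerator struct {
	rootDir string
	seq     atomic.Int64
}

// next returns {rootDir}/trunc/{toolCallID}-{n}, where n increases on every call.
func (g *pathGenerator) next(toolCallID string) string {
	n := g.seq.Add(1)
	return path.Join(g.rootDir, "trunc", fmt.Sprintf("%s-%d", toolCallID, n))
}

func main() {
	gen := &pathGenerator{rootDir: "/data/agent"}
	fmt.Println(gen.next("call_1"))
	fmt.Println(gen.next("call_1")) // same ID again, different path
}
```

The counter is process-local, so paths are unique only within one run; a timestamp or random suffix would be an alternative if files must stay unique across restarts.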