FileSystem

The FileSystem middleware injects a set of file system operation tools (ls, read_file, write_file, edit_file, glob, grep) and an optional command execution tool (execute) into the Agent, enabling the Agent to interact with local or remote file systems.

import "github.com/cloudwego/eino/adk/middlewares/filesystem"

Quick Start

import (
    "context"

    "github.com/cloudwego/eino/adk"
    "github.com/cloudwego/eino/adk/middlewares/filesystem"
)

ctx := context.Background()

// 1. Create the middleware
middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{
    Backend: myBackend, // myBackend implements the filesystem.Backend interface
})
if err != nil {
    // handle the error
}

// 2. Inject it into the Agent
agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    // ...
    Middlewares: []adk.ChatModelAgentMiddleware{middleware},
})
if err != nil {
    // handle the error
}

Constructors

Function Signature	Description
New(ctx, *MiddlewareConfig) (ChatModelAgentMiddleware, error)	Recommended. Returns a ChatModelAgentMiddleware, which supports dynamically modifying Instruction and Tools through the BeforeAgent hook.
NewTyped[M MessageType](ctx, *MiddlewareConfig) (TypedChatModelAgentMiddleware[M], error)	Generic version. The type parameter M supports *schema.Message and *schema.AgenticMessage. New is equivalent to NewTyped[*schema.Message].

💡 Deprecated: NewMiddleware(ctx, *Config) (AgentMiddleware, error) is the legacy constructor; new code should use New. NewMiddleware returns the struct AgentMiddleware, which lacks the flexibility of the BeforeAgent hook; additionally, it enables the “large result offloading” feature by default (see below), which has been removed in the New path.

MiddlewareConfig

MiddlewareConfig is the configuration struct used by New / NewTyped.

Core Fields

Field	Type	Description
Backend	filesystem.Backend	Required. Provides file system operation capabilities, powering the 6 tools: ls, read_file, write_file, edit_file, glob, grep. The interface is defined in the github.com/cloudwego/eino/adk/filesystem package.
Shell	filesystem.Shell	Optional. Provides command execution capability; when set, registers the execute tool. Mutually exclusive with StreamingShell.
StreamingShell	filesystem.StreamingShell	Optional. Provides streaming command execution capability; when set, registers the streaming execute tool. Mutually exclusive with Shell.
UseMultiModalRead	bool	Optional, defaults to false. When enabled, the read_file tool becomes an EnhancedInvokableTool, supporting multi-modal content such as images and PDFs. Requires the Backend to also implement the filesystem.MultiModalReader interface.
CustomSystemPrompt	*string	Optional. Overrides the system prompt appended to the Agent Instruction. If nil, no system prompt is appended.

Tool Configuration Fields

Each tool has a corresponding *ToolConfig field for customizing the tool name, description, replacing the implementation, or disabling it:

Field	Corresponding Tool
LsToolConfig	ls
ReadFileToolConfig	read_file
WriteFileToolConfig	write_file
EditFileToolConfig	edit_file
GlobToolConfig	glob
GrepToolConfig	grep

The execute tool currently does not support customization via ToolConfig; its registration is controlled solely by whether Shell / StreamingShell is set.

ToolConfig

type ToolConfig struct {
    Name       string         // Override tool name, empty string uses default
    Desc       *string        // Override tool description, nil uses default
    CustomTool tool.BaseTool  // Custom tool implementation, replaces Backend default when set
    Disable    bool           // Set to true to not register this tool
}

Priority: Disable=true > CustomTool > Backend default implementation.
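The priority rule above can be sketched in a few lines. This is an illustration rather than the middleware's actual code: CustomTool is narrowed to a plain string as a hypothetical stand-in for tool.BaseTool, and resolveSource and effectiveName are names invented for this example. Passing the config as a pointer, with nil meaning "not configured", is also an assumption.

```go
package main

import "fmt"

// ToolConfig mirrors the documented fields. CustomTool is narrowed to a string
// (hypothetical stand-in for tool.BaseTool) so the sketch stays self-contained.
type ToolConfig struct {
	Name       string
	Desc       *string
	CustomTool string // "" means no custom implementation was supplied
	Disable    bool
}

// resolveSource reports which implementation wins for one tool slot:
// "disabled", "custom", or "backend".
func resolveSource(cfg *ToolConfig) string {
	switch {
	case cfg == nil:
		return "backend" // no per-tool config: Backend default
	case cfg.Disable:
		return "disabled" // Disable wins even when CustomTool is set
	case cfg.CustomTool != "":
		return "custom"
	default:
		return "backend"
	}
}

// effectiveName applies the documented Name rule: an empty string keeps the default.
func effectiveName(defaultName string, cfg *ToolConfig) string {
	if cfg == nil || cfg.Name == "" {
		return defaultName
	}
	return cfg.Name
}

func main() {
	cfg := &ToolConfig{Name: "search", CustomTool: "myGrep", Disable: true}
	fmt.Println(resolveSource(cfg), effectiveName("grep", cfg))
}
```

The sketch does not model how Name and Desc interact with a custom tool, since the text above does not specify it.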

Tool Name Constants

const (
    ToolNameLs        = "ls"
    ToolNameReadFile  = "read_file"
    ToolNameWriteFile = "write_file"
    ToolNameEditFile  = "edit_file"
    ToolNameGlob      = "glob"
    ToolNameGrep      = "grep"
    ToolNameExecute   = "execute"
)

Injected Tools

Tool	Default Name	Registration Condition	Description
ls	ls	Backend ≠ nil	List files and subdirectories in a directory
read_file	read_file	Backend ≠ nil	Read file content, supports offset/limit pagination. When UseMultiModalRead is enabled, can read images and PDFs
write_file	write_file	Backend ≠ nil	Create or overwrite a file
edit_file	edit_file	Backend ≠ nil	Precise string replacement editing, supports replace_all
glob	glob	Backend ≠ nil	Match file paths by glob pattern
grep	grep	Backend ≠ nil	Regex search of file content, supports multiple output modes and pagination
execute	execute	Shell ≠ nil or StreamingShell ≠ nil	Execute shell commands

Backend Interface

Backend is defined in the github.com/cloudwego/eino/adk/filesystem package. The middleware package re-exports request/response types via type aliases (e.g., ReadRequest, FileContent), but the Backend interface itself needs to be referenced from the adk/filesystem package.

type Backend interface {
    LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error)
    Read(ctx context.Context, req *ReadRequest) (*FileContent, error)
    GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error)
    GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error)
    Write(ctx context.Context, req *WriteRequest) error
    Edit(ctx context.Context, req *EditRequest) error
}

Shell and StreamingShell

type Shell interface {
    Execute(ctx context.Context, input *ExecuteRequest) (*ExecuteResponse, error)
}

type StreamingShell interface {
    ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (*schema.StreamReader[*ExecuteResponse], error)
}

These two are mutually exclusive; only one can be set. StreamingShell supports streaming output, which suits long-running commands.
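The exclusivity rule, together with the registration condition for execute, can be sketched as a small validation step. The shellConfig type and executeMode function are hypothetical names for this example, and rejecting the both-set case with an error is an assumption about how the rule is enforced.

```go
package main

import (
	"errors"
	"fmt"
)

// shellConfig is a hypothetical reduction of MiddlewareConfig to the two
// shell fields; each flag records whether the corresponding field is non-nil.
type shellConfig struct {
	HasShell          bool
	HasStreamingShell bool
}

var errBothShells = errors.New("shell and streaming shell are mutually exclusive")

// executeMode reports which variant of the execute tool would be registered.
func executeMode(c shellConfig) (string, error) {
	switch {
	case c.HasShell && c.HasStreamingShell:
		return "", errBothShells // assumed behavior: reject the configuration
	case c.HasShell:
		return "execute", nil
	case c.HasStreamingShell:
		return "streaming execute", nil
	default:
		return "none", nil // neither set: the execute tool is not registered
	}
}

func main() {
	mode, err := executeMode(shellConfig{HasStreamingShell: true})
	fmt.Println(mode, err)
}
```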

MultiModalReader Extension Interface

When UseMultiModalRead = true, the Backend needs to additionally implement the MultiModalReader interface:

type MultiModalReader interface {
    MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error)
}

Behavior:

The read_file tool is upgraded from InvokableTool to EnhancedInvokableTool, returning multi-modal results via schema.ToolResult.Parts
The default implementation supports reading image files (PNG, JPG, etc.) and PDF files (supports the pages parameter to specify page ranges, up to 20 pages at a time)
The tool description automatically appends a multi-modal capability suffix; if the description is customized via ReadFileToolConfig.Desc, no suffix is appended

💡 When using ChatModelAgentMiddleware, you need to implement the WrapEnhancedInvokableToolCall method for the multi-modal read_file tool to work.

// MultiModalReadRequest extends ReadRequest
type MultiModalReadRequest struct {
    ReadRequest
    Pages string  // PDF page range, e.g., "1-5", "3", "10-20"
}
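A backend that implements MultiModalRead has to interpret the Pages string itself. The following is a minimal sketch of one way to parse it, assuming 1-based inclusive ranges and treating the 20-page limit as a hard error. parsePages and maxPagesPerRead are hypothetical names, and the default implementation may behave differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxPagesPerRead reflects the documented limit of 20 pages per call.
const maxPagesPerRead = 20

// parsePages turns "3" or "1-5" into an inclusive, 1-based page range.
func parsePages(s string) (start, end int, err error) {
	first, last, isRange := strings.Cut(strings.TrimSpace(s), "-")
	start, err = strconv.Atoi(first)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid start page %q: %w", first, err)
	}
	end = start // a single page such as "3" is the range 3-3
	if isRange {
		end, err = strconv.Atoi(last)
		if err != nil {
			return 0, 0, fmt.Errorf("invalid end page %q: %w", last, err)
		}
	}
	if start < 1 || end < start {
		return 0, 0, fmt.Errorf("invalid page range %q", s)
	}
	if end-start+1 > maxPagesPerRead {
		return 0, 0, fmt.Errorf("range %q exceeds %d pages", s, maxPagesPerRead)
	}
	return start, end, nil
}

func main() {
	for _, p := range []string{"1-5", "3", "10-20", "1-21"} {
		start, end, err := parsePages(p)
		fmt.Println(p, start, end, err)
	}
}
```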

// MultiFileContent return result
type MultiFileContent struct {
    *FileContent            // Plain text result
    Parts []FileContentPart // Multi-modal result (mutually exclusive with FileContent; FileContent is ignored when Parts is non-empty)
}

type FileContentPart struct {
    Type     FileContentPartType // "image" or "pdf"
    MIMEType string              // e.g., "image/png", "application/pdf"
    Data     []byte              // Raw binary data
}
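The precedence between Parts and the embedded FileContent can be illustrated with local stand-in types. The Content field on FileContent and the describe helper are assumptions made for this sketch, and FileContentPartType is simplified to a string.

```go
package main

import "fmt"

// FileContent is a stand-in; the Content field is hypothetical.
type FileContent struct {
	Content string
}

// FileContentPart follows the documented shape, with Type simplified to string.
type FileContentPart struct {
	Type     string // "image" or "pdf"
	MIMEType string
	Data     []byte
}

// MultiFileContent embeds *FileContent, as in the definition above.
type MultiFileContent struct {
	*FileContent
	Parts []FileContentPart
}

// describe shows the documented rule: when Parts is non-empty,
// the embedded FileContent is ignored.
func describe(m *MultiFileContent) string {
	if len(m.Parts) > 0 {
		return fmt.Sprintf("%d part(s), first: %s", len(m.Parts), m.Parts[0].MIMEType)
	}
	if m.FileContent != nil {
		return "text: " + m.Content
	}
	return "empty"
}

func main() {
	both := &MultiFileContent{
		FileContent: &FileContent{Content: "ignored"},
		Parts:       []FileContentPart{{Type: "image", MIMEType: "image/png", Data: []byte{0x89}}},
	}
	fmt.Println(describe(both))
	fmt.Println(describe(&MultiFileContent{FileContent: &FileContent{Content: "hello"}}))
}
```

Checking the embedded pointer for nil before reading through it matters here, because a purely multi-modal result may leave FileContent unset.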

Deprecated: Legacy Config and Large Result Offloading

💡 The following content only applies to the NewMiddleware + Config legacy path. The New / NewTyped path does not include the large result offloading feature.

The legacy Config provides an additional “Large Tool Result Offloading” mechanism on top of MiddlewareConfig:

Field	Description
WithoutLargeToolResultOffloading bool	Set to true to disable offloading, defaults to false (enabled)
LargeToolResultOffloadingTokenLimit int	Token threshold, defaults to 20000
LargeToolResultOffloadingPathGen func(ctx, *compose.ToolInput) (string, error)	Offloading path generation function, defaults to /large_tool_result/{ToolCallID}

Trigger condition: Offloading is triggered when the character count of the tool’s return result exceeds LargeToolResultOffloadingTokenLimit × 4 (80,000 characters with the default limit of 20000).

Offloading behavior: The complete result is written to a file via Backend.Write, and the original return is replaced with a summary (first 10 lines + file path hint). The Agent can read the full result via read_file with pagination.
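The trigger arithmetic and the summary shape can be sketched as follows. shouldOffload and summarize are hypothetical names, the hint wording is invented for illustration, and counting Unicode code points (rather than bytes) is an assumption about what "character count" means in the legacy implementation.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

const (
	defaultTokenLimit = 20000 // documented default for LargeToolResultOffloadingTokenLimit
	charsPerToken     = 4     // documented multiplier in the trigger condition
	summaryLines      = 10    // documented number of lines kept in the summary
)

// shouldOffload applies the documented trigger: characters > tokenLimit × 4.
func shouldOffload(result string, tokenLimit int) bool {
	return utf8.RuneCountInString(result) > tokenLimit*charsPerToken
}

// summarize keeps the first 10 lines and appends a path hint.
func summarize(result, path string) string {
	lines := strings.Split(result, "\n")
	if len(lines) > summaryLines {
		lines = lines[:summaryLines]
	}
	return strings.Join(lines, "\n") + "\n[full result saved to " + path + "]"
}

func main() {
	big := strings.Repeat("line of output\n", 6000) // 90,000 characters
	if shouldOffload(big, defaultTokenLimit) {
		fmt.Println(summarize(big, "/large_tool_result/call-1"))
	}
}
```

With the default limit, a result of exactly 80,000 characters is returned as-is, and one more character triggers offloading.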


Last modified May 18, 2026: docs(eino): sync english translations (ffdf777e)