FileSystem
The FileSystem middleware injects a set of file system operation tools (ls, read_file, write_file, edit_file, glob, grep) and an optional command execution tool (execute) into the Agent, enabling the Agent to interact with local or remote file systems.
import "github.com/cloudwego/eino/adk/middlewares/filesystem"
Quick Start
import (
	"context"

	"github.com/cloudwego/eino/adk"
	"github.com/cloudwego/eino/adk/middlewares/filesystem"
)

ctx := context.Background()

// 1. Create the middleware
middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{
	Backend: myBackend, // Implements the filesystem.Backend interface
})
if err != nil {
	// handle the error
}

// 2. Inject it into the Agent
agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
	// ...
	Middlewares: []adk.ChatModelAgentMiddleware{middleware},
})
if err != nil {
	// handle the error
}
Constructors
| Function Signature | Description |
| --- | --- |
| New(ctx, *MiddlewareConfig) (ChatModelAgentMiddleware, error) | Recommended. Returns a ChatModelAgentMiddleware, which supports dynamically modifying Instruction and Tools through the BeforeAgent hook. |
| NewTyped[M MessageType](ctx, *MiddlewareConfig) (TypedChatModelAgentMiddleware[M], error) | Generic version. The type parameter M supports *schema.Message and *schema.AgenticMessage. New is equivalent to NewTyped[*schema.Message]. |
💡 Deprecated:
NewMiddleware(ctx, *Config) (AgentMiddleware, error) is the legacy constructor; new code should use New. NewMiddleware returns the AgentMiddleware struct, which lacks the flexibility of the BeforeAgent hook. It also enables the “large result offloading” feature by default (see below), which has been removed from the New path.
MiddlewareConfig
MiddlewareConfig is the configuration struct used by New / NewTyped.
Core Fields
| Field | Type | Description |
| --- | --- | --- |
| Backend | filesystem.Backend | Required. Provides file system operation capabilities, powering the 6 tools: ls, read_file, write_file, edit_file, glob, grep. The interface is defined in the github.com/cloudwego/eino/adk/filesystem package. |
| Shell | filesystem.Shell | Optional. Provides command execution capability; when set, registers the execute tool. Mutually exclusive with StreamingShell. |
| StreamingShell | filesystem.StreamingShell | Optional. Provides streaming command execution capability; when set, registers the streaming execute tool. Mutually exclusive with Shell. |
| UseMultiModalRead | bool | Optional, defaults to false. When enabled, the read_file tool becomes an EnhancedInvokableTool, supporting multi-modal content such as images and PDFs. Requires the Backend to also implement the filesystem.MultiModalReader interface. |
| CustomSystemPrompt | *string | Optional. Overrides the system prompt appended to the Agent Instruction. If nil, no system prompt is appended. |
Tool Configuration Fields
Each tool has a corresponding *ToolConfig field for customizing the tool name or description, replacing the implementation, or disabling the tool:
| Field | Corresponding Tool |
| --- | --- |
| LsToolConfig | ls |
| ReadFileToolConfig | read_file |
| WriteFileToolConfig | write_file |
| EditFileToolConfig | edit_file |
| GlobToolConfig | glob |
| GrepToolConfig | grep |
The execute tool currently does not support customization via ToolConfig; its registration is controlled solely by whether Shell / StreamingShell is set.
ToolConfig
type ToolConfig struct {
	Name       string        // Overrides the tool name; an empty string uses the default
	Desc       *string       // Overrides the tool description; nil uses the default
	CustomTool tool.BaseTool // Custom tool implementation; replaces the Backend default when set
	Disable    bool          // Set to true to skip registering this tool
}
Priority: Disable=true > CustomTool > Backend default implementation.
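The priority rule can be illustrated with a small, self-contained sketch. It does not use the Eino packages: Tool is a hypothetical stand-in for tool.BaseTool, and resolveTool is an illustrative helper, not a function exported by the middleware. The Name and Desc overrides are left out to keep the focus on selection order.

```go
package main

import "fmt"

// Tool is a hypothetical stand-in for tool.BaseTool.
type Tool interface{ Name() string }

type namedTool struct{ name string }

func (t namedTool) Name() string { return t.name }

// ToolConfig mirrors the documented fields, with Tool replacing tool.BaseTool.
type ToolConfig struct {
	Name       string
	Desc       *string
	CustomTool Tool
	Disable    bool
}

// resolveTool applies the documented priority:
// Disable=true > CustomTool > Backend default implementation.
// A nil result means the tool is not registered.
func resolveTool(cfg *ToolConfig, backendDefault Tool) Tool {
	if cfg == nil {
		return backendDefault
	}
	if cfg.Disable {
		return nil
	}
	if cfg.CustomTool != nil {
		return cfg.CustomTool
	}
	return backendDefault
}

func main() {
	def := namedTool{"grep"}
	custom := namedTool{"my_grep"}

	fmt.Println(resolveTool(nil, def).Name())                             // grep
	fmt.Println(resolveTool(&ToolConfig{CustomTool: custom}, def).Name()) // my_grep
	// Disable wins even when a CustomTool is also set.
	fmt.Println(resolveTool(&ToolConfig{CustomTool: custom, Disable: true}, def) == nil) // true
}
```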
Tool Name Constants
const (
	ToolNameLs        = "ls"
	ToolNameReadFile  = "read_file"
	ToolNameWriteFile = "write_file"
	ToolNameEditFile  = "edit_file"
	ToolNameGlob      = "glob"
	ToolNameGrep      = "grep"
	ToolNameExecute   = "execute"
)
Injected Tools
| Tool | Default Name | Registration Condition | Description |
| --- | --- | --- | --- |
| ls | ls | Backend ≠ nil | List files and subdirectories in a directory |
| read_file | read_file | Backend ≠ nil | Read file content, supports offset/limit pagination. When UseMultiModalRead is enabled, can read images and PDFs |
| write_file | write_file | Backend ≠ nil | Create or overwrite a file |
| edit_file | edit_file | Backend ≠ nil | Precise string replacement editing, supports replace_all |
| glob | glob | Backend ≠ nil | Match file paths by glob pattern |
| grep | grep | Backend ≠ nil | Regex search of file content, supports multiple output modes and pagination |
| execute | execute | Shell ≠ nil or StreamingShell ≠ nil | Execute shell commands |
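To make the offset/limit pagination of read_file concrete, here is a minimal sketch of line-based paging. It is not the Backend implementation: readLines is a hypothetical helper, and the conventions it uses (a 1-based line offset, and a limit of 0 or less meaning "no limit") are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// readLines returns up to limit lines of content, starting at the 1-based
// line number offset. Both conventions are assumptions made for this sketch.
func readLines(content string, offset, limit int) []string {
	lines := strings.Split(content, "\n")
	if offset < 1 {
		offset = 1
	}
	start := offset - 1
	if start >= len(lines) {
		return nil // offset is past the end of the file
	}
	end := len(lines)
	if limit > 0 && start+limit < end {
		end = start + limit
	}
	return lines[start:end]
}

func main() {
	content := "a\nb\nc\nd\ne"
	fmt.Println(readLines(content, 1, 2)) // [a b]
	fmt.Println(readLines(content, 3, 2)) // [c d]
	fmt.Println(readLines(content, 5, 2)) // [e]
	fmt.Println(readLines(content, 6, 2)) // []
}
```

An Agent paging through a large file would issue successive reads, advancing offset by limit each time until fewer than limit lines come back.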
Backend Interface
Backend is defined in the github.com/cloudwego/eino/adk/filesystem package. The middleware package re-exports request/response types via type aliases (e.g., ReadRequest, FileContent), but the Backend interface itself needs to be referenced from the adk/filesystem package.
type Backend interface {
	LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error)
	Read(ctx context.Context, req *ReadRequest) (*FileContent, error)
	GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error)
	GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error)
	Write(ctx context.Context, req *WriteRequest) error
	Edit(ctx context.Context, req *EditRequest) error
}
Shell and StreamingShell
type Shell interface {
	Execute(ctx context.Context, input *ExecuteRequest) (*ExecuteResponse, error)
}

type StreamingShell interface {
	ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (*schema.StreamReader[*ExecuteResponse], error)
}
These two are mutually exclusive: only one can be set. StreamingShell supports streaming output, which makes it suitable for long-running commands.
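The registration rule for the execute tool can be sketched as a small validation step. This is a standalone illustration: the Shell and StreamingShell interfaces below use simplified, hypothetical signatures rather than the real ExecuteRequest/ExecuteResponse types, and returning an error when both fields are set is an assumption about how the exclusivity is enforced.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified, hypothetical stand-ins for the real interfaces.
type Shell interface {
	Execute(cmd string) (string, error)
}

type StreamingShell interface {
	ExecuteStreaming(cmd string) (<-chan string, error)
}

type config struct {
	Shell          Shell
	StreamingShell StreamingShell
}

var errBothShells = errors.New("only one of Shell and StreamingShell may be set")

// executeMode reports which variant of the execute tool would be registered.
func executeMode(c config) (string, error) {
	switch {
	case c.Shell != nil && c.StreamingShell != nil:
		return "", errBothShells
	case c.Shell != nil:
		return "execute", nil
	case c.StreamingShell != nil:
		return "streaming execute", nil
	default:
		return "none", nil // no shell configured: execute is not registered
	}
}

type echoShell struct{}

func (echoShell) Execute(cmd string) (string, error) { return "ran: " + cmd, nil }

type echoStream struct{}

func (echoStream) ExecuteStreaming(cmd string) (<-chan string, error) {
	ch := make(chan string, 1)
	ch <- "ran: " + cmd
	close(ch)
	return ch, nil
}

func main() {
	fmt.Println(executeMode(config{}))
	fmt.Println(executeMode(config{Shell: echoShell{}}))
	fmt.Println(executeMode(config{StreamingShell: echoStream{}}))
	fmt.Println(executeMode(config{Shell: echoShell{}, StreamingShell: echoStream{}}))
}
```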
MultiModalReader Extension Interface
When UseMultiModalRead = true, the Backend needs to additionally implement the MultiModalReader interface:
type MultiModalReader interface {
	MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error)
}
Behavior:
- The read_file tool is upgraded from InvokableTool to EnhancedInvokableTool, returning multi-modal results via schema.ToolResult.Parts
- The default implementation supports reading image files (PNG, JPG, etc.) and PDF files (supports the pages parameter to specify page ranges, up to 20 pages at a time)
- The tool description automatically appends a multi-modal capability suffix; if the description is customized via ReadFileToolConfig.Desc, no suffix is appended
💡 When using ChatModelAgentMiddleware, you need to implement the WrapEnhancedInvokableToolCall method for the multi-modal read_file tool to work.
// MultiModalReadRequest extends ReadRequest
type MultiModalReadRequest struct {
	ReadRequest
	Pages string // PDF page range, e.g., "1-5", "3", "10-20"
}

// MultiFileContent is the returned result
type MultiFileContent struct {
	*FileContent                   // Plain text result
	Parts        []FileContentPart // Multi-modal result (mutually exclusive with FileContent; FileContent is ignored when Parts is non-empty)
}

type FileContentPart struct {
	Type     FileContentPartType // "image" or "pdf"
	MIMEType string              // e.g., "image/png", "application/pdf"
	Data     []byte              // Raw binary data
}
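A custom MultiModalReader has to interpret the Pages string itself. The sketch below shows one way to parse the documented forms ("3", "1-5", "10-20") and enforce the 20-page limit. parsePages and maxPagesPerRead are hypothetical names, and treating pages as 1-based and inclusive is an assumption; the default implementation may validate differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxPagesPerRead reflects the documented limit of 20 pages per call.
const maxPagesPerRead = 20

// parsePages parses "N" or "N-M" into an inclusive, 1-based page range.
func parsePages(s string) (first, last int, err error) {
	lo, hi, isRange := strings.Cut(strings.TrimSpace(s), "-")
	first, err = strconv.Atoi(strings.TrimSpace(lo))
	if err != nil {
		return 0, 0, fmt.Errorf("invalid page %q: %w", lo, err)
	}
	last = first
	if isRange {
		last, err = strconv.Atoi(strings.TrimSpace(hi))
		if err != nil {
			return 0, 0, fmt.Errorf("invalid page %q: %w", hi, err)
		}
	}
	if first < 1 || last < first {
		return 0, 0, fmt.Errorf("invalid page range %q", s)
	}
	if n := last - first + 1; n > maxPagesPerRead {
		return 0, 0, fmt.Errorf("range %q spans %d pages, limit is %d", s, n, maxPagesPerRead)
	}
	return first, last, nil
}

func main() {
	for _, s := range []string{"1-5", "3", "10-20", "1-30"} {
		first, last, err := parsePages(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> pages %d..%d\n", s, first, last)
	}
}
```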
Deprecated: Legacy Config and Large Result Offloading
💡 The following content only applies to the legacy NewMiddleware + Config path. The New / NewTyped path does not include the large result offloading feature.
The legacy Config provides an additional “Large Tool Result Offloading” mechanism on top of MiddlewareConfig:
| Field | Description |
| --- | --- |
| WithoutLargeToolResultOffloading bool | Set to true to disable offloading; defaults to false (enabled) |
| LargeToolResultOffloadingTokenLimit int | Token threshold; defaults to 20000 |
| LargeToolResultOffloadingPathGen func(ctx, *compose.ToolInput) (string, error) | Offloading path generation function; defaults to /large_tool_result/{ToolCallID} |
Trigger condition: Offloading is triggered when the character count of the tool’s return result exceeds tokenLimit × 4 (80,000 characters with the default limit of 20000).
Offloading behavior: The complete result is written to a file via Backend.Write, and the original return is replaced with a summary (first 10 lines + file path hint). The Agent can read the full result via read_file with pagination.
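The trigger condition and summary behavior can be sketched in a few lines. This is an illustration only and does not reproduce the legacy middleware's code: shouldOffload, summarize, and defaultPath are hypothetical helpers, the wording of the path hint is invented, and counting Unicode code points (rather than bytes) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

const (
	defaultTokenLimit = 20000 // documented default
	charsPerToken     = 4     // documented multiplier
	summaryLines      = 10    // documented summary length
)

// shouldOffload reports whether the result exceeds tokenLimit × 4 characters.
func shouldOffload(result string, tokenLimit int) bool {
	return utf8.RuneCountInString(result) > tokenLimit*charsPerToken
}

// defaultPath follows the documented pattern /large_tool_result/{ToolCallID}.
func defaultPath(toolCallID string) string {
	return "/large_tool_result/" + toolCallID
}

// summarize keeps the first 10 lines and appends a hint pointing at path.
func summarize(result, path string) string {
	lines := strings.SplitN(result, "\n", summaryLines+1)
	if len(lines) > summaryLines {
		lines = lines[:summaryLines]
	}
	return strings.Join(lines, "\n") +
		"\n[truncated; full result saved to " + path + ", use read_file to view it]"
}

func main() {
	big := strings.Repeat("line\n", 20000) // 100,000 characters

	fmt.Println(shouldOffload(big, defaultTokenLimit))     // true
	fmt.Println(shouldOffload("short", defaultTokenLimit)) // false

	if shouldOffload(big, defaultTokenLimit) {
		// In the real flow, the full result would be written via Backend.Write first.
		fmt.Println(summarize(big, defaultPath("call_1")))
	}
}
```

Note that a result of exactly 80,000 characters is not offloaded under this reading, since the condition is "exceeds".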