- Add isValidToolType() type guard to replace unsafe `as ToolType` casts - Make `raw` field optional in ParsedEvent interface since never read - Fix MockParser in tests to implement missing error detection methods - Add @internal JSDoc to test-only exports (hasOutputParser, getAllOutputParsers, clearParserRegistry) - Update AGENT_SUPPORT.md to reference canonical ParsedEvent source instead of duplicating type definition - Add tests for new isValidToolType function
29 KiB
Adding Agent Support
This guide explains how to add support for a new AI coding agent (provider) in Maestro. It covers the architecture, required implementations, and step-by-step instructions.
Multi-Provider Architecture Status
Status: ✅ Foundation Complete (2025-12-16)
The multi-provider refactoring has established the pluggable architecture for supporting multiple AI agents:
| Component | Status | Description |
|---|---|---|
| Capability System | ✅ Complete | AgentCapabilities interface, capability gating in UI |
| Generic Identifiers | ✅ Complete | claudeSessionId → agentSessionId across 47+ files |
| Session Storage | ✅ Complete | AgentSessionStorage interface, Claude + OpenCode implementations |
| Output Parsers | ✅ Complete | AgentOutputParser interface, Claude + OpenCode parsers |
| Error Handling | ✅ Complete | AgentError types, detection patterns, recovery UI |
| IPC API | ✅ Complete | window.maestro.agentSessions.* replaces claude.* |
| UI Capability Gates | ✅ Complete | Features hidden/shown based on agent capabilities |
Adding a New Agent
To add support for a new agent (e.g., Gemini CLI, Codex), follow these steps:
- Add agent definition to
src/main/agent-detector.ts - Define capabilities in
src/main/agent-capabilities.ts - Create output parser in
src/main/parsers/{agent}-output-parser.ts - Register parser in
src/main/parsers/index.ts - (Optional) Create session storage in
src/main/storage/{agent}-session-storage.ts - (Optional) Add error patterns to
src/main/parsers/error-patterns.ts
See detailed instructions below.
Table of Contents
- Vernacular
- Architecture Overview
- Agent Capability Model
- Step-by-Step: Adding a New Agent
- Implementation Details
- Error Handling
- Testing Your Agent
- Supported Agents Reference
Vernacular
Use these terms consistently throughout the codebase:
| Term | Definition |
|---|---|
| Maestro Agent | A configured AI assistant in Maestro (e.g., "My Claude Assistant") |
| Provider | The underlying AI service (Claude Code, OpenCode, Codex, Gemini CLI) |
| Provider Session | A conversation session managed by the provider (e.g., Claude's session_id) |
| Tab | A Maestro UI tab that maps 1:1 to a Provider Session |
Hierarchy: Maestro Agent → Provider → Provider Sessions → Tabs
Architecture Overview
Maestro uses a pluggable architecture for AI agents. Each agent integrates through:
- Agent Definition (
src/main/agent-detector.ts) - CLI binary, arguments, detection - Capabilities (
src/main/agent-capabilities.ts) - Feature flags controlling UI - Output Parser (
src/main/parsers/) - Translates agent JSON to Maestro events - Session Storage (
src/main/storage/) - Optional browsing of past sessions - Error Patterns (
src/main/parsers/error-patterns.ts) - Error detection and recovery
┌─────────────────────────────────────────────────────────────┐
│ Maestro UI │
│ (InputArea, MainPanel, AgentSessionsBrowser, etc.) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Capability Gates │
│ useAgentCapabilities() → show/hide UI features │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ ProcessManager │
│ Spawns agent, routes output through parser │
└─────────────────────────────────────────────────────────────┘
│
┌─────────────────┼─────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ ClaudeOutput │ │ OpenCodeOut │ │ YourAgent │
│ Parser │ │ Parser │ │ Parser │
└──────────────┘ └──────────────┘ └──────────────┘
Agent Capability Model
Each agent declares capabilities that determine which UI features are available.
Capability Interface
// src/main/agent-capabilities.ts
interface AgentCapabilities {
// Core features
supportsResume: boolean; // Can resume previous sessions
supportsReadOnlyMode: boolean; // Has a plan/read-only mode
supportsJsonOutput: boolean; // Emits structured JSON for parsing
supportsSessionId: boolean; // Emits session ID for tracking
// Advanced features
supportsImageInput: boolean; // Can receive images in prompts
supportsSlashCommands: boolean; // Has discoverable slash commands
supportsSessionStorage: boolean; // Persists sessions we can browse
supportsCostTracking: boolean; // Reports token costs
supportsUsageStats: boolean; // Reports token counts
// Streaming behavior
supportsBatchMode: boolean; // Runs per-message (vs persistent process)
supportsStreaming: boolean; // Streams output incrementally
// Message classification
supportsResultMessages: boolean; // Distinguishes final result from intermediary
}
Capability-to-UI Feature Mapping
| Capability | UI Feature | Hidden When False |
|---|---|---|
supportsReadOnlyMode |
Read-only toggle | Toggle hidden |
supportsSessionStorage |
Sessions browser tab | Tab hidden |
supportsResume |
Resume button | Button disabled |
supportsCostTracking |
Cost widget | Widget hidden |
supportsUsageStats |
Token usage display | Display hidden |
supportsImageInput |
Image attachment button | Button hidden |
supportsSlashCommands |
Slash command autocomplete | Autocomplete disabled |
supportsSessionId |
Session ID pill | Pill hidden |
supportsResultMessages |
Show only final result | Shows all messages |
Context Window Configuration
For agents where context window size varies by model (like OpenCode or Codex), Maestro provides a user-configurable setting:
Configuration Location: Settings → Agent Configuration → Context Window Size
How It Works:
- Parser-reported value: If the agent reports
contextWindowin JSON output, that value takes priority - User configuration: If the parser doesn't report context window, the user-configured value is used
- Hidden when zero: If no value is configured (0), the context usage widget is hidden entirely
Agent-Specific Behavior:
| Agent | Default Context Window | Notes |
|---|---|---|
| Claude Code | 200,000 | Always reported in JSON output |
| Codex | 200,000 | Default for GPT-5.x models; user can override in settings |
| OpenCode | 128,000 | Default for common models (GPT-4, etc.); user can override in settings |
Adding Context Window Config to an Agent:
// In agent-detector.ts, add to configOptions:
configOptions: [
{
key: 'contextWindow',
type: 'number',
label: 'Context Window Size',
description: 'Maximum context window size in tokens. Required for context usage display.',
default: 128000, // Set a sane default for the agent's typical model
},
],
The value is passed to ProcessManager.spawn() and used when emitting usage stats if the parser doesn't provide a context window value.
Starting Point: All False
When adding a new agent, start with all capabilities set to false:
'your-agent': {
supportsResume: false,
supportsReadOnlyMode: false,
supportsJsonOutput: false,
supportsSessionId: false,
supportsImageInput: false,
supportsSlashCommands: false,
supportsSessionStorage: false,
supportsCostTracking: false,
supportsUsageStats: false,
supportsBatchMode: false,
supportsStreaming: false,
supportsResultMessages: false,
},
Then enable capabilities as you implement and verify each feature.
Step-by-Step: Adding a New Agent
Step 1: Agent Discovery
Before writing code, investigate your agent's CLI:
# Check for JSON output mode
your-agent --help | grep -i json
your-agent --help | grep -i format
# Check for session resume
your-agent --help | grep -i session
your-agent --help | grep -i resume
your-agent --help | grep -i continue
# Check for read-only/plan mode
your-agent --help | grep -i plan
your-agent --help | grep -i readonly
your-agent --help | grep -i permission
# Test JSON output
your-agent run --format json "say hello" 2>&1 | head -20
Document:
- How to get JSON output
- Session ID field name and format
- How to resume a session
- How to enable read-only mode
- Token/usage reporting format
Step 2: Add Agent Definition
Edit src/main/agent-detector.ts:
const AGENT_DEFINITIONS: AgentConfig[] = [
// ... existing agents
{
id: 'your-agent',
name: 'Your Agent',
binaryName: 'your-agent',
command: 'your-agent',
args: [],
// CLI argument builders
batchModePrefix: ['run'], // Subcommand for batch mode
jsonOutputArgs: ['--format', 'json'], // JSON output flag
resumeArgs: (sessionId) => ['--session', sessionId],
readOnlyArgs: ['--mode', 'readonly'],
// Runtime (set by detection)
available: false,
path: undefined,
},
];
Step 3: Define Capabilities
Edit src/main/agent-capabilities.ts:
const AGENT_CAPABILITIES: Record<string, AgentCapabilities> = {
// ... existing agents
'your-agent': {
supportsResume: true, // If --session works
supportsReadOnlyMode: true, // If readonly mode exists
supportsJsonOutput: true, // If JSON output works
supportsSessionId: true, // If session ID in output
supportsImageInput: false, // Start false, enable if supported
supportsSlashCommands: false,
supportsSessionStorage: false, // Enable if you implement storage
supportsCostTracking: false, // Enable if API-based with costs
supportsUsageStats: true, // If token counts in output
supportsBatchMode: true,
supportsStreaming: true,
supportsResultMessages: false, // Enable if result vs intermediary distinction
},
};
Step 4: Create Output Parser
Create src/main/parsers/your-agent-output-parser.ts:
import { AgentOutputParser, ParsedEvent } from './agent-output-parser';
export class YourAgentOutputParser implements AgentOutputParser {
parseJsonLine(line: string): ParsedEvent | null {
try {
const event = JSON.parse(line);
// Map your agent's event types to Maestro's ParsedEvent
switch (event.type) {
case 'your_text_event':
return {
type: 'text',
sessionId: event.sessionId,
text: event.content,
raw: event,
};
case 'your_tool_event':
return {
type: 'tool_use',
sessionId: event.sessionId,
toolName: event.tool,
toolState: event.state,
raw: event,
};
case 'your_finish_event':
return {
type: 'result',
sessionId: event.sessionId,
text: event.finalText,
usage: {
input: event.tokens?.input ?? 0,
output: event.tokens?.output ?? 0,
},
raw: event,
};
default:
return null;
}
} catch {
return null;
}
}
isResultMessage(event: ParsedEvent): boolean {
return event.type === 'result';
}
extractSessionId(event: ParsedEvent): string | null {
return event.sessionId ?? null;
}
}
Step 5: Register Parser in Factory
Edit src/main/parsers/agent-output-parser.ts:
import { YourAgentOutputParser } from './your-agent-output-parser';
export function getOutputParser(agentId: string): AgentOutputParser {
switch (agentId) {
case 'claude-code':
return new ClaudeOutputParser();
case 'opencode':
return new OpenCodeOutputParser();
case 'your-agent':
return new YourAgentOutputParser();
default:
return new GenericOutputParser();
}
}
Step 6: Add Error Patterns (Optional but Recommended)
Edit src/main/parsers/error-patterns.ts:
export const YOUR_AGENT_ERROR_PATTERNS = {
auth_expired: [
/authentication failed/i,
/invalid.*key/i,
/please login/i,
],
token_exhaustion: [
/context.*exceeded/i,
/too many tokens/i,
],
rate_limited: [
/rate limit/i,
/too many requests/i,
],
};
Step 7: Implement Session Storage (Optional)
If your agent stores sessions in browseable files, create src/main/storage/your-agent-session-storage.ts:
import { AgentSessionStorage, AgentSession } from '../agent-session-storage';
export class YourAgentSessionStorage implements AgentSessionStorage {
async listSessions(projectPath: string): Promise<AgentSession[]> {
// Find and parse session files
const sessionDir = this.getSessionDir(projectPath);
// ... implementation
}
async readSession(projectPath: string, sessionId: string): Promise<SessionMessage[]> {
// Read and parse session file
// ... implementation
}
// ... other methods
}
Step 8: Test Your Integration
# Run dev build
npm run dev
# Create a session with your agent
# 1. Open Maestro
# 2. Create new session, select your agent
# 3. Send a message
# 4. Verify output displays correctly
# 5. Test session resume (if supported)
# 6. Test read-only mode (if supported)
Implementation Details
Message Display Classification
Agents may emit intermediary messages (streaming, tool calls) and result messages (final response). Configure display behavior via supportsResultMessages:
| supportsResultMessages | Behavior |
|---|---|
true |
Only show result messages prominently; collapse intermediary |
false |
Show all messages as they stream |
CLI Argument Builders
The AgentConfig supports several argument builder patterns:
interface AgentConfig {
// Static arguments always included
args: string[];
// Subcommand prefix for batch mode (e.g., ['run'] for opencode)
batchModePrefix?: string[];
// Arguments for JSON output
jsonOutputArgs?: string[];
// Function to build resume arguments
resumeArgs?: (sessionId: string) => string[];
// Arguments for read-only mode
readOnlyArgs?: string[];
}
ParsedEvent Types
Your output parser should emit these normalized event types. See src/main/parsers/agent-output-parser.ts for the canonical ParsedEvent interface definition.
Key event types:
init- Agent initialization (may contain session ID, available commands)text- Text content to display to usertool_use- Agent is using a tool (file read, bash, etc.)result- Final result/response from agenterror- Error occurredusage- Token usage statisticssystem- System-level messages (not user-facing content)
Import the interface directly rather than defining your own:
import { type ParsedEvent } from './agent-output-parser';
Error Handling
Maestro has unified error handling for agent failures. Your agent should integrate with this system.
Error Types
| Error Type | When to Detect |
|---|---|
auth_expired |
API key invalid, login required |
token_exhaustion |
Context window full |
rate_limited |
Too many requests |
network_error |
Connection failed |
agent_crashed |
Non-zero exit code |
permission_denied |
Operation not allowed |
Adding Error Detection
In your output parser, implement the detectError method:
detectError(line: string): AgentError | null {
for (const [errorType, patterns] of Object.entries(YOUR_AGENT_ERROR_PATTERNS)) {
for (const pattern of patterns) {
if (pattern.test(line)) {
return {
type: errorType as AgentError['type'],
message: line,
recoverable: errorType !== 'agent_crashed',
agentId: 'your-agent',
timestamp: Date.now(),
};
}
}
}
return null;
}
Testing Your Agent
Unit Tests
Create src/__tests__/parsers/your-agent-output-parser.test.ts:
import { YourAgentOutputParser } from '../../main/parsers/your-agent-output-parser';
describe('YourAgentOutputParser', () => {
const parser = new YourAgentOutputParser();
it('parses text events', () => {
const line = '{"type": "your_text_event", "sessionId": "123", "content": "Hello"}';
const event = parser.parseJsonLine(line);
expect(event).toEqual({
type: 'text',
sessionId: '123',
text: 'Hello',
raw: expect.any(Object),
});
});
it('extracts session ID', () => {
const event = { type: 'text', sessionId: 'abc-123', raw: {} };
expect(parser.extractSessionId(event)).toBe('abc-123');
});
it('detects auth errors', () => {
const error = parser.detectError('Error: authentication failed');
expect(error?.type).toBe('auth_expired');
});
});
Integration Testing Checklist
- Agent appears in agent selection dropdown
- New session starts successfully
- Output streams to AI Terminal
- Session ID captured and displayed
- Token usage updates (if applicable)
- Session resume works (if applicable)
- Read-only mode works (if applicable)
- Error modal appears on auth/token errors
- Auto Run works with your agent
Supported Agents Reference
Claude Code ✅ Fully Implemented
| Aspect | Value |
|---|---|
| Binary | claude |
| JSON Output | --output-format stream-json |
| Resume | --resume <session-id> |
| Read-only | --permission-mode plan |
| Session ID Field | session_id (snake_case) |
| Session Storage | ~/.claude/projects/<encoded-path>/ |
Implementation Status:
- ✅ Output Parser:
src/main/parsers/claude-output-parser.ts - ✅ Session Storage:
src/main/storage/claude-session-storage.ts - ✅ Error Patterns:
src/main/parsers/error-patterns.ts - ✅ All capabilities enabled
JSON Event Types:
system(init) → session_id, slash_commandsassistant→ streaming contentresult→ final response, modelUsage
OpenCode 🔄 Stub Ready
| Aspect | Value |
|---|---|
| Binary | opencode |
| JSON Output | --format json |
| Resume | --session <session-id> |
| Read-only | --agent plan |
| Session ID Field | sessionID (camelCase) |
| Session Storage | ✅ File-based (see below) |
| YOLO Mode | ✅ Auto-enabled in batch mode |
| Model Selection | --model provider/model |
| Config File | ~/.config/opencode/opencode.json or project opencode.json |
YOLO Mode (Auto-Approval) Details:
OpenCode automatically approves all tool operations in batch mode (opencode run). Per official documentation:
- Batch mode behavior: "All permissions are auto-approved for the session" when running non-interactively
- No explicit flag needed: Unlike Claude Code's
--dangerously-skip-permissions, OpenCode'srunsubcommand inherently auto-approves - Permission defaults: Most tools run without approval by default; only
doom_loopandexternal_directoryrequire explicit approval in interactive mode - Configurable permissions: Advanced users can customize via
opencode.jsonwith granular tool-level controls (allow,ask,deny) - Read-only operations: Tools like
view,glob,grep,ls, anddiagnosticsnever require approval
This makes OpenCode suitable for Maestro's batch processing use case without additional configuration.
Session Storage Details:
OpenCode stores session data in ~/.local/share/opencode/storage/ with the following structure:
~/.local/share/opencode/
├── log/ # Log files
├── snapshot/ # Git-style snapshots
└── storage/
├── project/ # Project metadata (JSON per project)
│ └── {projectID}.json # Contains: id, worktree path, vcs info, timestamps
├── session/ # Session metadata (organized by project)
│ ├── global/ # Sessions not tied to a specific project
│ │ └── {sessionID}.json # Session info: id, version, projectID, title, timestamps
│ └── {projectID}/ # Project-specific sessions
│ └── {sessionID}.json
├── message/ # Message metadata (organized by session)
│ └── {sessionID}/ # One folder per session
│ └── {messageID}.json # Message info: role, time, model, tokens, etc.
└── part/ # Message parts (content chunks)
└── {messageID}/ # One folder per message
└── {partID}.json # Part content: type (text/tool/reasoning), text, etc.
Key findings:
- CLI Commands:
opencode session list,opencode export <sessionID>,opencode import <file> - Project IDs: SHA1 hash of project path (e.g.,
ca85ff7c488724e85fc5b4be14ba44a0f6ce5b40) - Session IDs: Format
ses_{base62-ish}(e.g.,ses_4d585107dffeO9bO3HvMdvLYyC) - Message IDs: Format
msg_{base62-ish}(e.g.,msg_b2a7aef8d001MjwADMqsUcIj3k) - Export format:
opencode export <sessionID>outputs complete session JSON with all messages and parts - Message parts include:
text,reasoning,tool,step-start, etc. - Token tracking: Available in message metadata with
input,output,reasoning, and cache fields
Implementation Status:
- ✅ Output Parser:
src/main/parsers/opencode-output-parser.ts(based on expected format) - ⏳ Session Storage:
src/main/storage/opencode-session-storage.ts(stub, needs implementation using storage paths above) - ⏳ Error Patterns: Placeholder, needs real-world testing
- ⏳ Capabilities: Set to minimal defaults;
supportsSessionStoragecan be enabled once storage is implemented
JSON Event Types:
step_start→ session start (includes snapshot reference)text→ streaming contentreasoning→ model thinking/chain-of-thoughttool→ tool invocations with state (running/complete)step_finish→ tokens, completion
Provider & Model Configuration:
OpenCode supports 75+ LLM providers including local models via Ollama, LM Studio, and llama.cpp. Configuration is stored in:
- Global config:
~/.config/opencode/opencode.json - Per-project config:
opencode.jsonin project root - Custom path: Via
OPENCODE_CONFIGenvironment variable
Configuration files are merged, with project config overriding global config for conflicting keys.
Ollama Setup Example:
{
"$schema": "https://opencode.ai/config.json",
"model": "ollama/qwen3:8b-16k",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama (local)",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen3:8b-16k": {
"name": "Qwen3 8B",
"tools": true
}
}
}
}
}
Key Configuration Options:
npm: Provider package (use@ai-sdk/openai-compatiblefor OpenAI-compatible APIs)options.baseURL: API endpoint URLmodels.<id>.tools: Enable tool calling support (critical for agentic use)models.<id>.limit.context: Max input tokensmodels.<id>.limit.output: Max output tokens
Context Window Configuration (Ollama):
Ollama defaults to 4096 context regardless of model capability. To increase context:
# Create a model variant with larger context
ollama run qwen3:8b
/set parameter num_ctx 16384
/save qwen3:8b-16k
Then reference the custom model name in OpenCode config.
Other Local Provider Examples:
// LM Studio
"lmstudio": {
"npm": "@ai-sdk/openai-compatible",
"options": { "baseURL": "http://127.0.0.1:1234/v1" }
}
// llama.cpp
"llamacpp": {
"npm": "@ai-sdk/openai-compatible",
"options": { "baseURL": "http://127.0.0.1:8080/v1" }
}
Model Selection Methods:
- Command-line:
opencode run --model ollama/qwen3:8b-16k "prompt" - Config file: Set
"model": "provider/model"in opencode.json - Interactive: Use
/modelscommand in interactive mode
Model ID format: provider_id/model_id (e.g., ollama/llama2, anthropic/claude-sonnet-4-5)
Maestro Integration Considerations:
Since OpenCode supports multiple providers/models, Maestro should consider:
- Model selection UI: Add model dropdown when OpenCode is selected, populated from config or
opencode modelscommand - Default config generation: Optionally generate
~/.config/opencode/opencode.jsonfor Ollama on first use - Per-session model: Pass
--modelflag based on user selection - Provider status: Detect which providers are configured and available
Documentation Sources:
Gemini CLI 📋 Planned
Status: Not yet implemented
To Add:
- Agent definition in
agent-detector.ts - Capabilities in
agent-capabilities.ts - Output parser for Gemini JSON format
- Error patterns for Google API errors
Codex ✅ Fully Implemented
| Aspect | Value |
|---|---|
| Binary | codex |
| JSON Output | --json |
| Batch Mode | exec subcommand |
| Resume | resume <thread_id> (v0.30.0+) |
| Read-only | --sandbox read-only |
| YOLO Mode | --dangerously-bypass-approvals-and-sandbox (enabled by default) |
| Session ID Field | thread_id (from thread.started event) |
| Session Storage | ~/.codex/sessions/YYYY/MM/DD/*.jsonl |
| Context Window | 128K tokens |
| Pricing | o4-mini: $1.10/$4.40 per million tokens (input/output) |
Implementation Status:
- ✅ Output Parser:
src/main/parsers/codex-output-parser.ts(42 tests) - ✅ Session Storage:
src/main/storage/codex-session-storage.ts(8 tests) - ✅ Error Patterns:
src/main/parsers/error-patterns.ts(25 tests) - ✅ All capabilities enabled
JSON Event Types:
thread.started→ session_id (thread_id), initializationturn.started→ processing indicatoritem.completed (agent_message)→ final text responseitem.completed (reasoning)→ model thinking (partial text)item.completed (tool_call)→ tool invocationitem.completed (tool_result)→ tool outputturn.completed→ token usage (input_tokens,output_tokens,reasoning_output_tokens,cached_input_tokens)
Unique Features:
- Reasoning Tokens: Reports
reasoning_output_tokensseparately fromoutput_tokens, displayed in UI - Three Sandbox Levels:
read-only,workspace-write,danger-full-access - Cached Input Discount: 75% discount on cached input tokens ($0.275/million)
- YOLO Mode Default: Full system access enabled by default in Maestro
Command Line Pattern:
# Basic execution
codex exec --json -C /path/to/project "prompt"
# With YOLO mode (default in Maestro)
codex exec --json --dangerously-bypass-approvals-and-sandbox -C /path/to/project "prompt"
# Resume session
codex exec --json resume <thread_id> "continue"
Documentation Sources:
Qwen3 Coder 📋 Planned
Status: Not yet implemented
To Add:
- Agent definition in
agent-detector.ts - Capabilities in
agent-capabilities.ts(likely local model, no cost tracking) - Output parser for Qwen JSON format
- Error patterns (likely minimal for local models)