mirror of https://github.com/jlengrand/Maestro.git synced 2026-03-10 08:31:19 +00:00

Pedram Amini 2be8b5987d MAESTRO: Implement Phase 8.2 - Update documentation for multi-provider architecture

- CLAUDE.md: Added agent capabilities section, updated architecture diagram
  with parsers/ and storage/ directories, added agentSessions and agentError
  APIs, updated key files table, added error fields to Session interface
- AGENT_SUPPORT.md: Added multi-provider architecture status section showing
  all 7 completed components, updated supported agents reference with
  implementation status and detailed checklist for planned agents
- CONTRIBUTING.md: Updated supported agents reference table with status
  column and link to detailed guide

2025-12-16 21:50:19 -06:00

Adding Agent Support

This guide explains how to add support for a new AI coding agent (provider) in Maestro. It covers the architecture, required implementations, and step-by-step instructions.

Multi-Provider Architecture Status

Status: ✅ Foundation Complete (2025-12-16)

The multi-provider refactoring has established the pluggable architecture for supporting multiple AI agents:

Component	Status	Description
Capability System	✅ Complete	`AgentCapabilities` interface, capability gating in UI
Generic Identifiers	✅ Complete	`claudeSessionId` → `agentSessionId` across 47+ files
Session Storage	✅ Complete	`AgentSessionStorage` interface, Claude + OpenCode implementations
Output Parsers	✅ Complete	`AgentOutputParser` interface, Claude + OpenCode parsers
Error Handling	✅ Complete	`AgentError` types, detection patterns, recovery UI
IPC API	✅ Complete	`window.maestro.agentSessions.*` replaces `claude.*`
UI Capability Gates	✅ Complete	Features hidden/shown based on agent capabilities

Adding a New Agent

To add support for a new agent (e.g., Gemini CLI, Codex), follow these steps:

1. Add agent definition to src/main/agent-detector.ts
2. Define capabilities in src/main/agent-capabilities.ts
3. Create output parser in src/main/parsers/{agent}-output-parser.ts
4. Register parser in src/main/parsers/index.ts
5. (Optional) Create session storage in src/main/storage/{agent}-session-storage.ts
6. (Optional) Add error patterns to src/main/parsers/error-patterns.ts

See detailed instructions below.

Table of Contents

Vernacular
Architecture Overview
Agent Capability Model
Step-by-Step: Adding a New Agent
Implementation Details
Error Handling
Testing Your Agent
Supported Agents Reference

Vernacular

Use these terms consistently throughout the codebase:

Term	Definition
Maestro Agent	A configured AI assistant in Maestro (e.g., "My Claude Assistant")
Provider	The underlying AI service (Claude Code, OpenCode, Codex, Gemini CLI)
Provider Session	A conversation session managed by the provider (e.g., Claude's `session_id`)
Tab	A Maestro UI tab that maps 1:1 to a Provider Session

Hierarchy: Maestro Agent → Provider → Provider Sessions → Tabs

Architecture Overview

Maestro uses a pluggable architecture for AI agents. Each agent integrates through:

Agent Definition (src/main/agent-detector.ts) - CLI binary, arguments, detection
Capabilities (src/main/agent-capabilities.ts) - Feature flags controlling UI
Output Parser (src/main/parsers/) - Translates agent JSON to Maestro events
Session Storage (src/main/storage/) - Optional browsing of past sessions
Error Patterns (src/main/parsers/error-patterns.ts) - Error detection and recovery

┌─────────────────────────────────────────────────────────────┐
│                        Maestro UI                           │
│  (InputArea, MainPanel, AgentSessionsBrowser, etc.)        │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    Capability Gates                          │
│  useAgentCapabilities() → show/hide UI features             │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    ProcessManager                            │
│  Spawns agent, routes output through parser                 │
└─────────────────────────────────────────────────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            ▼                 ▼                 ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │ ClaudeOutput │  │ OpenCodeOut  │  │ YourAgent    │
    │ Parser       │  │ Parser       │  │ Parser       │
    └──────────────┘  └──────────────┘  └──────────────┘
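
The hand-off from ProcessManager to a parser in the diagram above can be sketched roughly as follows. This is a minimal illustration, not Maestro's actual ProcessManager code: `LineRouter` and `toyParser` are hypothetical names, and the sketch assumes the agent emits one JSON object per line on stdout. Its main point is that stdout chunks can split a JSON line in half, so partial lines are buffered until the newline arrives.

```typescript
// Reduced stand-ins for the guide's types; the real ones live in src/main/parsers/.
type ParsedEvent = { type: string; sessionId?: string; text?: string; raw: unknown };

interface AgentOutputParser {
  parseJsonLine(line: string): ParsedEvent | null;
}

// Hypothetical helper: buffers stdout chunks and emits one event per complete line.
class LineRouter {
  private buffer = '';
  private readonly parser: AgentOutputParser;
  private readonly onEvent: (event: ParsedEvent) => void;

  constructor(parser: AgentOutputParser, onEvent: (event: ParsedEvent) => void) {
    this.parser = parser;
    this.onEvent = onEvent;
  }

  // Call with each raw chunk read from the agent's stdout.
  push(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    // The last element is an incomplete line (or ''), so keep it for the next chunk.
    this.buffer = lines.pop() ?? '';
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === '') continue;
      const event = this.parser.parseJsonLine(trimmed);
      if (event) this.onEvent(event);
    }
  }
}

// Toy parser used only for this illustration.
const toyParser: AgentOutputParser = {
  parseJsonLine(line: string): ParsedEvent | null {
    try {
      const raw = JSON.parse(line);
      return raw.type === 'text' ? { type: 'text', text: raw.content, raw } : null;
    } catch {
      return null;
    }
  },
};

const received: ParsedEvent[] = [];
const router = new LineRouter(toyParser, (event) => received.push(event));
router.push('{"type":"text","con');                  // partial line: nothing emitted yet
router.push('tent":"Hello"}\n{"type":"noise"}\n');   // completes one text event, skips one unknown event
```

After both chunks arrive, `received` holds a single `text` event; the unknown `noise` event is dropped because the toy parser returns `null` for it.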

Agent Capability Model

Each agent declares capabilities that determine which UI features are available.

Capability Interface

// src/main/agent-capabilities.ts

interface AgentCapabilities {
  // Core features
  supportsResume: boolean;           // Can resume previous sessions
  supportsReadOnlyMode: boolean;     // Has a plan/read-only mode
  supportsJsonOutput: boolean;       // Emits structured JSON for parsing
  supportsSessionId: boolean;        // Emits session ID for tracking

  // Advanced features
  supportsImageInput: boolean;       // Can receive images in prompts
  supportsSlashCommands: boolean;    // Has discoverable slash commands
  supportsSessionStorage: boolean;   // Persists sessions we can browse
  supportsCostTracking: boolean;     // Reports token costs
  supportsUsageStats: boolean;       // Reports token counts

  // Streaming behavior
  supportsBatchMode: boolean;        // Runs per-message (vs persistent process)
  supportsStreaming: boolean;        // Streams output incrementally

  // Message classification
  supportsResultMessages: boolean;   // Distinguishes final result from intermediary
}

Capability-to-UI Feature Mapping

Capability	UI Feature	Behavior When False
`supportsReadOnlyMode`	Read-only toggle	Toggle hidden
`supportsSessionStorage`	Sessions browser tab	Tab hidden
`supportsResume`	Resume button	Button disabled
`supportsCostTracking`	Cost widget	Widget hidden
`supportsUsageStats`	Token usage display	Display hidden
`supportsImageInput`	Image attachment button	Button hidden
`supportsSlashCommands`	Slash command autocomplete	Autocomplete disabled
`supportsSessionId`	Session ID pill	Pill hidden
`supportsResultMessages`	Show only final result	Shows all messages
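
As a rough illustration of the table above, the "hidden when false" rows can be expressed as a lookup from capability flag to UI feature. The sketch below is not the real `useAgentCapabilities()` hook: `FEATURE_GATES` and `visibleFeatures` are hypothetical names, and only the six flags that hide a feature outright are included.

```typescript
// Subset of the capability flags: only those that hide a UI feature when false.
interface GatedCapabilities {
  supportsReadOnlyMode: boolean;
  supportsSessionStorage: boolean;
  supportsCostTracking: boolean;
  supportsUsageStats: boolean;
  supportsImageInput: boolean;
  supportsSessionId: boolean;
}

// Feature labels taken from the mapping table; the real component names may differ.
const FEATURE_GATES: Record<keyof GatedCapabilities, string> = {
  supportsReadOnlyMode: 'Read-only toggle',
  supportsSessionStorage: 'Sessions browser tab',
  supportsCostTracking: 'Cost widget',
  supportsUsageStats: 'Token usage display',
  supportsImageInput: 'Image attachment button',
  supportsSessionId: 'Session ID pill',
};

// Returns the features that should be rendered for the given capabilities.
function visibleFeatures(caps: GatedCapabilities): string[] {
  return (Object.keys(FEATURE_GATES) as (keyof GatedCapabilities)[])
    .filter((flag) => caps[flag])
    .map((flag) => FEATURE_GATES[flag]);
}

const exampleFeatures = visibleFeatures({
  supportsReadOnlyMode: true,
  supportsSessionStorage: false,
  supportsCostTracking: false,
  supportsUsageStats: true,
  supportsImageInput: false,
  supportsSessionId: false,
});
// exampleFeatures is ['Read-only toggle', 'Token usage display']
```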

Starting Point: All False

When adding a new agent, start with all capabilities set to false:

'your-agent': {
  supportsResume: false,
  supportsReadOnlyMode: false,
  supportsJsonOutput: false,
  supportsSessionId: false,
  supportsImageInput: false,
  supportsSlashCommands: false,
  supportsSessionStorage: false,
  supportsCostTracking: false,
  supportsUsageStats: false,
  supportsBatchMode: false,
  supportsStreaming: false,
  supportsResultMessages: false,
},

Then enable capabilities as you implement and verify each feature.
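
One way to keep the "all false, then opt in" approach explicit is to spread overrides onto a shared baseline. This is only a sketch: `NO_CAPABILITIES` and `defineCapabilities` are hypothetical helpers, not part of src/main/agent-capabilities.ts.

```typescript
interface AgentCapabilities {
  supportsResume: boolean;
  supportsReadOnlyMode: boolean;
  supportsJsonOutput: boolean;
  supportsSessionId: boolean;
  supportsImageInput: boolean;
  supportsSlashCommands: boolean;
  supportsSessionStorage: boolean;
  supportsCostTracking: boolean;
  supportsUsageStats: boolean;
  supportsBatchMode: boolean;
  supportsStreaming: boolean;
  supportsResultMessages: boolean;
}

// Baseline for a new agent: nothing is enabled until it has been verified.
const NO_CAPABILITIES: AgentCapabilities = {
  supportsResume: false,
  supportsReadOnlyMode: false,
  supportsJsonOutput: false,
  supportsSessionId: false,
  supportsImageInput: false,
  supportsSlashCommands: false,
  supportsSessionStorage: false,
  supportsCostTracking: false,
  supportsUsageStats: false,
  supportsBatchMode: false,
  supportsStreaming: false,
  supportsResultMessages: false,
};

// Hypothetical helper: list only the flags that have been verified as working.
function defineCapabilities(verified: Partial<AgentCapabilities>): AgentCapabilities {
  return { ...NO_CAPABILITIES, ...verified };
}

// Example: JSON output and streaming were verified first; everything else stays off.
const yourAgentCapabilities = defineCapabilities({
  supportsJsonOutput: true,
  supportsStreaming: true,
});
```

Written this way, the diff for each newly verified feature is a single added line.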

Step-by-Step: Adding a New Agent

Step 1: Agent Discovery

Before writing code, investigate your agent's CLI:

# Check for JSON output mode
your-agent --help | grep -i json
your-agent --help | grep -i format

# Check for session resume
your-agent --help | grep -i session
your-agent --help | grep -i resume
your-agent --help | grep -i continue

# Check for read-only/plan mode
your-agent --help | grep -i plan
your-agent --help | grep -i readonly
your-agent --help | grep -i permission

# Test JSON output
your-agent run --format json "say hello" 2>&1 | head -20

Document:

How to get JSON output
Session ID field name and format
How to resume a session
How to enable read-only mode
Token/usage reporting format
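
To help fill in the notes above, captured output can be summarized before any parser is written. The sketch below is a throwaway discovery aid, not part of Maestro: `summarizeJsonLines` is a hypothetical helper, and the sample lines are invented for illustration rather than taken from a real agent.

```typescript
interface OutputSummary {
  eventTypes: Record<string, number>; // count of each top-level "type" value
  sessionKeys: string[];              // top-level keys whose name contains "session"
  unparsedLines: number;              // lines that are not JSON objects
}

// Hypothetical discovery helper: tallies event types and candidate session ID fields.
function summarizeJsonLines(lines: string[]): OutputSummary {
  const summary: OutputSummary = { eventTypes: {}, sessionKeys: [], unparsedLines: 0 };
  for (const line of lines) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      summary.unparsedLines += 1;
      continue;
    }
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
      summary.unparsedLines += 1;
      continue;
    }
    const record = parsed as Record<string, unknown>;
    const type = typeof record['type'] === 'string' ? (record['type'] as string) : '(no type)';
    summary.eventTypes[type] = (summary.eventTypes[type] ?? 0) + 1;
    for (const key of Object.keys(record)) {
      if (/session/i.test(key) && summary.sessionKeys.indexOf(key) === -1) {
        summary.sessionKeys.push(key);
      }
    }
  }
  return summary;
}

// Invented sample output: two JSON events and one plain-text warning.
const sampleSummary = summarizeJsonLines([
  '{"type":"step_start","sessionID":"s1"}',
  '{"type":"text","sessionID":"s1","content":"hi"}',
  'warning: not json',
]);
```

For this sample, the summary reports one `step_start` event, one `text` event, `sessionID` as the only session-like key, and one unparsed line.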

Step 2: Add Agent Definition

Edit src/main/agent-detector.ts:

const AGENT_DEFINITIONS: AgentConfig[] = [
  // ... existing agents
  {
    id: 'your-agent',
    name: 'Your Agent',
    binaryName: 'your-agent',
    command: 'your-agent',
    args: [],

    // CLI argument builders
    batchModePrefix: ['run'],              // Subcommand for batch mode
    jsonOutputArgs: ['--format', 'json'],  // JSON output flag
    resumeArgs: (sessionId) => ['--session', sessionId],
    readOnlyArgs: ['--mode', 'readonly'],

    // Runtime (set by detection)
    available: false,
    path: undefined,
  },
];

Step 3: Define Capabilities

Edit src/main/agent-capabilities.ts:

const AGENT_CAPABILITIES: Record<string, AgentCapabilities> = {
  // ... existing agents
  'your-agent': {
    supportsResume: true,           // If --session works
    supportsReadOnlyMode: true,     // If readonly mode exists
    supportsJsonOutput: true,       // If JSON output works
    supportsSessionId: true,        // If session ID in output
    supportsImageInput: false,      // Start false, enable if supported
    supportsSlashCommands: false,
    supportsSessionStorage: false,  // Enable if you implement storage
    supportsCostTracking: false,    // Enable if API-based with costs
    supportsUsageStats: true,       // If token counts in output
    supportsBatchMode: true,
    supportsStreaming: true,
    supportsResultMessages: false,  // Enable if result vs intermediary distinction
  },
};

Step 4: Create Output Parser

Create src/main/parsers/your-agent-output-parser.ts:

import { AgentOutputParser, ParsedEvent } from './agent-output-parser';

export class YourAgentOutputParser implements AgentOutputParser {
  parseJsonLine(line: string): ParsedEvent | null {
    try {
      const event = JSON.parse(line);

      // Map your agent's event types to Maestro's ParsedEvent
      switch (event.type) {
        case 'your_text_event':
          return {
            type: 'text',
            sessionId: event.sessionId,
            text: event.content,
            raw: event,
          };

        case 'your_tool_event':
          return {
            type: 'tool_use',
            sessionId: event.sessionId,
            toolName: event.tool,
            toolState: event.state,
            raw: event,
          };

        case 'your_finish_event':
          return {
            type: 'result',
            sessionId: event.sessionId,
            text: event.finalText,
            usage: {
              input: event.tokens?.input ?? 0,
              output: event.tokens?.output ?? 0,
            },
            raw: event,
          };

        default:
          return null;
      }
    } catch {
      return null;
    }
  }

  isResultMessage(event: ParsedEvent): boolean {
    return event.type === 'result';
  }

  extractSessionId(event: ParsedEvent): string | null {
    return event.sessionId ?? null;
  }
}

Step 5: Register Parser in Factory

Edit src/main/parsers/agent-output-parser.ts:

import { YourAgentOutputParser } from './your-agent-output-parser';

export function getOutputParser(agentId: string): AgentOutputParser {
  switch (agentId) {
    case 'claude-code':
      return new ClaudeOutputParser();
    case 'opencode':
      return new OpenCodeOutputParser();
    case 'your-agent':
      return new YourAgentOutputParser();
    default:
      return new GenericOutputParser();
  }
}
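
If the switch grows unwieldy, a lookup table is a common alternative to it. The sketch below shows that alternative technique and is not how Maestro's factory is written: `registerParser`, `resolveParser`, and `PassthroughParser` are hypothetical names, and the types are reduced stand-ins for the real ones.

```typescript
type ParsedEvent = { type: string; text?: string; raw: unknown };

interface AgentOutputParser {
  parseJsonLine(line: string): ParsedEvent | null;
}

// Registry keyed by agent ID; each entry creates a fresh parser instance.
const parserFactories = new Map<string, () => AgentOutputParser>();

function registerParser(agentId: string, factory: () => AgentOutputParser): void {
  parserFactories.set(agentId, factory);
}

// Fallback used when no parser is registered: treats every line as plain text.
class PassthroughParser implements AgentOutputParser {
  parseJsonLine(line: string): ParsedEvent | null {
    return { type: 'text', text: line, raw: line };
  }
}

function resolveParser(agentId: string): AgentOutputParser {
  const factory = parserFactories.get(agentId);
  return factory ? factory() : new PassthroughParser();
}

// Example registration with a stub parser that ignores every line.
registerParser('your-agent', () => ({ parseJsonLine: () => null }));

const registeredResult = resolveParser('your-agent').parseJsonLine('anything');
const fallbackResult = resolveParser('unknown-agent').parseJsonLine('plain output');
```

The trade-off is that registration happens at runtime, so a missing `registerParser` call falls back silently instead of failing at compile time.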

Step 6: Add Error Patterns (Optional but Recommended)

Edit src/main/parsers/error-patterns.ts:

export const YOUR_AGENT_ERROR_PATTERNS = {
  auth_expired: [
    /authentication failed/i,
    /invalid.*key/i,
    /please login/i,
  ],
  token_exhaustion: [
    /context.*exceeded/i,
    /too many tokens/i,
  ],
  rate_limited: [
    /rate limit/i,
    /too many requests/i,
  ],
};

Step 7: Implement Session Storage (Optional)

If your agent stores sessions in browseable files, create src/main/storage/your-agent-session-storage.ts:

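// SessionMessage is assumed to be exported from the same module as AgentSession
import { AgentSessionStorage, AgentSession, SessionMessage } from '../agent-session-storage';

export class YourAgentSessionStorage implements AgentSessionStorage {
  async listSessions(projectPath: string): Promise<AgentSession[]> {
    // Find and parse session files
    const sessionDir = this.getSessionDir(projectPath);
    // ... implementation
    return []; // placeholder until files in sessionDir are parsed
  }

  async readSession(projectPath: string, sessionId: string): Promise<SessionMessage[]> {
    // Read and parse session file
    // ... implementation
    return []; // placeholder until the session file is parsed
  }

  // ... other methods (including getSessionDir)
}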

Step 8: Test Your Integration

# Run dev build
npm run dev

# Create a session with your agent
# 1. Open Maestro
# 2. Create new session, select your agent
# 3. Send a message
# 4. Verify output displays correctly
# 5. Test session resume (if supported)
# 6. Test read-only mode (if supported)

Implementation Details

Message Display Classification

Agents may emit intermediary messages (streaming, tool calls) and result messages (final response). Configure display behavior via supportsResultMessages:

supportsResultMessages	Behavior
`true`	Only show result messages prominently; collapse intermediary
`false`	Show all messages as they stream
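
The two behaviors in the table can be sketched as a small partitioning function. This is an illustration only: `planDisplay` and `DisplayPlan` are hypothetical names, the event type is a reduced version of `ParsedEvent`, and the real renderer may group messages differently.

```typescript
type DisplayEvent = {
  type: 'init' | 'text' | 'tool_use' | 'result' | 'error' | 'usage';
  text?: string;
  raw: unknown;
};

interface DisplayPlan {
  prominent: DisplayEvent[]; // shown in full
  collapsed: DisplayEvent[]; // intermediary messages, shown collapsed
}

// Hypothetical helper: decides which events are shown prominently.
function planDisplay(events: DisplayEvent[], supportsResultMessages: boolean): DisplayPlan {
  if (!supportsResultMessages) {
    // No result/intermediary distinction: show everything as it streams.
    return { prominent: events.slice(), collapsed: [] };
  }
  return {
    prominent: events.filter((event) => event.type === 'result'),
    collapsed: events.filter((event) => event.type !== 'result'),
  };
}

const sampleEvents: DisplayEvent[] = [
  { type: 'text', text: 'Reading files...', raw: {} },
  { type: 'tool_use', raw: {} },
  { type: 'result', text: 'Done.', raw: {} },
];

const withResults = planDisplay(sampleEvents, true);     // 1 prominent, 2 collapsed
const withoutResults = planDisplay(sampleEvents, false); // 3 prominent, 0 collapsed
```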

CLI Argument Builders

The AgentConfig supports several argument builder patterns:

interface AgentConfig {
  // Static arguments always included
  args: string[];

  // Subcommand prefix for batch mode (e.g., ['run'] for opencode)
  batchModePrefix?: string[];

  // Arguments for JSON output
  jsonOutputArgs?: string[];

  // Function to build resume arguments
  resumeArgs?: (sessionId: string) => string[];

  // Arguments for read-only mode
  readOnlyArgs?: string[];
}
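
To make the role of each builder concrete, here is one plausible way they could be combined into a final argument list. The ordering and the trailing positional prompt are assumptions for illustration, and `buildArgs` and `SpawnOptions` are hypothetical names, so check the actual spawn logic before relying on this.

```typescript
interface AgentConfigSketch {
  args: string[];
  batchModePrefix?: string[];
  jsonOutputArgs?: string[];
  resumeArgs?: (sessionId: string) => string[];
  readOnlyArgs?: string[];
}

interface SpawnOptions {
  prompt: string;
  sessionId?: string; // set when resuming an existing provider session
  readOnly?: boolean; // set when the read-only toggle is on
}

// Hypothetical combiner: prefix, static args, JSON flags, resume, read-only, then the prompt.
function buildArgs(config: AgentConfigSketch, options: SpawnOptions): string[] {
  const argv: string[] = [];
  argv.push(...(config.batchModePrefix ?? []));
  argv.push(...config.args);
  argv.push(...(config.jsonOutputArgs ?? []));
  if (options.sessionId && config.resumeArgs) {
    argv.push(...config.resumeArgs(options.sessionId));
  }
  if (options.readOnly && config.readOnlyArgs) {
    argv.push(...config.readOnlyArgs);
  }
  argv.push(options.prompt);
  return argv;
}

// Same values as the 'your-agent' definition in Step 2.
const yourAgentConfig: AgentConfigSketch = {
  args: [],
  batchModePrefix: ['run'],
  jsonOutputArgs: ['--format', 'json'],
  resumeArgs: (sessionId) => ['--session', sessionId],
  readOnlyArgs: ['--mode', 'readonly'],
};

const exampleArgv = buildArgs(yourAgentConfig, {
  prompt: 'say hello',
  sessionId: 'abc-123',
  readOnly: true,
});
// exampleArgv: ['run', '--format', 'json', '--session', 'abc-123', '--mode', 'readonly', 'say hello']
```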

ParsedEvent Types

Your output parser should emit these normalized event types:

type ParsedEvent = {
  type: 'init' | 'text' | 'tool_use' | 'result' | 'error' | 'usage';
  sessionId?: string;
  text?: string;
  toolName?: string;
  toolState?: any;
  usage?: { input: number; output: number; cacheRead?: number; cacheWrite?: number };
  slashCommands?: string[];
  raw: any;
};
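
As an example of consuming these normalized events, the sketch below totals token usage across a stream. It assumes each event reports its own usage rather than a running total, which varies by agent, and `totalUsage` is a hypothetical helper rather than part of Maestro.

```typescript
type Usage = { input: number; output: number; cacheRead?: number; cacheWrite?: number };

type UsageEvent = {
  type: 'init' | 'text' | 'tool_use' | 'result' | 'error' | 'usage';
  usage?: Usage;
  raw: unknown;
};

// Hypothetical helper: sums usage over all events, treating missing cache fields as 0.
function totalUsage(events: UsageEvent[]): Required<Usage> {
  const total: Required<Usage> = { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 };
  for (const event of events) {
    if (!event.usage) continue;
    total.input += event.usage.input;
    total.output += event.usage.output;
    total.cacheRead += event.usage.cacheRead ?? 0;
    total.cacheWrite += event.usage.cacheWrite ?? 0;
  }
  return total;
}

const usageTotals = totalUsage([
  { type: 'text', raw: {} },                                               // no usage attached
  { type: 'usage', usage: { input: 100, output: 20, cacheRead: 50 }, raw: {} },
  { type: 'result', usage: { input: 30, output: 10 }, raw: {} },
]);
// usageTotals: { input: 130, output: 30, cacheRead: 50, cacheWrite: 0 }
```

If an agent reports cumulative totals instead, keep only the last `usage` value rather than summing.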

Error Handling

Maestro has unified error handling for agent failures. Your agent should integrate with this system.

Error Types

Error Type	When to Detect
`auth_expired`	API key invalid, login required
`token_exhaustion`	Context window full
`rate_limited`	Too many requests
`network_error`	Connection failed
`agent_crashed`	Non-zero exit code
`permission_denied`	Operation not allowed

Adding Error Detection

In your output parser, implement the detectError method:

detectError(line: string): AgentError | null {
  for (const [errorType, patterns] of Object.entries(YOUR_AGENT_ERROR_PATTERNS)) {
    for (const pattern of patterns) {
      if (pattern.test(line)) {
        return {
          type: errorType as AgentError['type'],
          message: line,
          recoverable: errorType !== 'agent_crashed',
          agentId: 'your-agent',
          timestamp: Date.now(),
        };
      }
    }
  }
  return null;
}
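
Pattern matching covers errors that appear in the output, but `agent_crashed` is tied to the exit code rather than to a line of text. A minimal sketch of that case follows; `errorFromExit` is a hypothetical helper, and the `AgentError` shape here is inferred from the fields used in `detectError` above rather than copied from the real type.

```typescript
type AgentErrorType =
  | 'auth_expired'
  | 'token_exhaustion'
  | 'rate_limited'
  | 'network_error'
  | 'agent_crashed'
  | 'permission_denied';

// Shape inferred from the detectError example; the real type may have more fields.
interface AgentError {
  type: AgentErrorType;
  message: string;
  recoverable: boolean;
  agentId: string;
  timestamp: number;
}

// Hypothetical helper: maps a non-zero exit code to an agent_crashed error.
function errorFromExit(agentId: string, exitCode: number, lastStderrLine: string): AgentError | null {
  if (exitCode === 0) return null;
  const detail = lastStderrLine.trim();
  return {
    type: 'agent_crashed',
    message: detail === ''
      ? `Agent exited with code ${exitCode}`
      : `Agent exited with code ${exitCode}: ${detail}`,
    recoverable: false, // matches detectError, which treats agent_crashed as unrecoverable
    agentId,
    timestamp: Date.now(),
  };
}

const cleanExit = errorFromExit('your-agent', 0, '');
const crashedExit = errorFromExit('your-agent', 1, 'segmentation fault\n');
```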

Testing Your Agent

Unit Tests

Create src/__tests__/parsers/your-agent-output-parser.test.ts:

import { YourAgentOutputParser } from '../../main/parsers/your-agent-output-parser';

describe('YourAgentOutputParser', () => {
  const parser = new YourAgentOutputParser();

  it('parses text events', () => {
    const line = '{"type": "your_text_event", "sessionId": "123", "content": "Hello"}';
    const event = parser.parseJsonLine(line);

    expect(event).toEqual({
      type: 'text',
      sessionId: '123',
      text: 'Hello',
      raw: expect.any(Object),
    });
  });

  it('extracts session ID', () => {
    const event = { type: 'text' as const, sessionId: 'abc-123', raw: {} };
    expect(parser.extractSessionId(event)).toBe('abc-123');
  });

  it('detects auth errors', () => {
    const error = parser.detectError('Error: authentication failed');
    expect(error?.type).toBe('auth_expired');
  });
});

Integration Testing Checklist

Agent appears in agent selection dropdown
New session starts successfully
Output streams to AI Terminal
Session ID captured and displayed
Token usage updates (if applicable)
Session resume works (if applicable)
Read-only mode works (if applicable)
Error modal appears on auth/token errors
Auto Run works with your agent

Supported Agents Reference

Claude Code ✅ Fully Implemented

Aspect	Value
Binary	`claude`
JSON Output	`--output-format stream-json`
Resume	`--resume <session-id>`
Read-only	`--permission-mode plan`
Session ID Field	`session_id` (snake_case)
Session Storage	`~/.claude/projects/<encoded-path>/`

Implementation Status:

✅ Output Parser: src/main/parsers/claude-output-parser.ts
✅ Session Storage: src/main/storage/claude-session-storage.ts
✅ Error Patterns: src/main/parsers/error-patterns.ts
✅ All capabilities enabled

JSON Event Types:

system (init) → session_id, slash_commands
assistant → streaming content
result → final response, modelUsage

OpenCode 🔄 Stub Ready

Aspect	Value
Binary	`opencode`
JSON Output	`--format json`
Resume	`--session <session-id>`
Read-only	`--agent plan`
Session ID Field	`sessionID` (camelCase)
Session Storage	Server-managed

Implementation Status:

✅ Output Parser: src/main/parsers/opencode-output-parser.ts (based on expected format)
✅ Session Storage: src/main/storage/opencode-session-storage.ts (stub, returns empty results)
⏳ Error Patterns: Placeholder, needs real-world testing
⏳ Capabilities: Set to minimal defaults

JSON Event Types:

step_start → session start
text → streaming content
tool_use → tool invocations
step_finish → tokens, completion
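
The two implemented providers spell the session ID field differently (`session_id` for Claude Code, `sessionID` for OpenCode), which is exactly the kind of detail a parser normalizes. The sketch below illustrates that idea with a hypothetical `pickSessionId` helper; the real parsers handle this inside their own event mapping, and the sketch assumes the field sits at the top level of the event.

```typescript
// Field names from the reference tables; agent IDs as used in getOutputParser.
const SESSION_ID_FIELDS: Record<string, string> = {
  'claude-code': 'session_id', // snake_case
  'opencode': 'sessionID',     // camelCase
};

// Hypothetical helper: reads the provider-specific session ID field from a raw event.
function pickSessionId(agentId: string, raw: unknown): string | null {
  const field = SESSION_ID_FIELDS[agentId];
  if (!field || typeof raw !== 'object' || raw === null) return null;
  const value = (raw as Record<string, unknown>)[field];
  return typeof value === 'string' ? value : null;
}

const claudeId = pickSessionId('claude-code', { type: 'system', session_id: 'abc-123' });
const openCodeId = pickSessionId('opencode', { type: 'step_start', sessionID: 'xyz-789' });
const wrongField = pickSessionId('opencode', { type: 'step_start', session_id: 'abc-123' });
// claudeId is 'abc-123', openCodeId is 'xyz-789', wrongField is null
```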

Gemini CLI 📋 Planned

Status: Not yet implemented

To Add:

Agent definition in agent-detector.ts
Capabilities in agent-capabilities.ts
Output parser for Gemini JSON format
Error patterns for Google API errors

Codex 📋 Planned

Status: Not yet implemented

To Add:

Agent definition in agent-detector.ts
Capabilities in agent-capabilities.ts
Output parser for Codex JSON format
Error patterns for OpenAI API errors

Qwen3 Coder 📋 Planned

Status: Not yet implemented

To Add:

Agent definition in agent-detector.ts
Capabilities in agent-capabilities.ts (likely local model, no cost tracking)
Output parser for Qwen JSON format
Error patterns (likely minimal for local models)
