Ouroboros Architecture

System Overview

Ouroboros is a specification-first AI workflow engine that transforms vague ideas into validated specifications before execution. Built on event sourcing with a rich TUI interface, it provides complete lifecycle management from requirements to evaluation.

Agent OS terminology is intentionally locked so kernel-level PRs do not blur runtime context, control contracts, transport, and observability. See Agent OS Kernel Terminology for the canonical meanings of AgentRuntimeContext, ControlPlane, ControlContract, Directive, ControlBus, and IOJournal.

┌─────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                            USERLEVEL PROGRAMS LAYER  (Section 7)                                     │
│        Installable plugins:  github-pr-ops  ·  jira-sync  ·  release-coordinator   ...               │
│        First-party programs: ooo auto       ·  ooo run    ·  ooo pm                                  │
└──────────────────────────────────────────────┬──────────────────────────────────────────────────────┘
                                               │ declared manifest contract / declared scopes
                                               ▼
┌─────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                          OUROBOROS ARCHITECTURE  (core, Sections 1–6)                                │
├─────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                     │
│  ┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐                 │
│  │  SKILLS & AGENTS    │     │      CORE LAYER     │     │      PRESENTATION    │                 │
│  │     REGISTRY        │     │                     │     │      LAYER           │                 │
│  │  ┌───────────────┐  │     │  ┌───────────────┐  │     │  ┌───────────────┐  │                 │
│  │  │   Skills      │──┼─────┼─▶│   Seed Spec    │──┼─────┼─▶│   TUI Dashboard │  │                 │
│  │  │   (14)        │  │     │  │   (Immutable)  │  │     │  │   (Textual)   │  │                 │
│  │  └───────────────┘  │     │  └───────────────┘  │     │  └───────────────┘  │                 │
│  │                     │     │                     │     │                     │                 │
│  │  ┌───────────────┐  │     │  ┌───────────────┐  │     │  ┌───────────────┐  │                 │
│  │  │   Agents      │──┼─────┼─▶│  Acceptance    │──┼─────┼─▶│   CLI Interface│  │                 │
│  │  │   (9)         │  │     │  │  Criteria Tree │  │     │  │   (Typer)    │  │                 │
│  │  └───────────────┘  │     │  └───────────────┘  │     │  └───────────────┘  │                 │
│  └─────────────────────┘     └─────────────────────┘     └─────────────────────┘                 │
│           │                         │                         │                                 │
│           └─────────────────────────┼─────────────────────────┘                                 │
│                                      │                                                         │
│           ┌─────────────────────────┼─────────────────────────┐                                 │
│           │                         │                         │                                 │
│  ┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐                 │
│  │    EXECUTION LAYER   │     │    STATE LAYER     │     │    ORCHESTRATION    │                 │
│  │                     │     │                     │     │      LAYER         │                 │
│  │  ┌───────────────┐  │     │  ┌───────────────┐  │     │  ┌───────────────┐  │                 │
│  │  │ 7 Execution  │  │     │  │ Event Store  │  │     │  │ 6-Phase       │  │                 │
│  │  │   Modes      │  │     │  │  (SQLite)    │  │     │  │ Pipeline      │  │                 │
│  │  └───────────────┘  │     │  └───────────────┘  │     │  └───────────────┘  │                 │
│  │                     │     │                     │     │                     │                 │
│  │  ┌───────────────┐  │     │  │ Checkpoint   │  │     │  │ PAL Router    │  │                 │
│  │  │ Model Router │  │     │  │   Store      │  │     │  │ (Cost Opt.)   │  │                 │
│  │  └───────────────┘  │     │  └───────────────┘  │     │  └───────────────┘  │                 │
│  └─────────────────────┘     └─────────────────────┘     └─────────────────────┘                 │
│                                                                                                     │
└─────────────────────────────────────────────────────────────────────────────────────────────────────┘

Core Components Overview

1. Skills & Agents Registry

Auto-discovery of bundled skills and agents that ship with Ouroboros core

  • Skills: 14 core workflow skills (interview, seed, run, evaluate, evolve, cancel, unstuck, update, help, setup, ralph, tutorial, welcome, status)
  • Agents: 9 specialized agents for different thinking modes
  • Hot-reload capabilities without restart
  • Magic prefix detection (/ouroboros:)

Note: This layer is the in-process registry of bundled skills and agents that ship with core. It is not the user-installable UserLevel plugin layer (introduced in #725). User-installable plugins live one layer above this — see Section 7: UserLevel Programs Layer below. The full layer-model and contract are described in issue #725; a long-form RFC is being drafted in PR #743 and will land at docs/rfc/userlevel-plugins.md once merged. Do not conflate the two registries in PRs.

2. Core Layer

Immutable data models and specifications

  • Seed: Immutable frozen Pydantic model
  • Acceptance Criteria Tree: Recursive decomposition with MECE principle
  • Ontology schema: Structural validation
  • Version tracking and ambiguity scoring

3. Execution Layer

Evolutionary execution with feedback loops

  • Self-referential persistence loop with verification
  • Dependency-aware parallel execution
  • Automatic scaling and resilience

4. State Layer

Event sourcing for complete auditability

  • SQLite event store with append-only writes
  • Full replay capability
  • Checkpoint system with compression
  • 5 optimized indexes for performance

5. Orchestration Layer

6-phase pipeline ensuring comprehensive execution

  • Phase 0: Big Bang (Interview → Seed)
  • Phase 1: PAL Router (Cost optimization)
  • Phase 2: Double Diamond (Discover → Define → Design → Deliver)
  • Phase 3: Resilience (Lateral thinking)
  • Phase 4: Evaluation (3-stage pipeline)
  • Phase 5: Secondary Loop (TODO registry)

6. Presentation Layer

Rich TUI interface with real-time visibility

  • Textual-based dashboard with live updates
  • AC tree visualization with progress tracking
  • Agent activity monitor
  • Cost tracking and drift visualization
  • Interactive debugging capabilities

7. UserLevel Programs Layer

Workflows composed on top of core primitives via a declared manifest contract — both first-party programs that ship today (ooo auto, ooo run, ooo pm) and a planned third-party plugin surface (introduced in #725; long-form RFC in flight in PR #743, which will land at docs/rfc/userlevel-plugins.md). The third-party install surface (ooo plugin add ...) is not yet implemented on main; this section documents the architectural target so PRs can be discussed in consistent terms.

This layer is distinct from the in-process Skills & Agents Registry (Section 1):

  • Skills & Agents Registry = in-process subsystem of bundled skills/agents that ship with core. Discovered via /ouroboros: magic prefix. Used by ooo interview, ooo qa, etc.
  • UserLevel Programs Layer = workflows composed on top of core primitives via a declared manifest contract. First-party programs shipped today: ooo auto, ooo run, ooo pm. Installable third-party plugins (e.g. github-pr-ops, merge-assistant, jira-sync) are planned: the manifest schema is being prototyped at Q00/ouroboros-plugins/schemas/0.1/ and the ooo plugin add <repo-url> install surface is tracked under #725; it does not exist on main yet. First-party and installable programs will share the same manifest format.
+----------------------------------------------------------------------+
| UserLevel Programs Layer                                             |
|                                                                      |
|   Shipped (first-party):     ooo auto   ooo run   ooo pm             |
|   Planned (third-party):     github-pr-ops  jira-sync  ...  (#725)   |
+--------------------------+-------------------------------------------+
                           | declared manifest contract
                           v
                      Ouroboros core
                      (Sections 1–6 above)

Why a separate layer? To keep ooo auto coherent (goal → interview → Seed → handoff only), keep core small, and route domain-specific operational workflows (GitHub PR ops, Jira triage, Slack incident response) into pluggable packages rather than into core or ooo auto.

Manifest contract: see Q00/ouroboros-plugins/schemas/0.1/.

Terminology guard: in this codebase, "plugin" by itself can refer to either the in-process Skills/Agents Registry or the UserLevel Programs Layer. When writing PRs or docs, qualify the term — e.g. "skill plugin" (Section 1) vs "UserLevel plugin" (Section 7). When in doubt, link the relevant section.

Philosophy

The Problem

Human requirements arrive ambiguous, incomplete, contradictory, and surface-level. If AI executes such input directly, the result is GIGO (Garbage In, Garbage Out).

The Solution

Ouroboros applies two ancient methods to transmute irrational input into executable truth:

  1. Socratic Questioning - Reveals hidden assumptions, exposes contradictions, challenges the obvious
  2. Ontological Analysis - Finds the root problem, separates essential from accidental, maps the structure of being

The Six Phases

Phase 0: BIG BANG         -> Crystallize requirements into a Seed
Phase 1: PAL ROUTER       -> Select appropriate model tier
Phase 2: DOUBLE DIAMOND   -> Decompose and execute tasks
Phase 3: RESILIENCE       -> Handle stagnation with lateral thinking
Phase 4: EVALUATION       -> Verify outputs at three stages
Phase 5: SECONDARY LOOP   -> Process deferred TODOs
         ↺ (cycle back as needed)

Phase 0: Big Bang

The Big Bang phase transforms vague ideas into crystallized specifications through iterative questioning. The seed is auto-generated at the end of this phase — users do not need to author seeds manually in the normal flow.

Components:

  • bigbang/interview.py — InterviewEngine for conducting Socratic interviews
  • bigbang/ambiguity.py — Ambiguity score calculation
  • bigbang/seed_generator.py — Seed generation from interview results

Process:

  1. User provides initial context/idea (ooo interview "..." in Claude Code, or via MCP tools)
  2. Engine asks clarifying questions (up to MAX_INTERVIEW_ROUNDS)
  3. The ambiguity score is recalculated after each response
  4. The interview completes when ambiguity <= 0.2
  5. An immutable Seed is auto-generated and stored in ~/.ouroboros/seeds/

Gate: Ambiguity <= 0.2

Phase 1: PAL Router (Progressive Adaptive LLM)

The PAL Router selects the most cost-effective model tier based on task complexity.

Components:

  • routing/router.py - Main routing logic
  • routing/complexity.py - Task complexity estimation
  • routing/tiers.py - Model tier definitions
  • routing/escalation.py - Escalation logic on failure
  • routing/downgrade.py - Downgrade logic on success

Tiers:

| Tier | Cost | Complexity Threshold |
|------|------|----------------------|
| FRUGAL | 1x | < 0.4 |
| STANDARD | 10x | < 0.7 |
| FRONTIER | 30x | >= 0.7 or critical |

Strategy: Start frugal, escalate only on failure.

Complexity Scoring Algorithm:

The complexity score is a weighted sum of three normalized factors:

| Factor | Weight | Normalization | Threshold |
|--------|--------|---------------|-----------|
| Token count | 30% | min(tokens / 4000, 1.0) | 4000 tokens |
| Tool dependencies | 30% | min(tools / 5, 1.0) | 5 tools |
| AC nesting depth | 40% | min(depth / 5, 1.0) | depth 5 |

complexity = 0.30 * norm_tokens + 0.30 * norm_tools + 0.40 * norm_depth
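
The weighted sum and the tier thresholds above can be sketched as follows. This is a minimal illustration, not the code in routing/complexity.py or routing/tiers.py: the names `Tier`, `complexity_score`, and `select_tier` are hypothetical, and treating the thresholds as half-open intervals is an assumption.

```python
from enum import Enum


class Tier(Enum):
    FRUGAL = "frugal"
    STANDARD = "standard"
    FRONTIER = "frontier"


def complexity_score(tokens: int, tools: int, depth: int) -> float:
    """Weighted sum of the three factors, each capped at 1.0."""
    norm_tokens = min(tokens / 4000, 1.0)
    norm_tools = min(tools / 5, 1.0)
    norm_depth = min(depth / 5, 1.0)
    return 0.30 * norm_tokens + 0.30 * norm_tools + 0.40 * norm_depth


def select_tier(score: float, critical: bool = False) -> Tier:
    """Map a score to a tier; critical tasks always get FRONTIER."""
    if critical or score >= 0.7:
        return Tier.FRONTIER
    if score >= 0.4:
        return Tier.STANDARD
    return Tier.FRUGAL


# 2000 tokens, 1 tool, depth 1: 0.30*0.5 + 0.30*0.2 + 0.40*0.2 = 0.29
tier = select_tier(complexity_score(tokens=2000, tools=1, depth=1))
```

Because AC nesting depth carries the largest weight, a deeply nested criterion can reach the STANDARD range even when its prompt is short and it needs few tools.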

Escalation Path:

When a task fails consecutively at its current tier (threshold: 2 failures), it escalates:

Frugal → Standard → Frontier → Stagnation Event (triggers resilience)

Downgrade Path:

After sustained success (threshold: 5 consecutive successes), the tier downgrades:

Frontier → Standard → Frugal

Similar task patterns (Jaccard similarity >= 0.80) inherit tier preferences from previously successful tasks.
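
A rough sketch of the escalation and downgrade counters, plus the Jaccard check, is shown below. `TierTracker` and `jaccard` are hypothetical names, not the API of routing/escalation.py or routing/downgrade.py. Resetting both counters on every tier change, and comparing tasks as sets of string features, are assumptions made for illustration.

```python
TIERS = ["frugal", "standard", "frontier"]
ESCALATE_AFTER_FAILURES = 2
DOWNGRADE_AFTER_SUCCESSES = 5
SIMILARITY_THRESHOLD = 0.80


class TierTracker:
    """Tracks consecutive outcomes for one task and moves it between tiers."""

    def __init__(self, tier: str = "frugal") -> None:
        self.tier = tier
        self.failures = 0
        self.successes = 0
        self.stagnated = False  # set when Frontier itself keeps failing

    def record(self, success: bool) -> str:
        if success:
            self.successes += 1
            self.failures = 0
            if self.successes >= DOWNGRADE_AFTER_SUCCESSES:
                self._move(-1)
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= ESCALATE_AFTER_FAILURES:
                if self.tier == TIERS[-1]:
                    # Nothing left to escalate to: hand over to resilience.
                    self.stagnated = True
                self._move(+1)
        return self.tier

    def _move(self, step: int) -> None:
        index = TIERS.index(self.tier) + step
        self.tier = TIERS[max(0, min(index, len(TIERS) - 1))]
        self.failures = 0
        self.successes = 0


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


tracker = TierTracker()
tracker.record(False)
tracker.record(False)  # second consecutive failure: frugal -> standard
inherits = jaccard({"parse", "yaml", "cli", "test"},
                   {"parse", "yaml", "cli", "test", "docs"}) >= SIMILARITY_THRESHOLD
```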

Phase 2: Double Diamond

The execution phase uses the Double Diamond design process with recursive decomposition.

Components:

  • execution/double_diamond.py - Four-phase execution cycle
  • execution/decomposition.py - Hierarchical task decomposition
  • execution/atomicity.py - Atomicity detection for tasks
  • execution/subagent.py - Isolated subagent execution

Four Phases:

  1. Discover (divergent) - Explore the problem space broadly
  2. Define (convergent) - Converge on the core problem
  3. Design (divergent) - Explore solution approaches
  4. Deliver (convergent) - Converge on implementation

Recursive Decomposition:

Each AC goes through Discover and Define, then atomicity is checked:

  • Atomic (single-focused, 1-2 files) → proceed to Design and Deliver
  • Non-atomic → decompose into 2-5 child ACs, recurse on each child

Key constraints:

  • MAX_DEPTH = 5 — hard recursion limit
  • COMPRESSION_DEPTH = 3 — context truncated to 500 chars at depth 3+
  • Children are dependency-sorted and executed in parallel within each level

For the current recursive execution flow, see parallel_executor.py and runner.py.
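
The recursion and its two depth constants can be illustrated with the sketch below. It is not the logic in execution/decomposition.py: `ACNode`, `decompose`, and the `is_atomic` / `split` callbacks are hypothetical, children are visited sequentially rather than in parallel, and treating a node at MAX_DEPTH as atomic is an assumption about how the hard limit is enforced.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_DEPTH = 5
COMPRESSION_DEPTH = 3
COMPRESSED_CONTEXT_CHARS = 500


@dataclass
class ACNode:
    text: str
    depth: int = 0
    children: List["ACNode"] = field(default_factory=list)


def decompose(
    node: ACNode,
    context: str,
    is_atomic: Callable[[ACNode, str], bool],
    split: Callable[[ACNode, str], List[str]],
) -> List[ACNode]:
    """Return the atomic leaves under `node`, in depth-first order."""
    if node.depth >= COMPRESSION_DEPTH:
        # Deep nodes only see a truncated view of the parent context.
        context = context[:COMPRESSED_CONTEXT_CHARS]
    if node.depth >= MAX_DEPTH or is_atomic(node, context):
        return [node]
    child_texts = split(node, context)
    if not 2 <= len(child_texts) <= 5:
        raise ValueError("a non-atomic AC must split into 2-5 children")
    leaves: List[ACNode] = []
    for text in child_texts:
        child = ACNode(text, node.depth + 1)
        node.children.append(child)
        leaves.extend(decompose(child, context, is_atomic, split))
    return leaves


# Worst case: nothing is ever atomic, so the depth limit ends the recursion.
root = ACNode("build the CLI")
leaves = decompose(
    root,
    "x" * 2000,
    is_atomic=lambda node, ctx: False,
    split=lambda node, ctx: [node.text + ".a", node.text + ".b"],
)
```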

Phase 3: Resilience

When execution stalls, the resilience system detects stagnation and applies lateral thinking.

Components:

  • resilience/stagnation.py - Stagnation detection (4 patterns)
  • resilience/lateral.py - Persona rotation and lateral thinking

Stagnation Patterns (4):

| Pattern | Detection | Default Threshold |
|---------|-----------|-------------------|
| SPINNING | Same output hash repeated (SHA-256) | 3 repetitions |
| OSCILLATION | A→B→A→B alternating pattern | 2 cycles |
| NO_DRIFT | Drift score unchanging (epsilon < 0.01) | 3 iterations |
| DIMINISHING_RETURNS | Progress improvement rate < 0.01 | 3 iterations |

Detection is stateless — all state passed via ExecutionHistory (phase outputs, error signatures, drift scores).
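
Three of the four patterns can be sketched as pure functions over a history, which is what makes stateless detection possible. The function names are hypothetical and do not mirror resilience/stagnation.py. Reading "unchanging" as "the spread of the last N drift scores is below epsilon" is an assumption.

```python
import hashlib
from typing import List


def _digest(output: str) -> str:
    return hashlib.sha256(output.encode("utf-8")).hexdigest()


def is_spinning(outputs: List[str], repetitions: int = 3) -> bool:
    """True when the last `repetitions` outputs hash to the same value."""
    if len(outputs) < repetitions:
        return False
    return len({_digest(o) for o in outputs[-repetitions:]}) == 1


def is_oscillating(outputs: List[str], cycles: int = 2) -> bool:
    """True when the tail alternates A, B, A, B for `cycles` full cycles."""
    needed = 2 * cycles
    if len(outputs) < needed:
        return False
    tail = [_digest(o) for o in outputs[-needed:]]
    a, b = tail[0], tail[1]
    if a == b:
        return False  # identical outputs are SPINNING, not OSCILLATION
    return all(h == (a if i % 2 == 0 else b) for i, h in enumerate(tail))


def has_no_drift(drift_scores: List[float], epsilon: float = 0.01,
                 iterations: int = 3) -> bool:
    """True when the last `iterations` drift scores differ by less than epsilon."""
    if len(drift_scores) < iterations:
        return False
    tail = drift_scores[-iterations:]
    return max(tail) - min(tail) < epsilon


spinning = is_spinning(["draft", "same", "same", "same"])
oscillating = is_oscillating(["plan A", "plan B", "plan A", "plan B"])
```

Each function takes the full history as an argument and keeps nothing between calls, so the same history always yields the same verdict.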

Personas (5):

| Persona | Strategy | Best For (Affinity) |
|---------|----------|---------------------|
| HACKER | Unconventional workarounds | SPINNING |
| RESEARCHER | Seek more information | NO_DRIFT, DIMINISHING_RETURNS |
| SIMPLIFIER | Reduce complexity | DIMINISHING_RETURNS, OSCILLATION |
| ARCHITECT | Restructure fundamentally | OSCILLATION, NO_DRIFT |
| CONTRARIAN | Challenge all assumptions | All patterns |

Each persona generates a thinking prompt (not a solution). suggest_persona_for_pattern() recommends the best persona for a given stagnation type based on these affinities.
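
The affinity table can be read as a lookup from stagnation pattern to candidate personas. The sketch below is a hypothetical helper, `candidates_for`, and not the real suggest_persona_for_pattern(), whose signature and tie-breaking rules are not described here. Listing personas in table order and using CONTRARIAN as the last resort are assumptions.

```python
from typing import FrozenSet, List

# Affinities copied from the table; CONTRARIAN applies to every pattern.
AFFINITY = {
    "HACKER": {"SPINNING"},
    "RESEARCHER": {"NO_DRIFT", "DIMINISHING_RETURNS"},
    "SIMPLIFIER": {"DIMINISHING_RETURNS", "OSCILLATION"},
    "ARCHITECT": {"OSCILLATION", "NO_DRIFT"},
}
FALLBACK = "CONTRARIAN"


def candidates_for(pattern: str,
                   already_tried: FrozenSet[str] = frozenset()) -> List[str]:
    """Personas with an affinity for `pattern`, skipping ones already rotated through."""
    ordered = [name for name, patterns in AFFINITY.items() if pattern in patterns]
    ordered.append(FALLBACK)
    return [name for name in ordered if name not in already_tried]


first_choice = candidates_for("OSCILLATION")[0]
after_rotation = candidates_for("OSCILLATION", frozenset({"SIMPLIFIER"}))
```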

Phase 4: Evaluation

Three-stage progressive evaluation ensures quality while minimizing cost.

Components:

  • evaluation/pipeline.py - Evaluation pipeline orchestration
  • evaluation/mechanical.py - Stage 1: Mechanical checks
  • evaluation/semantic.py - Stage 2: Semantic verification
  • evaluation/consensus.py - Stage 3: Multi-model consensus
  • evaluation/trigger.py - Consensus trigger matrix

Stages:

  1. Mechanical ($0) — Lint, build, test, static analysis, coverage (threshold: 70%)
    • Auto-detects project language from marker files (e.g., uv.lock → Python/uv, Cargo.toml → Rust, go.mod → Go, package-lock.json → Node). Supported: Python, Rust, Go, Zig, Node (npm/pnpm/bun/yarn).
    • Projects can override or extend commands via .ouroboros/mechanical.toml. Overrides are validated against an executable allowlist for security in CI/CD environments.
    • If no language is detected, Stage 1 checks are skipped and evaluation proceeds to Stage 2.
    • If any check fails → pipeline stops, returns failure
  2. Semantic ($$) — AC compliance, goal alignment, drift, uncertainty scoring
    • If score >= 0.8 and no trigger → approved without consensus
    • Uses Standard tier model (temperature: 0.2)
  3. Consensus ($$$) — Multi-model voting, only when triggered by 1 of 6 conditions
    • Simple mode: 3 models vote (GPT-4o, Claude Sonnet 4, Gemini 2.5 Pro), 2/3 majority required
    • Deliberative mode: Advocate/Devil's Advocate/Judge roles with ontological questioning

6 Consensus Trigger Conditions (checked in priority order):

  1. Seed modification (seeds are immutable — any change requires consensus)
  2. Ontology evolution (schema changes affect output structure)
  3. Goal reinterpretation
  4. Seed drift > 0.3
  5. Stage 2 uncertainty > 0.3
  6. Lateral thinking adoption
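
The stage gating and the priority-ordered triggers can be sketched as below. This is an illustration under assumptions, not the code in evaluation/pipeline.py or evaluation/trigger.py: `Signals`, `first_trigger`, and `evaluate` are hypothetical names, votes are plain booleans, and a sub-0.8 semantic score with no trigger is treated here as a rejection, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

SEMANTIC_APPROVAL_SCORE = 0.8
DRIFT_TRIGGER = 0.3
UNCERTAINTY_TRIGGER = 0.3


@dataclass(frozen=True)
class Signals:
    seed_modified: bool = False
    ontology_evolved: bool = False
    goal_reinterpreted: bool = False
    seed_drift: float = 0.0
    uncertainty: float = 0.0
    lateral_adopted: bool = False


def first_trigger(s: Signals) -> Optional[str]:
    """Return the highest-priority trigger that fired, or None."""
    checks = [
        ("seed_modification", s.seed_modified),
        ("ontology_evolution", s.ontology_evolved),
        ("goal_reinterpretation", s.goal_reinterpreted),
        ("seed_drift", s.seed_drift > DRIFT_TRIGGER),
        ("stage2_uncertainty", s.uncertainty > UNCERTAINTY_TRIGGER),
        ("lateral_thinking_adoption", s.lateral_adopted),
    ]
    for name, fired in checks:
        if fired:
            return name
    return None


def evaluate(mechanical_passed: bool, semantic_score: float,
             signals: Signals, votes: Sequence[bool] = ()) -> str:
    if not mechanical_passed:
        return "failed:mechanical"  # Stage 1 failure stops the pipeline
    trigger = first_trigger(signals)
    if trigger is None:
        # Assumption: a low score with no trigger is a plain rejection.
        if semantic_score >= SEMANTIC_APPROVAL_SCORE:
            return "approved"
        return "failed:semantic"
    approvals = sum(1 for vote in votes if vote)
    if votes and 3 * approvals >= 2 * len(votes):  # 2/3 majority
        return f"approved:consensus({trigger})"
    return f"failed:consensus({trigger})"


outcome = evaluate(True, 0.9, Signals(seed_drift=0.4), votes=[True, True, False])
```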

For the current evaluation flow, see pipeline.py and definitions.py.

For failure modes, error-handling guidance, and configuration reference, see the Evaluation Pipeline Guide.

Phase 5: Secondary Loop

Non-critical tasks are deferred to maintain focus on the primary goal.

Components:

  • secondary/todo_registry.py - TODO item tracking
  • secondary/scheduler.py - Batch processing scheduler

Process:

  1. During execution, non-blocking TODOs are registered
  2. After the primary goal completes, TODOs are batch-processed
  3. Low-priority tasks are executed during idle time

Module Structure

src/ouroboros/
|
+-- core/           # Foundation: types, errors, seed, context
|   +-- types.py       # Result type, type aliases
|   +-- errors.py      # Error hierarchy
|   +-- seed.py        # Immutable Seed specification
|   +-- context.py     # Workflow context management
|   +-- ac_tree.py     # Acceptance criteria tree
|
+-- bigbang/        # Phase 0: Interview and seed generation
+-- routing/        # Phase 1: PAL router
+-- execution/      # Phase 2: Double Diamond execution
+-- resilience/     # Phase 3: Stagnation and lateral thinking
+-- evaluation/     # Phase 4: Three-stage evaluation
+-- secondary/      # Phase 5: TODO registry and scheduling
|
+-- orchestrator/   # Runtime abstraction and orchestration
|   +-- adapter.py     # AgentRuntime protocol, ClaudeAgentAdapter
|   +-- codex_cli_runtime.py  # CodexCliRuntime adapter
|   +-- runtime_factory.py    # create_agent_runtime() factory
|   +-- runner.py      # Orchestration logic
|   +-- session.py     # Session state tracking
|   +-- events.py      # Orchestrator events
|   +-- mcp_tools.py   # MCP tool provider for external tools
|   +-- mcp_config.py  # MCP client configuration loading
|
+-- mcp/            # Model Context Protocol integration
|   +-- client/        # MCP client for external servers
|   +-- server/        # MCP server exposing Ouroboros
|   +-- tools/         # Tool definitions and registry
|   +-- resources/     # Resource handlers
|
+-- providers/      # LLM provider adapters
|   +-- base.py        # Provider protocol
|   +-- litellm_adapter.py  # LiteLLM integration
|
+-- persistence/    # Event sourcing and checkpoints
|   +-- event_store.py # Event storage
|   +-- checkpoint.py  # Checkpoint/recovery
|   +-- schema.py      # Database schema
|
+-- observability/  # Logging and monitoring
|   +-- logging.py     # Structured logging
|   +-- drift.py       # Drift measurement
|   +-- retrospective.py  # Automatic retrospectives
|
+-- config/         # Configuration management
+-- cli/            # Command-line interface

Core Concepts

The Seed

The Seed is the "constitution" of a workflow — an immutable specification with:

  • Goal — Primary objective
  • Constraints — Hard requirements that must be satisfied
  • Acceptance Criteria — Specific criteria for success
  • Ontology Schema — Structure of workflow outputs
  • Exit Conditions — When to terminate

In the normal flow, seeds are auto-generated by the Socratic interview (ooo interview in Claude Code, or via MCP tools). Most users never need to create or edit a seed manually — the interview handles crystallization automatically.

Once generated, the Seed cannot be modified (frozen Pydantic model).
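
To show what "frozen" means in practice, here is a small sketch that uses a standard-library frozen dataclass as a stand-in for the frozen Pydantic model. It is not the class in core/seed.py: the field types, the `version` field, and the use of `dataclasses.replace` are assumptions, and the ontology schema is omitted for brevity.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Seed:
    goal: str
    constraints: Tuple[str, ...] = ()
    acceptance_criteria: Tuple[str, ...] = ()
    exit_conditions: Tuple[str, ...] = ()
    version: int = 1  # hypothetical field, for illustration only


seed = Seed(
    goal="Ship a CLI todo app",
    acceptance_criteria=("add a task", "list tasks"),
)

try:
    seed.goal = "Ship a web app"  # in-place mutation is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

# A changed direction is a new object; the original stays intact.
next_seed = replace(seed, goal="Ship a web app", version=seed.version + 1)
```

Tuples are used instead of lists so that the collections inside the seed are immutable as well as the seed's attributes.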

Advanced: For power users who want to hand-craft or edit seed YAML directly, see the Seed Authoring Guide.

Result Type

Ouroboros uses a Result type for handling expected failures without exceptions:

result: Result[int, str] = Result.ok(42)
# or
result: Result[int, str] = Result.err("something went wrong")

if result.is_ok:
    process(result.value)
else:
    handle_error(result.error)
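
A minimal Result type that supports the usage above might look like the following. It is a sketch, not the class in core/types.py: only `Result.ok`, `Result.err`, `is_ok`, `value`, and `error` come from the snippet above, while the private fields, the raised exceptions, and the `parse_port` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Result(Generic[T, E]):
    _value: Optional[T] = None
    _error: Optional[E] = None
    _ok: bool = True

    @classmethod
    def ok(cls, value: T) -> "Result[T, E]":
        return cls(_value=value, _ok=True)

    @classmethod
    def err(cls, error: E) -> "Result[T, E]":
        return cls(_error=error, _ok=False)

    @property
    def is_ok(self) -> bool:
        return self._ok

    @property
    def value(self) -> T:
        if not self._ok:
            raise ValueError("value accessed on an error Result")
        return self._value  # type: ignore[return-value]

    @property
    def error(self) -> E:
        if self._ok:
            raise ValueError("error accessed on an ok Result")
        return self._error  # type: ignore[return-value]


def parse_port(raw: str) -> Result[int, str]:
    """Expected failures come back as values instead of exceptions."""
    if not raw.isdigit():
        return Result.err(f"not a number: {raw!r}")
    port = int(raw)
    if not 0 < port < 65536:
        return Result.err(f"out of range: {port}")
    return Result.ok(port)


good = parse_port("8080")
bad = parse_port("http")
```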

Event Sourcing

All state changes are persisted as immutable events in a single SQLite table (events) via SQLAlchemy Core:

  • Event types use dot-notation past tense (e.g., orchestrator.session.started, orchestrator.session.completed)
  • Append-only — events can never be modified or deleted
  • Unit of Work pattern groups events + checkpoint into atomic commits
  • Replay capability — reconstruct any session by replaying its events

Enables:

  • Full audit trail
  • Checkpoint/recovery (3-level rollback depth, 5-minute periodic checkpointing)
  • Session resumption
  • Retrospective analysis

Event Schema:

  • Single events table with columns: id (UUID), aggregate_type, aggregate_id, event_type, payload (JSON), timestamp, consensus_id
  • 5 indexes: aggregate_type, aggregate_id, (aggregate_type, aggregate_id) composite, event_type, timestamp
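
The schema, append-only writes, and replay can be sketched with the standard-library sqlite3 module. The project itself uses SQLAlchemy Core, so this is a stand-in rather than the code in persistence/event_store.py. Column types, index names, and the `open_store` / `append_event` / `replay` helpers are assumptions, and the Unit of Work and checkpointing are left out.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id             TEXT PRIMARY KEY,
    aggregate_type TEXT NOT NULL,
    aggregate_id   TEXT NOT NULL,
    event_type     TEXT NOT NULL,
    payload        TEXT NOT NULL,
    timestamp      TEXT NOT NULL,
    consensus_id   TEXT
);
CREATE INDEX IF NOT EXISTS ix_events_aggregate_type ON events (aggregate_type);
CREATE INDEX IF NOT EXISTS ix_events_aggregate_id   ON events (aggregate_id);
CREATE INDEX IF NOT EXISTS ix_events_aggregate      ON events (aggregate_type, aggregate_id);
CREATE INDEX IF NOT EXISTS ix_events_event_type     ON events (event_type);
CREATE INDEX IF NOT EXISTS ix_events_timestamp      ON events (timestamp);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append_event(conn, aggregate_type, aggregate_id, event_type, payload) -> str:
    """Insert one event; this sketch exposes no update or delete path."""
    event_id = str(uuid.uuid4())
    timestamp = datetime.now(timezone.utc).isoformat(timespec="microseconds")
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO events (id, aggregate_type, aggregate_id, event_type,"
            " payload, timestamp) VALUES (?, ?, ?, ?, ?, ?)",
            (event_id, aggregate_type, aggregate_id, event_type,
             json.dumps(payload), timestamp),
        )
    return event_id


def replay(conn, aggregate_type, aggregate_id):
    """Return (event_type, payload) pairs for one aggregate, oldest first."""
    rows = conn.execute(
        "SELECT event_type, payload FROM events"
        " WHERE aggregate_type = ? AND aggregate_id = ?"
        " ORDER BY timestamp, rowid",
        (aggregate_type, aggregate_id),
    ).fetchall()
    return [(event_type, json.loads(payload)) for event_type, payload in rows]


store = open_store()
append_event(store, "session", "s-1", "orchestrator.session.started", {"seed": "abc"})
append_event(store, "session", "s-1", "orchestrator.session.completed", {"ok": True})
history = replay(store, "session", "s-1")
```

Ordering by `rowid` as a tie-breaker keeps replay deterministic when two events share a timestamp.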

Security Limits

Input validation constants for DoS prevention (defined in core/security.py):

| Constant | Value | Purpose |
|----------|-------|---------|
| MAX_INITIAL_CONTEXT_LENGTH | 50,000 chars | Interview input limit |
| MAX_USER_RESPONSE_LENGTH | 10,000 chars | Interview response limit |
| MAX_SEED_FILE_SIZE | 1,000,000 bytes | Seed YAML file size cap |
| MAX_LLM_RESPONSE_LENGTH | 100,000 chars | LLM response truncation |

Drift Control

Drift measurement tracks how far execution has strayed from the original Seed:

  • Drift score 0.0 - 1.0
  • Automatic retrospective every N cycles
  • High drift triggers re-examination of the Seed

Runtime Abstraction Layer

Ouroboros decouples workflow orchestration from the agent runtime that executes tasks. The runtime abstraction layer allows different AI coding tools to serve as runtime backends while the core engine (event sourcing, six-phase pipeline, evaluation) remains unchanged.

Architecture overview

                          ┌──────────────────────────┐
                          │   Orchestrator / Runner   │
                          │  (runtime-agnostic core)  │
                          └────────────┬─────────────┘
                                       │ uses AgentRuntime protocol
                          ┌────────────┴─────────────┐
                          │      RuntimeFactory       │
                          │  create_agent_runtime()   │
                          └────┬──────────┬──────┬───┘
                               │          │      │
              ┌────────────────┘          │      └────────────────┐
              ▼                           ▼                       ▼
  ┌─────────────────────┐   ┌─────────────────────┐   ┌─────────────────────┐
  │  ClaudeAgentAdapter │   │   CodexCliRuntime    │   │   (future adapter)  │
  │   backend="claude"  │   │   backend="codex"    │   │                     │
  │  session-oriented   │   │   session-oriented   │   │                     │
  └─────────────────────┘   └─────────────────────┘   └─────────────────────┘

Both ClaudeAgentAdapter and CodexCliRuntime expose the same AgentRuntime protocol and provide equivalent session-oriented workflow capabilities. The orchestrator interacts with each backend exclusively through normalized AgentMessage / RuntimeHandle types — backend-specific communication details are fully encapsulated inside the adapters.

Key abstractions

Every runtime adapter satisfies the AgentRuntime protocol (defined in src/ouroboros/orchestrator/adapter.py), which requires two methods: execute_task() (async streaming) and execute_task_to_result() (collected result).

| Type | Purpose |
|------|---------|
| AgentMessage | Normalized streaming message (assistant text, tool calls, results) |
| RuntimeHandle | Backend-neutral frozen dataclass for session resume/observe/terminate |
| TaskResult | Collected outcome of a completed task execution |

The orchestrator never inspects backend-specific internals — each adapter maps its native events into these shared types.
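
The shape of such a protocol can be sketched with `typing.Protocol`. Only the two method names and the type names AgentMessage and TaskResult come from the text above. Every field, parameter, and the `EchoRuntime` adapter below are hypothetical, so consult src/ouroboros/orchestrator/adapter.py for the real signatures.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Protocol, runtime_checkable


@dataclass(frozen=True)
class AgentMessage:
    kind: str      # hypothetical field, e.g. "assistant", "tool_call", "result"
    content: str


@dataclass(frozen=True)
class TaskResult:
    success: bool          # hypothetical field
    final_message: str     # hypothetical field


@runtime_checkable
class AgentRuntime(Protocol):
    def execute_task(self, prompt: str) -> AsyncIterator[AgentMessage]:
        """Stream normalized messages while the task runs."""
        ...

    async def execute_task_to_result(self, prompt: str) -> TaskResult:
        """Run the task to completion and return the collected outcome."""
        ...


class EchoRuntime:
    """Toy adapter: maps its 'native' output into the shared types."""

    async def execute_task(self, prompt: str) -> AsyncIterator[AgentMessage]:
        yield AgentMessage("assistant", f"working on: {prompt}")
        yield AgentMessage("result", "done")

    async def execute_task_to_result(self, prompt: str) -> TaskResult:
        last = ""
        async for message in self.execute_task(prompt):
            last = message.content
        return TaskResult(success=True, final_message=last)


async def run(runtime: AgentRuntime, prompt: str) -> TaskResult:
    # The caller depends only on the protocol, never on the concrete adapter.
    return await runtime.execute_task_to_result(prompt)


result = asyncio.run(run(EchoRuntime(), "add a README"))
```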

Shipped adapters

  • ClaudeAgentAdapter (backend="claude") — Wraps Claude Agent SDK / Claude Code CLI with streaming, retry, and session resumption. Module: src/ouroboros/orchestrator/adapter.py
  • CodexCliRuntime (backend="codex") — Drives the OpenAI Codex CLI as a session-oriented runtime with NDJSON event parsing. Module: src/ouroboros/orchestrator/codex_cli_runtime.py
  • OpenCodeRuntime (backend="opencode") — Drives the OpenCode CLI with multi-provider support. Module: src/ouroboros/orchestrator/opencode_runtime.py
  • HermesRuntime (backend="hermes") — Drives the Hermes Agent for local or hosted models. Module: src/ouroboros/orchestrator/hermes_runtime.py
  • GeminiCliRuntime (backend="gemini") — Drives the Google Gemini CLI in stream-json mode. Module: src/ouroboros/orchestrator/gemini_cli_runtime.py
  • KiroAdapter (backend="kiro") — Drives the Kiro CLI in headless mode. Module: src/ouroboros/orchestrator/kiro_adapter.py
  • CopilotCliLLMAdapter (backend="copilot") — Drives the GitHub Copilot CLI via copilot -p, with live model discovery (queries https://api.githubcopilot.com/models at setup) and automatic hyphen-to-dotted model name mapping for cross-runtime config compatibility. Module: src/ouroboros/providers/copilot_cli_adapter.py

Each runtime has different tool sets, permission models, and streaming semantics. Ouroboros normalizes these differences at the adapter boundary, but feature parity is not guaranteed across runtimes.

Runtime factory

create_agent_runtime() in src/ouroboros/orchestrator/runtime_factory.py resolves the backend name and returns the appropriate adapter. The backend can be set via:

  1. OUROBOROS_AGENT_RUNTIME environment variable
  2. orchestrator.runtime_backend in ~/.ouroboros/config.yaml
  3. Explicit backend= parameter

Accepted aliases: claude / claude_code, codex / codex_cli, opencode / opencode_cli, hermes / hermes_cli, gemini / gemini_cli, kiro / kiro_cli, copilot / copilot_cli.
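
Alias normalization could be sketched as below. This is not the logic of create_agent_runtime(): `resolve_backend` is a hypothetical helper, and the precedence used here (explicit parameter, then environment variable, then config file) is an assumption, as is raising an error when nothing is configured.

```python
import os
from typing import Mapping, Optional

# Canonical backend name for each accepted alias.
ALIASES = {
    "claude": "claude", "claude_code": "claude",
    "codex": "codex", "codex_cli": "codex",
    "opencode": "opencode", "opencode_cli": "opencode",
    "hermes": "hermes", "hermes_cli": "hermes",
    "gemini": "gemini", "gemini_cli": "gemini",
    "kiro": "kiro", "kiro_cli": "kiro",
    "copilot": "copilot", "copilot_cli": "copilot",
}


def resolve_backend(
    explicit: Optional[str] = None,
    config: Optional[Mapping] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the first configured source and normalize its alias."""
    env = os.environ if env is None else env
    from_config = (config or {}).get("orchestrator", {}).get("runtime_backend")
    # Assumed precedence: explicit > environment > config file.
    for raw in (explicit, env.get("OUROBOROS_AGENT_RUNTIME"), from_config):
        if raw:
            key = raw.strip().lower()
            if key not in ALIASES:
                raise ValueError(f"unknown runtime backend: {raw!r}")
            return ALIASES[key]
    raise ValueError("no runtime backend configured")


backend = resolve_backend(
    config={"orchestrator": {"runtime_backend": "codex_cli"}},
    env={},
)
```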

For API details, see the source in src/ouroboros/orchestrator/adapter.py. For contributing a new runtime adapter, see Contributing.

Integration Points

MCP (Model Context Protocol)

Ouroboros functions as a bidirectional MCP Hub:

  • Server mode (ouroboros mcp serve) — Exposes tools (ouroboros_execute_seed, ouroboros_session_status, ouroboros_query_events) to Claude Desktop and other MCP clients
  • Client mode (ouroboros run --mcp-config mcp.yaml) — Discovers and consumes tools from external MCP servers (filesystem, GitHub, databases, etc.), merged with built-in tools

Tool precedence: built-in tools win over MCP tools; first MCP server in config wins for duplicates.
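
The precedence rule amounts to a first-writer-wins merge. The sketch below uses a hypothetical `merge_tools` helper and plain dictionaries; the real tool and server types in orchestrator/mcp_tools.py are not shown here.

```python
from typing import Dict, List, Tuple


def merge_tools(
    builtin: Dict[str, str],
    mcp_servers: List[Tuple[str, Dict[str, str]]],
) -> Dict[str, Tuple[str, str]]:
    """Map tool name to (source, tool); earlier sources win on duplicates."""
    merged = {name: ("builtin", tool) for name, tool in builtin.items()}
    for server_name, tools in mcp_servers:  # servers in config order
        for name, tool in tools.items():
            merged.setdefault(name, (server_name, tool))
    return merged


tools = merge_tools(
    builtin={"read_file": "builtin-read"},
    mcp_servers=[
        ("fs", {"read_file": "fs-read", "search": "fs-search"}),
        ("github", {"search": "gh-search", "open_pr": "gh-pr"}),
    ],
)
```

Here `read_file` stays built-in, `search` resolves to the `fs` server because it appears first in the config, and `open_pr` comes from `github`.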

LiteLLM

All LLM calls go through LiteLLM for provider abstraction (100+ models), automatic retries, cost tracking, and streaming support.

Design Principles

  1. Frugal First - Start with the cheapest option, escalate only when needed
  2. Immutable Direction - The Seed cannot change; only the path to achieve it adapts
  3. Progressive Verification - Cheap checks first, expensive consensus only at gates
  4. Lateral Over Vertical - When stuck, change perspective rather than try harder
  5. Event-Sourced - Every state change is an event; nothing is lost

Extension Points

  • Skills — Add YAML-defined skills in skills/ with magic prefix detection and tool declarations
  • Agents — Add bundled specialist prompts in src/ouroboros/agents/; use OUROBOROS_AGENTS_DIR for explicit local overrides
  • MCP integration — Bidirectional: expose Ouroboros tools as an MCP server, or consume external MCP servers during execution
  • Runtime adapters — Implement the AgentRuntime protocol and register in the runtime factory

Error Handling & Recovery

Ouroboros classifies errors into four categories: validation errors (invalid seeds), execution errors (agent failures and timeouts), system errors (network and resource failures), and business errors (ambiguity > 0.2, stagnation). Recovery mechanisms include session replay from checkpoints, agent respawn, tier escalation, and persona switching.

Configuration

For environment variables, config.yaml schema, and all configuration options, see config-reference.md.


For install instructions and first-run onboarding, see Getting Started. For backend-specific configuration, see the Claude Code, Codex CLI, OpenCode, Hermes, Gemini, Kiro CLI, and GitHub Copilot CLI runtime guides.