This changeset introduces Hermes CLI as a fourth runtime option alongside Claude, Codex, and OpenCode. It includes:
- New
HermesCliRuntimeclass and Hermes artifact installer - New
ouroboros.skillspackage for runtime-agnostic skill packaging - Tracked changes across config, setup, factory, adapter, and packaging
- Intentional decoupling: Codex/OpenCode setup no longer auto-registers with Claude MCP
The review covers correctness, packaging safety, regression risk, and test adequacy.
C1. src/ouroboros/skills/__init__.py shadows repo-root skills/ — breaks ALL editable-install skill resolution (8 test failures)
- Files:
src/ouroboros/skills/__init__.py(new) +src/ouroboros/codex/artifacts.py:162-166+pyproject.toml - Why: The parent-directory walk in
_packaged_codex_skills_dir()(lines 162-166) walks up fromcodex/artifacts.pyand findssrc/ouroboros/skills/before reaching the repo-rootskills/directory. Butsrc/ouroboros/skills/only contains__init__.py— no actual skill bundles. The walk stops too early, yielding a directory that has noSKILL.mdfiles. - Root cause: Adding
src/ouroboros/skills/__init__.pycreated a Python package that shadows the repo-root skills directory in the parent walk. Previously,src/ouroboros/skills/didn't exist, so the walk continued up to the repo root where the actual skills live. - Impact: 8 tests fail, including Codex skill resolution, skill smoke tests, and passthrough tests. ALL skill interception is broken in editable (development) installs. Wheel installs work because
ouroboros/skills/in the wheel contains both__init__.pyand skill bundles viaforce-include. - Evidence:
uv run pytest tests/shows:FAILED test_codex_artifacts.py::test_resolves_repo_packaged_skill_path_by_defaultFAILED test_codex_artifacts.py::test_installs_repo_packaged_skills_by_defaultFAILED test_codex_artifacts.py::test_resolves_repo_skills_and_packaged_rules_by_defaultFAILED test_codex_skill_smoke.py::test_packaged_ooo_prefixes_dispatch_from_skill_frontmatter[run]FAILED test_codex_skill_smoke.py::test_packaged_ooo_prefixes_dispatch_from_skill_frontmatter[interview]FAILED test_codex_cli_passthrough_smoke.py(2 failures)FAILED test_codex_skill_fallback.py::test_codex_mcp_timeout_falls_back_to_pass_through_cli_flow
- Fix options:
- Add
_contains_skill_bundles()check to_packaged_codex_skills_dirparent walk so it skips directories without actual skill content - Move the skill resolver out of
ouroboros.codexintoouroboros.skills.resolverand useimportlib.resources.files("ouroboros.skills")as primary path - Remove
src/ouroboros/skills/__init__.pyand rely on implicit namespace packages for the force-include
- Add
- File:
src/ouroboros/orchestrator/hermes_runtime.py:252 - Why:
await process.communicate()blocks indefinitely. Codex/OpenCode have startup timeouts (60s) and idle timeouts (300s). A hung Hermes process freezes the entire orchestrator. - Evidence: Line 252 — no timeout wrapper around
process.communicate(). - Repro: Start Hermes with a prompt that causes it to hang (e.g., waiting for interactive input). The orchestrator never recovers.
- File:
src/ouroboros/orchestrator/hermes_runtime.py:245-250 - Why: Codex/OpenCode track
_OUROBOROS_DEPTHand strip Ouroboros env vars before spawning subprocesses. Hermes doesn't. If Hermes invokes Ouroboros via MCP → which invokes Hermes again → infinite recursion. - Evidence: No
env=parameter passed tocreate_subprocess_exec(). No depth counter. Compare withcodex_cli_runtime.pywhich increments_OUROBOROS_DEPTHand caps at 5.
H3. _packaged_codex_skills_dir primary importlib resolution is dead — relies entirely on fragile fallback
- File:
src/ouroboros/codex/artifacts.py:155-169+pyproject.tomldiff - Why: The
force-includechange moves skills fromouroboros/codex/skillstoouroboros/skillsin the wheel. The primaryimportlib.resources.files("ouroboros.codex").joinpath("skills")no longer finds askillssubdirectory. In wheel installs, the parent-walk fallback happens to work by findingouroboros/skills/two levels up. In editable installs, the same fallback hits the C1 shadowing bug. - Verified: Wheel test passes (
resolve_packaged_codex_skill_path("run")works from installed wheel). Editable install fails (see C1).
- File:
src/ouroboros/cli/commands/setup.py:333-335(Codex) and:668-671(OpenCode) - Why:
_ensure_claude_mcp_entry()was removed from_setup_codex()and_setup_opencode(). Users who ranouroboros setup --runtime codexexpected it to also register with Claude MCP. This is now a silent regression for multi-runtime users. - Evidence: Diff removes the
if (Path.home() / ".claude").is_dir(): _ensure_claude_mcp_entry()block from both functions. Tests were updated to assert the new behavior. - Recommendation: Document the change in release notes. Consider a migration hint in setup output.
- File:
src/ouroboros/hermes/artifacts.py:76-77 - Why:
shutil.copytree(source_root, target_dir)copies everything from the resolved skills directory. In wheel installs,source_rootisouroboros/skills/which includes__init__.py(a Python package marker). This file is meaningless in~/.hermes/skills/autonomous-ai-agents/ouroboros/and pollutes the Hermes skill namespace. - Contrast:
install_codex_skills()incodex/artifacts.pyiterates individual skill directories and copies each one, avoiding non-skill files.
- File:
src/ouroboros/orchestrator/hermes_runtime.py:22 - Why:
from ouroboros.codex import resolve_packaged_codex_skill_path— the Hermes runtime depends on a Codex-specific module for its own skill resolution. If the Codex module's resolution strategy changes (it already has — see H3), Hermes is silently affected. The function name and module path mislead about which runtime is consuming it.
- File:
src/ouroboros/orchestrator/hermes_runtime.py:80-85 - Why:
_HERMES_SESSION_ID_PATTERN.search(output)finds the first match. Line 84 slicesoutput[:match.start()]discarding everything after the session_id line. If Hermes emits a session_id line followed by additional content, that content is silently dropped. - Evidence: The multiline flag
(?m)allows^to match any line start, sosession_id: ...could appear anywhere in output.
- File:
src/ouroboros/hermes/artifacts.py:59-61 - Why:
install_hermes_skills(prune=True)is called by_install_hermes_artifacts()in setup.py, but the function assigns_ = pruneand never uses it. Since the function doesshutil.rmtree+copytree(full replacement),pruneis semantically irrelevant, but the API is misleading.
- File:
src/ouroboros/orchestrator/hermes_runtime.py:44-46 - Why:
[a-f0-9]+excludes uppercase hex. If Hermes ever generates uppercase hex session IDs, parsing silently fails and session resume breaks without error.
- File:
src/ouroboros/orchestrator/hermes_runtime.py:315-323 - Why:
resolved_skill_path = Path(str(skill_md_path))captures the path string inside thewith resolve_packaged_codex_skill_path(...)context manager. After thewithblock exits,importlib.resourcesmay clean up the temp file. The path stored inSkillInterceptRequestcould point to a deleted temp file. - Current impact: LOW —
skill_pathis not read after construction. But it's a latent bug.
- File:
src/ouroboros/cli/commands/setup.py:407,417 - Why:
config_path.read_text()andconfig_path.open("w")omit encoding. Other setup functions (including_register_hermes_mcp_server) specifyencoding="utf-8". Inconsistent and could fail on non-UTF-8 default locales.
- File:
src/ouroboros/orchestrator/hermes_runtime.py:253 - Why:
.decode()defaults to system encoding. Codex/OpenCode explicitly decode as UTF-8. Could fail if Hermes outputs UTF-8 on a system with a different default encoding.
- File:
docs/runtime-guides/hermes.md - Why: Documentation describes session management and quiet-output parsing as if they are robust features. The implementation lacks timeout protection (H1), depth tracking (H2), and streaming (Hermes waits for full output). The docs should mention these limitations.
- File:
.codex - Why: Repository hygiene. Empty file with no apparent purpose. Should either be given content, added to
.gitignore, or removed.
- Why: Generated bytecode directory in the new skills package. Should be gitignored.
# 1. Sync dependencies
uv sync --dev
# 2. Linting
uv run ruff check src/ tests/
uv run ruff format --check src/ tests/
# 3. Type checking
uv run mypy src/ouroboros
# 4. New Hermes tests
uv run pytest tests/unit/hermes/test_artifacts.py tests/unit/orchestrator/test_hermes_runtime.py -v
# 5. Blast-radius tests
uv run pytest tests/unit/cli/test_setup.py tests/unit/cli/test_config.py \
tests/unit/orchestrator/test_runtime_factory.py tests/unit/providers/test_factory.py -v
# 6. Full orchestrator + CLI suites
uv run pytest tests/unit/orchestrator/ -v
uv run pytest tests/unit/cli/ -v
# 7. Full test suite with coverage
uv run pytest tests/ --cov=src/ouroboros --cov-report=term-missing -v# 8. Build wheel
uv build
# 9. Inspect wheel contents
unzip -l dist/ouroboros_ai-*.whl | grep -E "(skills/|codex/skills)"
# 10. Verify importlib resolution from wheel
pip install dist/ouroboros_ai-*.whl --force-reinstall --no-deps -q && \
python -c "
import importlib.resources
# Check new skills package
skills = importlib.resources.files('ouroboros.skills')
print('ouroboros.skills:', skills)
# Check old codex path
codex = importlib.resources.files('ouroboros.codex')
codex_skills = codex.joinpath('skills')
print('ouroboros.codex/skills exists:', codex_skills.is_dir())
# Check resolve_packaged_codex_skill_path still works
from ouroboros.codex.artifacts import resolve_packaged_codex_skill_path
with resolve_packaged_codex_skill_path('run') as p:
print('resolve run skill:', p, p.is_file())
"# 11. Hermes setup in isolated environment
export TMPDIR=$(mktemp -d)
export HOME=$TMPDIR
mkdir -p $HOME/.local/bin
# Create stub hermes binary
echo '#!/bin/sh' > $HOME/.local/bin/hermes && chmod +x $HOME/.local/bin/hermes
export PATH="$HOME/.local/bin:$PATH"
uv run ouroboros setup --runtime hermes
# Verify outcomes:
cat $HOME/.ouroboros/config.yaml # runtime_backend=hermes, llm.backend preserved
cat $HOME/.hermes/config.yaml # mcp_servers.ouroboros present
ls $HOME/.hermes/skills/autonomous-ai-agents/ouroboros/ # skills installed
test ! -f $HOME/.claude/claude_desktop_config.json # Claude config NOT touched
# Cleanup
rm -rf $TMPDIR# 12. Verify Codex setup no longer touches Claude config
export TMPDIR=$(mktemp -d)
export HOME=$TMPDIR
mkdir -p $HOME/.local/bin $HOME/.claude
echo '#!/bin/sh' > $HOME/.local/bin/codex && chmod +x $HOME/.local/bin/codex
export PATH="$HOME/.local/bin:$PATH"
uv run ouroboros setup --runtime codex
# Should NOT have Claude MCP entry:
test ! -f $HOME/.claude/claude_desktop_config.json && echo "PASS: Claude config untouched"
rm -rf $TMPDIR- Hermes runtime properties, skill intercept pattern matching, template resolution
- Quiet-output parsing (banner stripping, reasoning stripping, session ID extraction)
- Interview session resume via handle metadata
- Subprocess execution with mocked process (success + error paths)
- Hermes artifact installation (editable-install path)
- Setup config writing, MCP registration, scalar config repair
- Hermes not registered as LLM backend
- Runtime factory routing for hermes aliases
- Wheel-install skill resolution — No test verifies
resolve_packaged_codex_skill_pathworks after theforce-includepath change in a non-editable install install.shbehavior — Shell script changes are untested (no integration test harness)- Malformed Hermes quiet output — No tests for: empty stdout, binary garbage, session_id appearing mid-output, extremely large output
- Partial/missing session metadata — No test for
handle.metadatamissing_INTERVIEW_SESSION_METADATA_KEYwhencurrent_handleis non-None - Broken YAML frontmatter — Tests don't cover: truncated YAML, non-UTF-8 bytes, YAML with anchors/aliases
- Bad
~/.hermes/config.yamlshapes — Tests cover scalar top-level but not: nested non-dictmcp_servers, missing permissions, read-only file - Regression coverage for Claude/Codex/OpenCode — Existing tests were updated to match new behavior but no explicit regression test verifies Codex skill resolution still works after the packaging path change
- Timeout/hang scenarios — No test for Hermes subprocess that never exits
- Depth recursion — No test for Ouroboros→Hermes→Ouroboros recursion prevention
The codebase should move resolve_packaged_codex_skill_path and the skill resolution chain out of ouroboros.codex into a shared module (e.g., ouroboros.skills.resolver). Currently Hermes imports from ouroboros.codex and the Codex resolver's importlib path is broken by the packaging change. A unified resolver using importlib.resources.files("ouroboros.skills") would be the single source of truth.
HermesCliRuntime, CodexCliRuntime, and OpenCodeRuntime share: CLI path resolution, skill interception, session resume, template substitution, MCP handler loading, and interview session tracking. Extracting a SubprocessAgentRuntime base class would eliminate ~300 lines of duplicated logic across the three runtimes.
The current pattern of adding if/elif branches in runtime_factory.py, setup.py, config.py, and models.py for each new runtime doesn't scale. A registry pattern (decorator or dict-based) where each runtime self-registers its factory, setup handler, and config schema would reduce the blast radius of adding future runtimes.
_setup_hermes, _setup_codex, _setup_opencode, and _setup_claude share the same structure: read config → merge → write. Each has slightly different error handling and encoding practices (see L1). A shared ConfigWriter with safe-write (temp file + rename) and consistent encoding would reduce inconsistency.
The packaging change (H3) and setup decoupling (H4) demonstrate the need for an automated smoke-test that: builds the wheel → installs in a venv → verifies skill resolution → runs setup in a temp HOME → checks file-system side effects. This would catch packaging regressions before they reach CI unit tests (which run from source checkout).
Quality Gates:
ruff check: PASS
ruff format: FAIL — 4 files need reformatting
mypy: PASS — no issues in 244 source files
pytest: 8 FAILED, 4519 passed, 2 skipped (75% coverage)
Packaging:
uv build: PASS — wheel builds successfully
Wheel skills: 19 skill bundles at ouroboros/skills/ (correct)
Wheel codex: ouroboros/codex/skills/ ABSENT (expected after change)
importlib (wheel): resolve_packaged_codex_skill_path("run") WORKS via fallback
importlib (editable): resolve_packaged_codex_skill_path("run") FAILS (C1 bug)
Setup Smoke (temp HOME):
Hermes setup: PASS — config, MCP, skills all correct
Hermes Claude leak: PASS — no .claude dir created
Codex Claude decoup: PASS — Claude config untouched
Merge blocker: C1 — src/ouroboros/skills/__init__.py shadows repo-root skills/, breaking ALL skill resolution in editable installs (8 test failures). This must be fixed before merge.
Must fix: H1 (no timeout), H2 (no depth guard).
Should fix: 4 ruff format violations.
User-facing change requiring docs: H4 (Codex/OpenCode no longer auto-register with Claude).
Safe to merge after: Fixing C1 (skill shadowing), H1 (timeout), H2 (depth guard), formatting, and documenting H4.