Typed tmux operation chains #81

Open
tony wants to merge 22 commits into main from chainable-commands-experiment-00

tony commented Jun 20, 2026

Member

Summary

  • Replace the raw run_command_chain and narrow build_forward_layout tools with one typed run_tmux_operations tool.
  • Add discriminated Pydantic operation models for split_pane, send_keys, resize_pane, select_layout, set_option, and capture_pane, plus structured per-step and per-dispatch results.
  • Compile ordered typed operations to the fewest safe native tmux dispatches: no-output mutations fold into tmux a ; b ; c, output/id capture steps stay attributable, and on_error="continue" uses standalone dispatches because native tmux chains abort on first failure.
  • Preserve the single-split decoration fast path: an id-producing split_pane can fold with immediate send_keys / resize_pane operations that target its pane_ref through tmux's {marked} target while still returning the concrete pane id.
  • Pin the libtmux dependency to a public commit instead of a sibling worktree path.
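
The folding rule described in the summary can be sketched as follows. This is a stdlib-only illustration, not the code in chain_tools.py: `Op`, `produces_output`, `plan_dispatches`, and `render` are hypothetical names, and the `{marked}` split fast path is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for one typed operation."""

    argv: tuple[str, ...]
    produces_output: bool = False  # e.g. capture_pane or an id-producing split


def plan_dispatches(ops: list[Op], on_error: str = "stop") -> list[list[Op]]:
    """Group ordered operations; each inner list is one tmux invocation."""
    if on_error == "continue":
        # Native `;` chains abort on the first failure, so every operation
        # gets its own dispatch when later steps must still run.
        return [[op] for op in ops]

    dispatches: list[list[Op]] = []
    pending: list[Op] = []  # no-output operations waiting to be folded
    for op in ops:
        if op.produces_output:
            if pending:
                dispatches.append(pending)  # flush before the output read
                pending = []
            dispatches.append([op])  # standalone, so stdout stays attributable
        else:
            pending.append(op)
    if pending:
        dispatches.append(pending)
    return dispatches


def render(dispatch: list[Op]) -> list[str]:
    """Render one dispatch as a tmux argv, joining commands with `;`."""
    argv: list[str] = ["tmux"]
    for index, op in enumerate(dispatch):
        if index:
            argv.append(";")
        argv.extend(op.argv)
    return argv
```

With two mutations followed by a capture, this yields one folded dispatch and one standalone dispatch; passing `on_error="continue"` yields one dispatch per operation.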

Changes by area

src/libtmux_mcp/tools/chain_tools.py

Adds the typed compiler and registers run_tmux_operations as a mutating, open-world tool. The compiler keeps a pending chain of no-output operations, flushes before output reads, captures split ids when needed, and returns both rendered argv and per-step status.

src/libtmux_mcp/models.py

Adds the typed operation union and structured result models. The operation list is validated through a module-level Pydantic TypeAdapter in the tool module.

Docs

Adds chain-tool docs and wires chain_tools plus the new models into the FastMCP docs catalog.

Dependency pin

Updates [tool.uv.sources] and uv.lock to use the public libtmux branch commit 591a312f78d165816bb95a035a46219657c4b53d.

Design notes

  • The tool is typed, not raw tmux argv. Unsupported commands such as kill-server are not operation variants.
  • Per-step stdout is only promised where it can be attributed. capture_pane and id-producing split steps force a dispatch boundary unless the single split-ref {marked} optimization applies.
  • on_error="continue" deliberately disables native chaining. tmux ; sequences have abort-on-first-error semantics, so continuing later operations requires separate dispatches.
  • The true one-connection, per-command-result path remains a future control-mode runner; this PR stays on the subprocess chain API.
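
The first design note, that unsupported commands are simply not operation variants, can be illustrated without Pydantic. The PR uses a discriminated Pydantic union for this; the sketch below swaps in plain dataclasses and a closed registry keyed by `kind`. The class names and field names here are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SendKeys:
    target: str
    keys: str


@dataclass(frozen=True)
class SelectLayout:
    target: str
    layout: str


# Closed registry: only the kinds listed here can ever be constructed, so a
# command such as kill-server has no way in.
_KINDS: dict[str, type] = {"send_keys": SendKeys, "select_layout": SelectLayout}


def parse_operation(raw: dict[str, Any]) -> SendKeys | SelectLayout:
    """Build a typed operation from a dict, dispatching on its `kind` field."""
    fields = dict(raw)
    kind = fields.pop("kind", None)
    model = _KINDS.get(kind)
    if model is None:
        raise ValueError(f"unsupported operation kind: {kind!r}")
    return model(**fields)
```

A caller that submits `{"kind": "kill_server"}` gets a validation error before anything reaches tmux.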

Test plan

  • rm -rf docs/_build
  • uv run ruff check . --fix --show-fixes
  • uv run ruff format .
  • uv run mypy .
  • uv run py.test --reruns 0 -vvv (612 passed)
  • just build-docs

Companion PR

Depends on the libtmux experimental chain API in tmux-python/libtmux#685.

tony added 5 commits June 20, 2026 11:19
why: Build the experimental chain-command MCP tools against the in-progress
libtmux._experimental.chain API on the sibling libtmux worktree.

what:
- Add [tool.uv.sources] libtmux = { path = "../libtmux", editable = true }
- Relock against the local editable checkout

why: Agents needed to run several tmux commands as one native invocation
instead of one tool call per command.

what:
- Add run_command_chain: a list of {command, args, target} folded into one
  `tmux a ; b` dispatch via libtmux._experimental.chain (CommandChain.run,
  run off the event loop with asyncio.to_thread)
- Destructive tier; refuse kill-server; fail closed on an empty list/target
- Add ChainCommand / RunCommandChainResult models; register the tool
- Tests: one-dispatch effect, atomic abort, validation, kill-server denial

why: A single tmux `;` chain can't hand back the ids it creates (a fresh
id can't be substituted into the same invocation), so callers had no way
to split a pane and learn the new pane ids.

what:
- Add build_forward_layout: split a seed pane N ways and return each new
  pane id, resolved over the minimum dispatches via ForwardPlan and
  AsyncServerPlanRunner (off the event loop)
- Optional per-split shell / send_keys; mutating tier (reaches a shell)
- Add ForwardSplit / ForwardLayoutResult models; register the tool
- Tests: two splits capture distinct ids, single-split fold + send_keys
  lands, empty-list validation

why: FastMCP log capture can surface the same child logger event through
both direct and parent propagation paths, which made the level check brittle.

what:
- Assert the set of matching fastmcp.errors levels
- Keep warning/error demotion coverage for tool errors

why: CI cannot install a sibling worktree path, so the MCP branch needs a
public immutable libtmux source for the experimental chain API.

what:
- Replace the editable sibling path with the published libtmux commit
- Regenerate uv.lock with the Git source

codecov-commenter commented Jun 20, 2026


Codecov Report

❌ Patch coverage is 80.45977% with 68 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.31%. Comparing base (188784b) to head (5ad5e56).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/libtmux_mcp/tools/chain_tools.py | 76.98% | 35 Missing and 20 partials ⚠️ |
| src/libtmux_mcp/models.py | 87.96% | 11 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
- Coverage   84.67%   84.31%   -0.36%     
==========================================
  Files          43       44       +1     
  Lines        3197     3545     +348     
  Branches      438      484      +46     
==========================================
+ Hits         2707     2989     +282     
- Misses        360      405      +45     
- Partials      130      151      +21     

tony changed the title from "Experimental: one-dispatch + forward-layout chain command tools" to "Typed tmux operation chains" Jun 20, 2026

tony commented Jun 21, 2026

Member Author

Code review

Found 4 issues:

  1. ResizePaneOperation(zoom=False) toggles zoom instead of leaving it unchanged: _resize_pane_calls appends -Z whenever zoom is not None, but False is not None, and tmux resize-pane -Z only toggles (there is no zoom-off), so zoom=False flips zoom, the opposite of its apparent meaning. The model validator accepts zoom=False.

    """Build ``resize-pane`` calls for a typed resize operation."""
    args: list[str | int] = []
    if operation.zoom is not None:
        args.append("-Z")
    if operation.height is not None:
        args.extend(("-y", operation.height))

  2. SetOptionOperation validates target-without-scope but not scope-without-target, so scope="pane"/"window"/"session" with target=None passes validation and emits e.g. set-option -p <opt> <val> with no -t, silently mutating the control client's current pane/session.

    @model_validator(mode="after")
    def _validate_target(self) -> SetOptionOperation:
        if self.target is not None and self.scope is None:
            msg = "scope is required when target is specified."
            raise ValueError(msg)
        return self

  3. ControlModeRunner (the default transport="control") is closed only on the normal-exit path, not in a try/finally. An exception in the dispatch loop (e.g. asyncio.CancelledError, or anything re-raised by handle_tool_errors_async) skips close() and leaks the tmux -C subprocess (it has no __del__); these accumulate over the server's lifetime.

        created_pane_order,
    )
    if isinstance(runner, ControlModeRunner):
        await asyncio.to_thread(runner.close)
    return RunTmuxOperationsResult(
        succeeded=succeeded,

  4. CHANGES describes an intermediate branch state for a never-released tool (AGENTS.md says "Do not mention intermediate branch states, abandoned approaches, or 'no longer' behavior unless users of a published release actually experienced the old state"). run-tmux-operations is new in this PR, so "now dispatches ... by default" and "a failing operation no longer aborts the rest" describe a transition no released user experienced.

libtmux-mcp/CHANGES

Lines 23 to 30 in 55ea0b4

**Per-operation results over tmux control mode**
{tooliconl}`run-tmux-operations` now dispatches over a persistent `tmux -C`
control connection by default (`transport="control"`), so each operation in a
folded chain keeps its own stdout and return code, and a failing operation no
longer aborts the rest. Pass `transport="subprocess"` to fold into a single
native `tmux a ; b ; c` sequence that returns one merged result and aborts on
the first error.
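
One way the first three findings could be addressed is sketched below. This is an illustration under stated assumptions, not the fix the branch adopted: `zoom_args`, `currently_zoomed`, `SetOption`, `FakeRunner`, and `run_and_close` are hypothetical names, plain dataclasses stand in for the Pydantic models, and the current zoom state is assumed to have been read beforehand.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


# Issue 1: resize-pane -Z is a toggle, so emit it only when the requested
# state differs from the current one. zoom=None means "leave zoom alone".
def zoom_args(zoom: bool | None, currently_zoomed: bool) -> list[str]:
    if zoom is None or zoom == currently_zoomed:
        return []
    return ["-Z"]


# Issue 2: validate both directions for the scopes named in the review.
_TARGETED_SCOPES = frozenset({"pane", "window", "session"})


@dataclass(frozen=True)
class SetOption:
    option: str
    value: str
    scope: str | None = None
    target: str | None = None

    def __post_init__(self) -> None:
        if self.target is not None and self.scope is None:
            raise ValueError("scope is required when target is specified.")
        if self.scope in _TARGETED_SCOPES and self.target is None:
            raise ValueError("target is required when scope is specified.")


# Issue 3: close the runner in a finally block so cleanup also happens when
# the dispatch loop raises or the task is cancelled.
class FakeRunner:
    """Stand-in for a runner that owns a long-lived subprocess."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


async def run_and_close(runner: FakeRunner, fail: bool = False) -> str:
    try:
        if fail:
            raise RuntimeError("dispatch failed")
        return "ok"
    finally:
        await asyncio.to_thread(runner.close)
```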

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

tony added 12 commits June 21, 2026 06:05
why: Raw tmux command chains forced callers to choose between one native
dispatch and typed, per-operation results. A typed compiler fills that gap
while keeping output and continue-on-error semantics honest.

what:
- Replace raw chain/layout tools with run_tmux_operations
- Add discriminated operation models and structured step results
- Fold chainable runs and split standalone output/id captures
- Document the new chain tool surface

why: A split that immediately feeds typed decorations should keep the
one-dispatch behavior that tmux supports through the marked-pane target while
still returning the created pane id.

what:
- Detect immediate send_keys/resize operations targeting a fresh split ref
- Compile them through tmux's {marked} target in one sequence
- Assert the single-dispatch split-ref path in tests and docs

why: The typed operation compiler is now part of this branch's public
surface, but the unreleased notes and tool page did not describe what
callers can rely on.

what:
- Add an unreleased changelog entry for run_tmux_operations
- Document dispatch boundaries and the generic batch-tool boundary

why: The MCP operation compiler was duplicating libtmux's chainability
and scope contract, which let the two surfaces drift as new operations
were added.

what:
- Validate lowered commands with libtmux chain metadata
- Report contract failures as operation-level compile failures
- Add an exhaustiveness assertion for typed operation lowering
- Cover contract drift with focused tests

why: Agents need to inspect the native tmux dispatches a typed
operation list would produce before mutating tmux.

what:
- Add dry_run to run_tmux_operations and result models
- Return planned step and dispatch results with nullable exit codes
- Use deterministic placeholders for dry-run split pane refs
- Document dry-run behavior and add regression tests

why: Planned dry-run steps should not stop later operations when the
compiler flushes a pending dispatch before an output step.

what:
- Treat planned dry-run steps as successful for control flow
- Reuse the same success predicate for final results
- Add a regression covering dry-run output-step continuation

why: Native tmux chains can block the MCP call when a dispatch stalls,
so callers need a typed failure instead of an unbounded await.

what:
- Add dispatch_timeout validation to run_tmux_operations
- Mark timed-out dispatches and included steps as failed
- Cover chain, standalone, and marked split timeout paths
- Document the timeout behavior and background worker caveat
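
A bounded dispatch of this kind can be sketched with the standard library. The function name and the result shape below are assumptions, not the tool's actual models. The sketch also shows what is presumably the background worker caveat: a timeout abandons the await, but the worker thread is not interrupted.

```python
import asyncio


async def dispatch_with_timeout(call, timeout: float) -> dict:
    """Run a blocking call in a worker thread, bounded by `timeout` seconds."""
    try:
        result = await asyncio.wait_for(asyncio.to_thread(call), timeout)
    except asyncio.TimeoutError:
        # The caller stops waiting here, but the worker thread keeps running
        # until `call` returns on its own; Python cannot interrupt it.
        return {"status": "timed_out", "result": None}
    return {"status": "ok", "result": result}
```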
why: The typed chain compiler has branch-local failure paths for refs,
pending flushes, and marked split chains that need explicit coverage.

what:
- Cover unknown pane_ref compile failures
- Cover compile errors after a failed pending dispatch
- Cover marked split failure skipping later operations

why: A typed operation list can create panes before a later step fails,
leaving partial layout state behind for callers that need all-or-nothing
behavior.

what:
- Add rollback_on_error to run_tmux_operations
- Kill created split-ref panes in reverse order on failure
- Report rolled_back_panes and rollback_errors in results
- Document rollback behavior and cover enabled and disabled cases
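
The reverse-order rollback can be sketched as below. `rollback_created_panes` and the `kill_pane` callback are hypothetical, and continuing past a failed kill is an assumption about the intended behavior; only the reverse ordering and the two result lists come from the commit message.

```python
from __future__ import annotations

from typing import Callable


def rollback_created_panes(
    created: list[str],
    kill_pane: Callable[[str], None],
) -> tuple[list[str], list[str]]:
    """Kill created panes newest-first; return (rolled_back, errors)."""
    rolled_back: list[str] = []
    errors: list[str] = []
    for pane_id in reversed(created):
        try:
            kill_pane(pane_id)
        except Exception as exc:
            # Record the failure and keep going so one stuck pane does not
            # strand the panes created before it.
            errors.append(f"{pane_id}: {exc}")
        else:
            rolled_back.append(pane_id)
    return rolled_back, errors
```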
why: The MCP chain tools need the pushed libtmux chain
control-mode surface that preserves per-command results.

what:
- Update the libtmux git pin to 6fc3db63
- Refresh uv.lock for the new pinned revision

why: A folded ";" dispatch returns one merged result, so chained operations
lost their own stdout and a single failure aborted the rest. libtmux's
ControlModeRunner returns one %begin/%end/%error block per command over a
persistent "tmux -C" connection.

what:
- Add transport="subprocess"|"control" to run_tmux_operations, default control
- Route dispatch through ControlModeRunner.run_calls for per-operation results
- Skip the {marked} split fold under control (splits self-capture their id)
- Close the control connection after each call
- Pin the {marked}-specific tests to subprocess; add per-operation attribution
- Note control transport in the unreleased changelog

why: A typed split's new pane took its cwd from the issuing client's context,
which differs by transport (the subprocess client's cwd vs. the control
client's), so the same split could land in different directories.

what:
- Pass -c "#{pane_current_path}" on split-window so the new pane inherits the
  target pane's directory deterministically under both transports
- Cover the inherited directory for the subprocess and control transports
tony force-pushed the chainable-commands-experiment-00 branch from 55ea0b4 to 240a266 June 21, 2026 11:59
tony added 5 commits June 21, 2026 07:18
why: libtmux-mcp should consume the reviewed control/forward-plan
fixes from the chainable-commands experiment branch.

what:
- Update the libtmux git source to 05f55e2a
- Regenerate uv.lock for the same libtmux revision

why: on_error="stop" did not actually stop. Consecutive no-output operations
folded into one control-mode batch that ran every call before any per-step
status existed, so a failing middle operation still let later operations run
and mutate state. Folding only ever applied under stop, and over the
persistent tmux -C connection it saved nothing the connection did not already
amortize, while a separate subprocess transport and a {marked} register fold
added a parallel execution model with weaker failure attribution.

what:
- Dispatch each operation on its own over the persistent tmux -C connection so
  every operation keeps its own stdout and return code
- Make on_error="stop" skip every operation after the first failure or
  unresolved target
- Remove the transport parameter; control mode is the only engine
- Delete the {marked} split fold, pending-chain dispatch, and the chainability
  gate that only mattered for folding
- Update the tool docs and CHANGES to describe per-operation control dispatch
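
The stop semantics this commit describes amount to a per-operation loop in which the first failure marks every later operation as skipped. A minimal sketch, with a hypothetical `dispatch` callback returning a tmux-style return code and hypothetical status strings:

```python
from __future__ import annotations

from typing import Callable


def run_plan(
    operations: list[str],
    dispatch: Callable[[str], int],
    on_error: str = "stop",
) -> list[str]:
    """Dispatch operations one at a time; return one status per operation."""
    statuses: list[str] = []
    failed = False
    for op in operations:
        if failed and on_error == "stop":
            statuses.append("skipped")  # never sent, so it cannot mutate state
            continue
        if dispatch(op) == 0:
            statuses.append("ok")
        else:
            statuses.append("failed")
            failed = True
    return statuses
```

Because each operation is dispatched on its own, its status exists before the next one is sent, which is what a folded batch could not provide.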
why: the operation input was a discriminated union but the result was a flat
model whose stdout, stderr, returncode, and created_pane_id were all optional,
so a caller had to know out of band which fields each kind populated. The
dispatch records (rendered argv, counts, mode) sat on the primary result and
leaked the compiler's mechanism onto every response.

what:
- Return one typed result per operation, discriminated by kind: capture_pane
  carries lines, split_pane carries pane_id, and the rest carry status only,
  with an error message on failure
- Move the per-dispatch records behind an explain flag, returned under
  diagnostics, and shrink each record to one operation's argv and output
- Update the tool docs, CHANGES, and the autodoc model list for the new result
  models, and rework the tests around the typed steps

why: every pane operation carried pane_id and pane_ref as two nullable fields
guarded by a model validator that enforced exactly one of them, repeated across
four operations. Two nullable fields plus a validator is easy for a caller to
get wrong and gives the schema no single, self-describing place for the target.

what:
- Replace the pane_id/pane_ref pair with one discriminated target union
  (PaneIdTarget for a concrete pane, RefTarget for a name minted by an earlier
  split), removing the four repeated validators
- Resolve a ref target against the panes created earlier in the same list
- Keep targets to the two that resolve unambiguously over a detached control
  connection; relative and active targets depend on a client's current pane,
  which a control connection does not track reliably
- Update the tool docs, CHANGES, autodoc model list, and tests
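
Resolving the two target variants can be sketched as follows. `PaneIdTarget` and `RefTarget` are the names from the commit message, but the dataclass form, the field names, and `resolve_target` are assumptions; the branch uses Pydantic models.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PaneIdTarget:
    pane_id: str  # concrete tmux pane id such as "%3"


@dataclass(frozen=True)
class RefTarget:
    ref: str  # name minted by an earlier split in the same list


def resolve_target(target: PaneIdTarget | RefTarget, created: dict[str, str]) -> str:
    """Return a concrete pane id, looking refs up among panes created so far."""
    if isinstance(target, PaneIdTarget):
        return target.pane_id
    try:
        return created[target.ref]
    except KeyError:
        raise LookupError(f"unknown pane ref: {target.ref!r}") from None
```

Since `created` only holds refs minted by earlier splits, a ref that points forward in the list fails to resolve instead of silently targeting the wrong pane.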
why: the tool reads best as a plan an agent applies, not a list of
implementation operations to run, so naming it around the intent (a tmux plan)
fits the call better than naming it around the mechanism. The symbol has not
shipped, so it can be renamed in place with no alias.

what:
- Rename run_tmux_operations to run_tmux_plan and RunTmuxOperationsResult to
  RunTmuxPlanResult, keeping dispatch_timeout and the operations argument
- Retitle the tool and rename its docs page to run-tmux-plan
- Refresh the chain index and card text for per-operation control execution
- Update CHANGES, the autodoc model list, and the tests