Workflow boards: kanban state machines that drive coding agents #3032
ccdwyer wants to merge 1 commit into
Addressed all 6 Macroscope findings in 87db8cc:
Same commit also hardens review-panel verdict capture (found during the live demo run): captured output now falls back to earlier assistant messages in the turn when the final message lacks the fenced json block, and the auto-appended captureOutput suffix explicitly overrides skill-driven output formats.
🤖 Generated with Claude Code
Approvability
Verdict: Needs human review
2 blocking correctness issues found. Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.
const looksNotMergeable = (text: string): boolean => {
  const lower = text.toLowerCase();
  return NOT_MERGEABLE_PATTERNS.some((pattern) => lower.includes(pattern));
Merge errors match checks substring
Medium Severity
looksNotMergeable treats any GitHubCliError detail containing the substring checks as a not-mergeable PR outcome. Unrelated failures whose message mentions “checks” (for example API or CLI errors while loading check status) are converted to { ok: false } on mergePr instead of propagating as infrastructure errors on the error channel.
Reviewed by Cursor Bugbot for commit b81ddaa. Configure here.
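One possible direction for the finding above, sketched under assumptions: match whole refusal phrases instead of the bare word "checks", and treat everything else as an infrastructure error. The phrase list and the `classifyMergeFailure` helper are hypothetical; the real `NOT_MERGEABLE_PATTERNS` and the actual CLI error wording are not shown in this thread.

```typescript
// Hypothetical phrases for illustration only; the real pattern list is not shown here.
const NOT_MERGEABLE_PHRASES: readonly RegExp[] = [
  /\bis not mergeable\b/,
  /\brequired status checks? (?:have not|has not|did not) (?:passed|succeeded)\b/,
  /\bmerge conflicts?\b/,
];

type MergeFailureKind = "not-mergeable" | "infrastructure";

// Classify a CLI error detail. Only a full phrase match counts as a
// "PR is not mergeable" outcome; anything else stays an infrastructure error.
const classifyMergeFailure = (detail: string): MergeFailureKind => {
  const lower = detail.toLowerCase();
  return NOT_MERGEABLE_PHRASES.some((phrase) => phrase.test(lower))
    ? "not-mergeable"
    : "infrastructure";
};
```

With this shape, a message such as "failed to load checks: HTTP 502" would stay on the error channel, while a genuine refusal would still map to `{ ok: false }`.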
    blockedBy: new Set(readStringArray(task.blockedBy)),
  });
}
return tasks.size > 0;
🟢 Low Layers/ClaudeAdapter.ts:791
When TaskList returns an empty array, tasks.clear() empties the map but the function returns false because tasks.size > 0 is false. This incorrectly signals no change occurred, even though all tasks were removed. If callers use the return value to trigger updates, the UI won't reflect that tasks were cleared.
- return tasks.size > 0;
+ return true;
Evidence trail:
apps/server/src/provider/Layers/ClaudeAdapter.ts lines 768-791 (REVIEWED_COMMIT): `TaskList` branch calls `tasks.clear()` unconditionally then returns `tasks.size > 0`. apps/server/src/provider/Layers/ClaudeAdapter.ts lines 2464-2471 (REVIEWED_COMMIT): caller uses the return value to decide whether to emit `emitClaudeTaskPlanUpdated`.
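An alternative to always returning `true`, sketched under assumptions: compare the previous and incoming task sets and report whether anything actually changed, so that clearing a non-empty list counts as a change but a repeated empty list does not. `TrackedTask` and `replaceTasks` are hypothetical names; the adapter's real task shape is only partly visible in the snippet above.

```typescript
// Hypothetical task shape; only `blockedBy` is visible in the reviewed snippet.
interface TrackedTask {
  readonly subject: string;
  readonly blockedBy: ReadonlySet<string>;
}

const sameTask = (a: TrackedTask, b: TrackedTask): boolean => {
  if (a.subject !== b.subject || a.blockedBy.size !== b.blockedBy.size) return false;
  let same = true;
  a.blockedBy.forEach((id) => {
    if (!b.blockedBy.has(id)) same = false;
  });
  return same;
};

// Replace the tracked tasks with `incoming` and report whether the map changed.
// Clearing a non-empty map is a change; replacing empty with empty is not.
const replaceTasks = (
  tasks: Map<string, TrackedTask>,
  incoming: ReadonlyMap<string, TrackedTask>,
): boolean => {
  let changed = tasks.size !== incoming.size;
  incoming.forEach((next, id) => {
    const previous = tasks.get(id);
    if (previous === undefined || !sameTask(previous, next)) changed = true;
  });
  tasks.clear();
  incoming.forEach((next, id) => tasks.set(id, next));
  return changed;
};
```

This keeps the caller's "emit only on change" contract intact while still surfacing the cleared-list case the finding describes.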
…StepCount Int, approval-gate interrupt safety, claude tasklist clear, dispatcher interrupt re-raise, unarchive hidden-thread gate, push-reject vs diverged, mobile empty-threadId deep-link)
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 5 total unresolved issues (including 3 from previous reviews).
Reviewed by Cursor Bugbot for commit 8500777. Configure here.
Per-project boards-as-state-machines: lanes hold pipelines of agent /
approval / script steps; one git worktree per ticket; event-sourced
bounded context (apps/server/src/workflow/**) with projections, durable
saga, worktree leases, lane-entry tokens, and durable approvals.
Includes:
- v1 boards + board-creation UX + script steps + smart routing (captured
step output, JSON-logic predicates, on:{success,failure,blocked}, lane
transitions/onEvent) + WIP enforcement (queue + FIFO auto-admit) +
visual editor (form + canvas) + version history/revert + delete-board.
- GitHub PR loop: PullRequestStep + a poller that observes CI/review/merge
and routes tickets via synthetic external events (two-phase durable outbox).
- Board notifications + minimal mobile surface: server outbox →
recovery-gated dispatcher → env-signed relay publish → APNs → deep-link
to a ticket action sheet + a "Needs you" inbox.
- Work arrives by itself: one-way GitHub Issues + Asana task sync. Board
declares sync sources (PAT in the secret store); a recovery-gated syncer
pulls items, diffs against a durable mapping (content-hash + scan
completeness), and reconciles through a lock-safe source committer
(admission->save->tx, atomic create+map, version-gated edit, source-aware
close, orphan-then-confirm) so synced tickets flow through all the
existing board automation. Read-only "Synced from" badge on web + mobile.
Single migration (033_WorkflowSchema). The live device-push (APNs to a
physical phone) and the live GitHub+Asana sync against real credentials are
the remaining human gates.
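The sync reconciliation described in the commit message (content-hash gating, scan completeness, orphan-then-confirm) could look roughly like the planner below. This is a minimal sketch, not the PR's code: the `UpstreamItem`, `TicketMapping`, and `SyncAction` shapes and the `planSync` function are hypothetical, and FNV-1a is used only as a dependency-free stand-in for whatever content hash the syncer actually uses.

```typescript
// Hypothetical shapes for illustration; the PR's real sync types are not shown here.
interface UpstreamItem {
  readonly externalId: string;
  readonly title: string;
  readonly description: string;
}

interface TicketMapping {
  readonly ticketId: string;
  readonly contentHash: string;
  readonly orphanSuspected: boolean;
}

type SyncAction =
  | { readonly kind: "create"; readonly externalId: string }
  | { readonly kind: "edit"; readonly externalId: string }
  | { readonly kind: "suspectOrphan"; readonly externalId: string }
  | { readonly kind: "confirmOrphan"; readonly externalId: string };

// FNV-1a over title + description: a stand-in for a real content hash.
const contentHash = (item: UpstreamItem): string => {
  const text = item.title + "\n" + item.description;
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
};

// Diff one scan against the durable mapping and return the actions to apply.
const planSync = (
  items: readonly UpstreamItem[],
  scanComplete: boolean,
  mappings: ReadonlyMap<string, TicketMapping>,
): SyncAction[] => {
  const actions: SyncAction[] = [];
  const seen = new Set<string>();
  items.forEach((item) => {
    seen.add(item.externalId);
    const mapping = mappings.get(item.externalId);
    if (mapping === undefined) {
      actions.push({ kind: "create", externalId: item.externalId });
    } else if (mapping.contentHash !== contentHash(item)) {
      // Content-hash gating: unchanged items produce no edit.
      actions.push({ kind: "edit", externalId: item.externalId });
    }
  });
  // Only a complete scan may conclude that a mapped item disappeared upstream,
  // and even then the first miss only marks it as suspected.
  if (scanComplete) {
    mappings.forEach((mapping, externalId) => {
      if (seen.has(externalId)) return;
      actions.push({
        kind: mapping.orphanSuspected ? "confirmOrphan" : "suspectOrphan",
        externalId,
      });
    });
  }
  return actions;
};
```

Keeping the planner pure makes the idempotency claim easy to check: running it twice over the same scan and mapping yields the same actions, and a truncated fetch can never produce an orphan action.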
const awaitTerminalExit = (
  terminals: TerminalManagerShape,
  input: { readonly threadId: string; readonly terminalId: string | null; readonly timeoutMs?: number },
): Effect.Effect<{ readonly exitCode: number }, WorkflowEventStoreError> => {
  const { terminalId } = input;
  if (terminalId === null) {
    return Effect.succeed({ exitCode: 0 });
  }

  return Effect.gen(function* () {
    const done = yield* Deferred.make<{ readonly exitCode: number }>();
    // Subscribe FIRST so we don't miss an exit event that races with our check.
    const unsubscribe = yield* terminals.subscribe((event) => {
      if (event.type !== "exited" || event.terminalId !== terminalId) {
        return Effect.void;
      }
      return Deferred.succeed(done, { exitCode: event.exitCode ?? 1 }).pipe(Effect.asVoid);
    });
    // THEN check current status: if the terminal already exited before we
    // subscribed, resolve the deferred immediately with its recorded exit code.
    const currentSnapshot = yield* terminals.getSnapshot({
      threadId: input.threadId,
      terminalId,
    });
    if (currentSnapshot !== null && currentSnapshot.status === "exited") {
      yield* Deferred.succeed(done, { exitCode: currentSnapshot.exitCode ?? 1 }).pipe(
        Effect.asVoid,
      );
    }
    const wait = Deferred.await(done);
    const timed =
      input.timeoutMs === undefined
        ? wait
        : wait.pipe(
            Effect.timeoutOption(Duration.millis(input.timeoutMs)),
            Effect.flatMap((result) =>
              Option.match(result, {
                onNone: () =>
                  Effect.fail(
                    new WorkflowEventStoreError({
                      message: "setup terminal wait timed out",
                    }),
                  ),
                onSome: Effect.succeed,
              }),
            ),
          );
    return yield* timed.pipe(
      Effect.mapError(toSetupError("setup terminal wait failed")),
      Effect.ensuring(Effect.sync(unsubscribe)),
    );
  });
};
🟢 Low Layers/SetupRunService.ts:111
When the timeout fires, the code fails with WorkflowEventStoreError at lines 149-152, but then line 159 wraps that error in another WorkflowEventStoreError via mapError. This produces nested WorkflowEventStoreError values with the inner message obscured by the outer "setup terminal wait failed" wrapper. Consider removing the mapError wrapper and failing directly with the descriptive timeout error, or changing the timeout branch to fail with a plain Error so mapError produces a single wrapper.
- return yield* timed.pipe(
-   Effect.mapError(toSetupError("setup terminal wait failed")),
-   Effect.ensuring(Effect.sync(unsubscribe)),
- );
+ return yield* timed.pipe(
+   Effect.ensuring(Effect.sync(unsubscribe)),
+ );
Evidence trail:
apps/server/src/workflow/Layers/SetupRunService.ts lines 26-27: `toSetupError` creates `new WorkflowEventStoreError({ message, cause })`.
apps/server/src/workflow/Layers/SetupRunService.ts lines 149-152: timeout branch fails with `new WorkflowEventStoreError({ message: "setup terminal wait timed out" })`.
apps/server/src/workflow/Layers/SetupRunService.ts line 159: `Effect.mapError(toSetupError("setup terminal wait failed"))` wraps ALL errors, including the already-`WorkflowEventStoreError` from the timeout path, creating nested `WorkflowEventStoreError` values.
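A third option besides the two in the finding, sketched under assumptions: keep the `mapError` call but make the wrapper idempotent, so an error that is already a `WorkflowEventStoreError` passes through untouched. The class below is a plain stand-in for the real error type (which is an Effect error and is not reproduced here), and `toSetupErrorOnce` is a hypothetical name.

```typescript
// Plain stand-in for the real WorkflowEventStoreError; only the
// `{ message, cause }` constructor shape is taken from the evidence trail.
class WorkflowEventStoreError {
  readonly message: string;
  readonly cause: unknown;
  constructor(args: { readonly message: string; readonly cause?: unknown }) {
    this.message = args.message;
    this.cause = args.cause;
  }
}

// Wrap unknown failures once; leave already-typed errors as they are so the
// descriptive "timed out" message is not buried under a generic wrapper.
const toSetupErrorOnce =
  (message: string) =>
  (cause: unknown): WorkflowEventStoreError =>
    cause instanceof WorkflowEventStoreError
      ? cause
      : new WorkflowEventStoreError({ message, cause });
```

Used in place of `toSetupError` in the `mapError` call, this would keep a single wrapper for foreign errors while letting the timeout error surface with its own message.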
const nowIso = DateTime.now.pipe(Effect.map(DateTime.formatIso));

const make = Effect.gen(function* () {
🟡 Medium Layers/TicketPullRequestService.ts:24
The vars.baseRef template variable is assigned input.step.base ?? "" on line 47, but base isn't resolved until lines 70-71 where it falls back to github.defaultBranch(). This causes {{ticket.baseRef}} in templates to render as empty string even when the PR actually targets the default branch. Move the vars construction after resolving base, or populate baseRef with the resolved branch value.
Evidence trail:
apps/server/src/workflow/Layers/TicketPullRequestService.ts lines 43-49 (vars construction with baseRef = input.step.base ?? ""), lines 70-71 (base resolution with defaultBranch fallback), lines 74-81 (template rendering using stale vars). apps/server/src/workflow/instructionTemplate.ts lines 20-28 (applyInstructionTemplate replaces {{ticket.baseRef}} with vars.baseRef).
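The ordering fix suggested above can be illustrated with a simplified, synchronous sketch: resolve the base branch first, then build the template variables from the resolved value. Everything here is a stand-in. The real service runs in Effect and calls `github.defaultBranch()`, the real `applyInstructionTemplate` lives in `instructionTemplate.ts`, and `renderPrBody` is a hypothetical name.

```typescript
// Simplified stand-in for the template variables; only `baseRef` is modeled.
interface TemplateVars {
  readonly baseRef: string;
}

// Stand-in for applyInstructionTemplate: replaces every {{ticket.baseRef}}.
const applyInstructionTemplate = (template: string, vars: TemplateVars): string =>
  template.split("{{ticket.baseRef}}").join(vars.baseRef);

// Resolve the base FIRST, then construct vars, so templates see the branch
// the PR actually targets rather than an empty string.
const renderPrBody = (input: {
  readonly template: string;
  readonly stepBase: string | undefined;
  readonly defaultBranch: () => string; // synchronous stand-in for github.defaultBranch()
}): { readonly base: string; readonly body: string } => {
  const base = input.stepBase ?? input.defaultBranch();
  const vars: TemplateVars = { baseRef: base };
  return { base, body: applyInstructionTemplate(input.template, vars) };
};
```

With no `base` on the step, the template now renders the default branch name, which matches what the PR is opened against.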


Workflow Boards
Per-project kanban boards as event-sourced state machines that drive coding agents. Lanes hold pipelines of steps (agent / script / approval / merge); routing between lanes is decided by step outcomes, JSONLogic predicates over captured output, lane fallbacks, manual actions, or external webhook events. Every ticket gets its own git worktree, and every move is audited and explained.
All screenshots below are from a live run on a mock project ("Snackbase"): the board's agent steps run GPT-5.5 at different reasoning levels per lane (planning = low, implementation = medium with escalation to extra-high on retry, review = high ×3 reviewers), and the "Fix off-by-one" ticket was driven through the pipeline by real agents.
The board
Lanes with per-lane colors and WIP limits, tickets with status stripes, dependency badges ("waiting on 1 dependency"), token budgets ("0 tok / 250k"), and usage roll-ups.
Creating a ticket: description, blocked-by dependencies, and an optional token budget that halts agent steps once spent.
Intake: braindump → tickets
Paste a braindump, pick the agent (provider/model + reasoning effort), and it proposes structured tickets — including dependency edges ("After #1") — which you edit and approve before anything is created. These proposals came from a real GPT-5.5 run.
The workflow editor
Canvas view: lanes as cards, steps typed and colored, routing edges colored by outcome (success/failure/blocked), numbered transitions, dotted action edges, routing-precedence legend. Edits are drag-to-connect or via the inspector; explicit Save lints and writes the board file (`.t3/boards/*.json` is the source of truth).
Selecting a lane dims every edge that doesn't touch it, so dense graphs stay readable:
Agent steps, fully configurable
The implement step: GPT-5.5 · Medium reasoning, 2 retry attempts, and "Escalate on retry" to GPT-5.5 · Extra High — a failed attempt automatically reruns on the stronger configuration.
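The retry-with-escalation behavior can be pictured as a small selection function. This is a sketch under stated assumptions: the `AgentConfig` and `AgentStep` shapes and the `configForAttempt` helper are hypothetical, and "2 retry attempts" is read here as one initial run plus up to two retries, which the description does not spell out.

```typescript
// Hypothetical shapes; the board file's real step schema is not shown here.
interface AgentConfig {
  readonly model: string;
  readonly reasoning: "low" | "medium" | "high" | "extra-high";
}

interface AgentStep {
  readonly base: AgentConfig;
  readonly retryAttempts: number;
  readonly escalateOnRetry?: AgentConfig;
}

// Pick the configuration for a given attempt (0 = first run).
// Returns null once the retry allowance is used up.
const configForAttempt = (step: AgentStep, attempt: number): AgentConfig | null => {
  if (attempt > step.retryAttempts) return null;
  if (attempt > 0 && step.escalateOnRetry !== undefined) return step.escalateOnRetry;
  return step.base;
};
```

For the implement step shown, the first run would use Medium reasoning and each retry would use Extra High.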
The review step: GPT-5.5 · High, captured output (the agent ends with a fenced JSON verdict that routing predicates can read), and a 3-reviewer panel — three independent sessions vote, strict majority wins.
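Verdict capture and panel voting might be sketched as below. The assumptions are labeled in the code: the `verdict` field name and its `approve` / `reject` values are hypothetical, as are all function names. The fallback to earlier assistant messages follows the hardening described earlier in this thread, and how a missing vote counts toward the majority is a guess.

```typescript
type Verdict = "approve" | "reject"; // hypothetical verdict values

// Three backticks, built by concatenation so this example can itself sit in a fence.
const FENCE = "`" + "`" + "`";

// Extract the verdict from the last fenced json block in one message, if any.
const extractVerdict = (message: string): Verdict | null => {
  const open = message.lastIndexOf(FENCE + "json");
  if (open === -1) return null;
  const bodyStart = open + FENCE.length + "json".length;
  const close = message.indexOf(FENCE, bodyStart);
  if (close === -1) return null;
  try {
    const parsed: unknown = JSON.parse(message.slice(bodyStart, close));
    if (typeof parsed === "object" && parsed !== null) {
      const verdict = (parsed as { verdict?: unknown }).verdict;
      if (verdict === "approve" || verdict === "reject") return verdict;
    }
  } catch {
    // Malformed JSON: treat as "no verdict in this message".
  }
  return null;
};

// Walk the turn from the final assistant message backwards, so an earlier
// message supplies the verdict when the final one lacks the fenced block.
const captureVerdict = (assistantMessages: readonly string[]): Verdict | null => {
  for (let i = assistantMessages.length - 1; i >= 0; i--) {
    const verdict = extractVerdict(assistantMessages[i] ?? "");
    if (verdict !== null) return verdict;
  }
  return null;
};

// Strict majority over the whole panel: a verdict wins only with more than
// half of all seats, so missing votes (null) count against both sides.
const strictMajority = (votes: readonly (Verdict | null)[]): Verdict | null => {
  const approvals = votes.filter((vote) => vote === "approve").length;
  const rejections = votes.filter((vote) => vote === "reject").length;
  if (approvals * 2 > votes.length) return "approve";
  if (rejections * 2 > votes.length) return "reject";
  return null;
};
```

With three reviewers, two matching verdicts decide the outcome; a 1-1 split with one missing verdict yields no result, which a board would presumably route as blocked or failed.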
Lane form: merge steps, routing, external events
The Land lane in form view: a merge step (commits the ticket worktree and merges it into the checked-out branch; conflicts block instead of failing), lane success/failure/blocked routes, and external event matchers: a `ci.passed` webhook with a payload predicate moves the ticket to Done.
Dry run
Simulate a hypothetical ticket through the definition you're editing (unsaved changes included) under all-succeed / all-fail / all-block scenarios. It mirrors the engine's exact routing semantics and explains every hop — here it correctly flags that the success path stalls in Review unless a verdict transition matches.
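To make the idea concrete, here is a heavily simplified sketch of a dry run over lane-level routes only. It does not reproduce the engine's routing precedence (step outcomes, predicates, transitions, lane fallbacks); the `LaneDef` shape and `dryRun` function are hypothetical and exist only to show how a stalled path can be detected and explained hop by hop.

```typescript
type Outcome = "success" | "failure" | "blocked";

// Hypothetical, lane-level-only definition; the real board schema is richer.
interface LaneDef {
  readonly id: string;
  readonly terminal?: boolean;
  readonly on: Partial<Record<Outcome, string>>;
}

interface Hop {
  readonly from: string;
  readonly to: string | null; // null means the ticket stops here
  readonly reason: string;
}

// Walk a hypothetical ticket through the lanes, assuming every lane ends
// with the same outcome, and record why each hop happened.
const dryRun = (lanes: readonly LaneDef[], start: string, scenario: Outcome): Hop[] => {
  const byId: Record<string, LaneDef> = {};
  lanes.forEach((lane) => {
    byId[lane.id] = lane;
  });
  const hops: Hop[] = [];
  const visited: Record<string, boolean> = {};
  let current = start;
  while (true) {
    const lane = byId[current];
    if (lane === undefined) {
      hops.push({ from: current, to: null, reason: "unknown lane" });
      break;
    }
    if (lane.terminal === true) break;
    if (visited[current] === true) {
      hops.push({ from: current, to: null, reason: "loop detected" });
      break;
    }
    visited[current] = true;
    const next = lane.on[scenario];
    if (next === undefined) {
      hops.push({ from: current, to: null, reason: `no ${scenario} route: ticket stalls` });
      break;
    }
    hops.push({ from: current, to: next, reason: `lane ${scenario} route` });
    current = next;
  }
  return hops;
};
```

Running the all-succeed scenario over a board whose Review lane has no success route ends with a hop whose `to` is `null`, which is the kind of stall the dry run above reports.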
Version history
Every save is snapshotted per board with diffs and non-destructive revert.
External events
Each board gets a webhook endpoint with a rotating token (shown exactly once) and a copyable curl example. CI, PR automation, or cron can move correlated tickets (by `ticketId` or `workflow/<id>` branch) through their lane's event matchers, with delivery dedupe.
The board reports to you
A digest of the last 24h: shipped/created counts, tokens spent, agent time, and which tickets are waiting on a human.
Living with a ticket
The drawer: "Why is this ticket here?" route explainability (every hop with the rule that caused it), a discussion thread whose comments reach the next agent step as context, per-step status/duration/token usage, and one-click lane actions ("Retry build", "Back to backlog").
Script steps are gated by per-project trust: the first `node --test` run blocks until you allow it:
Every ticket has a case file (`.t3/ticket/<id>/`) the agents write into (here the PLAN.md the planning agent produced), plus the script output and reviewer sessions:
And any agent step's full session is one click away, read-only:
Boards live in the sidebar with hover rename/delete (delete cascades tickets, events, versions, worktrees, and webhook tokens):
Closing the loop: GitHub pull requests
A `pullRequest` step opens a PR from the ticket's worktree branch; a background poller then watches that PR and feeds its lifecycle back into the board as synthetic external events that route the ticket through its lane's matchers: CI passed / failed (with the failing summary posted as a ticket comment), review approved / changes-requested, merged, closed. The loop is durable (two-phase outbox, dedup keyed on the transition) so observations survive restarts and never double-fire. A ticket can flow open-PR → CI-fails → back-to-implement → re-push → CI-passes → review → land with no human in the loop until something genuinely needs you.
Your phone buzzes when a ticket needs you
When a ticket enters a needs-you state (`waiting_on_user` or `blocked`), the server writes a durable outbox row in the same projection transaction; a recovery-gated dispatcher publishes an environment-signed push through the relay to your registered devices (reusing the agent-awareness APNs pipeline). Tapping it deep-links to a minimal mobile ticket action sheet where you can answer the agent's question, approve / reject, or tap a lane action (move / run lane / comment) over the existing operate RPCs. A "Needs you" inbox aggregates every ticket awaiting you across boards and environments. Per-device preferences gate which states notify (`notifyOnApproval` / `notifyOnInput` / `notifyOnBlocked`); completion is silent by default. (Live APNs to a physical phone is one of the two remaining human-run gates.)
Work arrives by itself: GitHub Issues + Asana sync
A board can declare sync sources (a GitHub repo's issues or an Asana project's tasks), and a recovery-gated background syncer pulls them in as tickets and keeps them reconciled: new items become tickets in a destination lane, upstream title/description edits mirror in, and an item completed/closed upstream routes its ticket to a terminal lane. One-way in v1 (the provider abstraction reserves a `writeBack` seam for future two-way), validated by two real providers. Personal access tokens live in the server secret store (never the board file); the sync is idempotent (a durable external-id ↔ ticket mapping + content-hash gating), scan-completeness-gated so a truncated fetch can't false-orphan, and applied through a lock-safe committer (admission→save→tx) so synced tickets flow through all the routing / WIP / notification machinery above. Configure connections + sources in the editor's Sources section and Settings → Work Sources; synced tickets carry a read-only "Synced from {provider}" badge on web + mobile. (Live sync against real GitHub + Asana tokens is the other remaining human-run gate.)
Not shown but included
Durable restart recovery (pipelines, retries, merges, approvals resume safely), WIP queueing with FIFO auto-admission, dependency auto-release, terminal-lane retention TTL with full state cleanup, aging badges and waiting-on-you toasts, ticket search, multi-environment boards, and an event-sourced audit trail under everything.
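WIP queueing with FIFO auto-admission can be sketched in a few lines. This is an in-memory illustration only, with hypothetical names (`LaneWip`, `enterLane`, `leaveLane`); the PR's version is event-sourced and durable, which this sketch does not attempt to model.

```typescript
// Hypothetical in-memory lane state; the real state lives in projections.
interface LaneWip {
  readonly wipLimit: number;
  readonly active: string[];
  readonly queue: string[];
}

// Admit the ticket if the lane has capacity, otherwise queue it in arrival order.
const enterLane = (lane: LaneWip, ticketId: string): "admitted" | "queued" => {
  if (lane.active.length < lane.wipLimit) {
    lane.active.push(ticketId);
    return "admitted";
  }
  lane.queue.push(ticketId);
  return "queued";
};

// Remove a ticket from the lane and auto-admit the oldest queued ticket.
// Returns the id of the ticket that was admitted, or null if none was.
const leaveLane = (lane: LaneWip, ticketId: string): string | null => {
  const index = lane.active.indexOf(ticketId);
  if (index === -1) return null;
  lane.active.splice(index, 1);
  const next = lane.queue.shift();
  if (next === undefined) return null;
  lane.active.push(next);
  return next;
};
```

With a WIP limit of 1, a second and third ticket wait in the queue, and the second is admitted as soon as the first leaves.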
Notes
- `apps/server/src/workflow/**` (Effect TS, event-sourced over SQLite) with contracts in `packages/contracts/src/workflow.ts` and the web UI under `apps/web/src/components/board/**`.
- `infra/relay/**`), the lock-safe event committer + lane-entry cores, and a single consolidated migration (`033_WorkflowSchema`).
- `docs/workflow-demo/` has been removed from the branch tree (kept locally), so there is no demo material to drop before merge.

🤖 Generated with Claude Code
Note
Add workflow kanban boards with agent-driven ticket execution, RPC surface, and mobile inbox
- `.t3/boards/*.json`) with lanes, pipeline steps (agent, script, approval, merge, PR), and routing rules; a linter validates definitions and a dry-run simulator traces ticket paths without mutating state.
- `WORKFLOW_WS_METHODS`) covering board/ticket CRUD, subscriptions, approvals, answers, diffs, intake, digests, and work-source connection management, gated by new `workflow:read` and `workflow:operate` auth scopes.
- `WorkflowEngine`, `WorkflowEventStore`, `WorkflowProjectionPipeline`, `WorkflowReadModel`, `WorkflowRecovery`, `WorkflowTerminalRetentionSweeper`, `WorkflowGitHubPoller`, step executors (real/script/stub), lease service, checkpoint service, and janitorial cleanup of worktrees and threads.
- `WorkflowSourceCommitter`.
- `NeedsYouInboxScreen` aggregating attention tickets across environments and a `TicketActionSheetScreen` for per-ticket actions; push notification payloads now support board/ticket routing alongside threads.
- `publishBoardTicket` relay endpoint that fans out APNs notifications for `waiting_on_user` and `blocked` tickets, respecting per-device `notifyOnBlocked` preferences.

Macroscope summarized 9ca88d4.
Note
High Risk
Large consolidated DB migration and new workflow startup/recovery gates affect boot order and persistence; hidden-thread filtering and mobile operate paths touch auth-scoped ticket mutations and notification routing.
Overview
Extends the mobile app so push notifications can open workflow tickets: ticket deep-link encode/normalize/extract (alongside existing thread links), `notifyOnBlocked` in device registration, routes for `/tickets/...` and a Needs you inbox that aggregates `listNeedsAttentionTickets` across environments, plus a ticket action sheet that loads detail and runs approve/answer/move/comment via workflow RPCs with affordance selection from `attentionKind`.
On the server, folds former workflow migrations into a single `033_WorkflowSchema` (golden-tested), adds `projection_threads.hidden` so workflow-internal agent threads stay out of user-facing orchestration snapshots and agent-awareness relay publish, and bootstraps workflow background work after recovery (board notification dispatcher, source syncer, GitHub poller, terminal retention sweeper). Auth gains `workflow:read` / `workflow:operate` on standard clients and token exchange; workflow HTTP hooks and runtime layers are composed into `makeServerLayer` / startup.
Smaller but user-visible fixes: OpenCode settles interrupts synchronously and ignores stale post-abort idle/error until the next `busy`; failed steers roll back session model/agent state; Claude treats flat `result.usage` as cumulative totals not context size; Codex avoids duplicating a synthetic Standard tier when the catalog already has `default`. Adds sample `.t3/boards/delivery.json` and ignores `.superpowers/` in git.
Reviewed by Cursor Bugbot for commit 9ca88d4. Bugbot is set up for automated code reviews on this repo.