From 797a05e2f81ff9123f7c70d0a345061dbaa06062 Mon Sep 17 00:00:00 2001
From: Nathan Heskew <nathan@harperdb.io>
Date: Wed, 1 Jul 2026 08:16:04 -0600
Subject: [PATCH 1/5] docs(models): namespace registration API + document
 routing & fallback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The merged registration docs describe the free-global `registerBackend` /
`defineBackend` form that 5.2 removed; correct them to the `models.`-namespaced
methods, and add a Routing & Fallback page for the pluggable routing shipped in
harper #1326 / #1537.

- reference/models/backends.md, api.md: registerBackend/defineBackend are methods
  on `models` (not standalone `harper` exports/globals) — fix signatures + example.
- reference/models/routing.md (new): fallback groups (`fallback:` config),
  capability routing (`opts.requires`, tools auto-require), fallback-on-error
  (primary error surfaced, abort short-circuits), custom routers
  (`models.registerRouter`).
- api.md: `requires` option on embed/generate; overview + sidebar cross-links.
- release-notes/v5-lincoln/5.2.md (new): breaking namespacing note + routing.

Refs harper #1326, #1534, #1537.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 reference/models/api.md         |  20 ++++---
 reference/models/backends.md    |  14 ++---
 reference/models/overview.md    |   2 +-
 reference/models/routing.md     | 100 ++++++++++++++++++++++++++++++++
 release-notes/v5-lincoln/5.2.md |  53 +++++++++++++++++
 sidebarsReference.ts            |   5 ++
 6 files changed, 177 insertions(+), 17 deletions(-)
 create mode 100644 reference/models/routing.md
 create mode 100644 release-notes/v5-lincoln/5.2.md
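
Note on the fallback semantics documented in routing.md: a minimal,
self-contained sketch of the behaviour the new page describes (capability
filter, fall through on any error, abort short-circuit, primary error
surfaced). `makeBackend` and `generateWithFallback` are hypothetical names
invented for illustration, and the object-shaped `capabilities()` result is an
assumption; this is not Harper's implementation.

```javascript
// Hypothetical stand-in for a backend; not Harper's ModelBackend type.
function makeBackend(name, caps, generate) {
	return { name, capabilities: () => caps, generate };
}

// Try candidates in group order, keeping only those that satisfy `requires`.
async function generateWithFallback(candidates, requires, input, signal) {
	const eligible = candidates.filter((b) => requires.every((cap) => b.capabilities()[cap]));
	if (eligible.length === 0) {
		throw new Error(`no candidate supports: ${requires.join(', ')}`);
	}
	let primaryError;
	for (const backend of eligible) {
		// A caller abort stops the loop before another backend call is spent.
		if (signal?.aborted) throw signal.reason ?? new Error('aborted');
		try {
			return await backend.generate(input);
		} catch (err) {
			primaryError ??= err; // keep the first failure; later ones are not surfaced
		}
	}
	throw primaryError;
}

// Sample group: the primary always fails, the fallback echoes its input.
const openai = makeBackend('openai', { generate: true, tools: true }, async () => {
	throw new Error('openai: rate limited');
});
const llama = makeBackend('local-llama', { generate: true, tools: false }, async (input) => ({
	content: `llama: ${input}`,
}));
```

With this group, a plain call falls through to `llama`, while a call that
requires `tools` has only `openai` to try and so surfaces its error.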

diff --git a/reference/models/api.md b/reference/models/api.md
index 0a5e7bb1..d425cbd3 100644
--- a/reference/models/api.md
+++ b/reference/models/api.md
@@ -27,6 +27,7 @@ const batch = await models.embed(['first document', 'second document']);
 | Option      | Type                      | Default     | Description                                                                                                                         |
 | ----------- | ------------------------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------------- |
 | `model`     | `string`                  | `'default'` | Logical name of a configured embedding model                                                                                        |
+| `requires`  | `Capability[]`            | —           | Capabilities the chosen backend must satisfy; used by [routing](./routing#capability-routing) to select a candidate in the group    |
 | `inputType` | `'document'` \| `'query'` | —           | Hint for models that distinguish document embeddings from query embeddings (e.g. `nomic-embed-text`); ignored by models that do not |
 | `signal`    | `AbortSignal`             | —           | Cancels the call; composed with the backend's configured `requestTimeoutMs`                                                         |
 
@@ -53,14 +54,15 @@ const result = await models.generate(
 console.log(result.content);
 ```
 
-| Option           | Type                                         | Default     | Description                                                                                            |
-| ---------------- | -------------------------------------------- | ----------- | ------------------------------------------------------------------------------------------------------ |
-| `model`          | `string`                                     | `'default'` | Logical name of a configured generative model                                                          |
-| `temperature`    | `number`                                     | backend     | Sampling temperature, passed through to the backend                                                    |
-| `maxTokens`      | `number`                                     | backend     | Completion token limit, passed through to the backend                                                  |
-| `responseFormat` | `'text'` \| `'json'` \| `{ schema: object }` | `'text'`    | Structured output. `{ schema }` requests output conforming to a JSON Schema; support varies by backend |
-| `toolMode`       | `'return'` \| `'auto'`                       | `'return'`  | How tool calls are handled — see [Tool Calling](./tool-calling)                                        |
-| `signal`         | `AbortSignal`                                | —           | Cancels the call; composed with the backend's configured `requestTimeoutMs`                            |
+| Option           | Type                                         | Default     | Description                                                                                                                                    |
+| ---------------- | -------------------------------------------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| `model`          | `string`                                     | `'default'` | Logical name of a configured generative model                                                                                                  |
+| `requires`       | `Capability[]`                               | —           | Capabilities the backend must satisfy (e.g. `tools`); used by [routing](./routing#capability-routing). Tools in the input auto-require `tools` |
+| `temperature`    | `number`                                     | backend     | Sampling temperature, passed through to the backend                                                                                            |
+| `maxTokens`      | `number`                                     | backend     | Completion token limit, passed through to the backend                                                                                          |
+| `responseFormat` | `'text'` \| `'json'` \| `{ schema: object }` | `'text'`    | Structured output. `{ schema }` requests output conforming to a JSON Schema; support varies by backend                                         |
+| `toolMode`       | `'return'` \| `'auto'`                       | `'return'`  | How tool calls are handled — see [Tool Calling](./tool-calling)                                                                                |
+| `signal`         | `AbortSignal`                                | —           | Cancels the call; composed with the backend's configured `requestTimeoutMs`                                                                    |
 
 Additional options apply only when `toolMode: 'auto'`; they are documented in [Tool Calling](./tool-calling).
 
@@ -107,7 +109,7 @@ Errors detected before the call starts (unknown model name, missing capability)
 models.registerBackend(kind: 'embedding' | 'generative', id: string, backend: ModelBackend): void
 ```
 
-Registers a custom backend under a logical name, selectable by the `model` option on later calls. This is the programmatic path for in-process or third-party backends; pair it with `defineBackend()` to build the backend from a few methods. The same functions are exported from `harper` as `registerBackend` / `defineBackend`. See [Custom backends](./backends#custom-backends) for the full guide.
+Registers a custom backend under a logical name, selectable by the `model` option on later calls. This is the programmatic path for in-process or third-party backends; pair it with `models.defineBackend()` to build the backend from a few methods. Both are methods on `models` — reachable as `models.registerBackend(...)` / `scope.models.registerBackend(...)` (and likewise for `defineBackend`), not standalone `harper` exports. See [Custom backends](./backends#custom-backends) for the full guide.
 
 ## Errors and timeouts
 
diff --git a/reference/models/backends.md b/reference/models/backends.md
index 714e40eb..d419f17c 100644
--- a/reference/models/backends.md
+++ b/reference/models/backends.md
@@ -146,10 +146,10 @@ Custom backends are registered programmatically; the `backend` field in the [`mo
 ### defineBackend()
 
 ```typescript
-defineBackend(spec: DefineBackendSpec): ModelBackend
+models.defineBackend(spec: DefineBackendSpec): ModelBackend
 ```
 
-Builds a `ModelBackend` from the methods it implements. `capabilities()` is derived from which of `embed` / `generate` / `generateStream` are supplied; `tools` and `adapters` cannot be inferred from method presence, so declare them explicitly.
+A method on `models` (reachable as `models.defineBackend(...)` / `scope.models.defineBackend(...)`). Builds a `ModelBackend` from the methods it implements. `capabilities()` is derived from which of `embed` / `generate` / `generateStream` are supplied; `tools` and `adapters` cannot be inferred from method presence, so declare them explicitly.
 
 | Field            | Type       | Default | Description                                                          |
 | ---------------- | ---------- | ------- | -------------------------------------------------------------------- |
@@ -165,23 +165,23 @@ Builds a `ModelBackend` from the methods it implements. `capabilities()` is deri
 ### registerBackend()
 
 ```typescript
-registerBackend(kind: 'embedding' | 'generative', id: string, backend: ModelBackend): void
+models.registerBackend(kind: 'embedding' | 'generative', id: string, backend: ModelBackend): void
 ```
 
-Registers `backend` under the logical name `id` for the given `kind`. Also available as `models.registerBackend(...)` / `scope.models.registerBackend(...)`. Register during component initialization (for example, in `handleApplication`) so the backend is in place before requests arrive; the registry is process-wide, so each worker thread that loads the component registers its own instance.
+Registers `backend` under the logical name `id` for the given `kind`. A method on `models` (reachable as `models.registerBackend(...)` / `scope.models.registerBackend(...)`). Register during component initialization (for example, in `handleApplication`) so the backend is in place before requests arrive; the registry is process-wide, so each worker thread that loads the component registers its own instance.
 
 Use a provider-namespaced `id` (e.g. `local:bge-small`) to avoid collisions when more than one component registers backends.
 
 ```javascript
-import { models, registerBackend, defineBackend } from 'harper';
+import { models } from 'harper';
 import { init, embed } from 'some-local-embedding-library';
 
 await init();
 
-registerBackend(
+models.registerBackend(
 	'embedding',
 	'local:bge-small',
-	defineBackend({
+	models.defineBackend({
 		name: 'local:bge-small',
 		async embed(input) {
 			const texts = Array.isArray(input) ? input : [input];
diff --git a/reference/models/overview.md b/reference/models/overview.md
index eb6e6cc7..ac514b44 100644
--- a/reference/models/overview.md
+++ b/reference/models/overview.md
@@ -7,7 +7,7 @@ title: Models
 
 <VersionBadge version="v5.1.0" />
 
-Harper provides a unified API for calling AI models — text embeddings and text generation — from application code. Models are configured by an operator under logical names; application code requests a model by its logical name and Harper routes the call to the configured backend (Ollama, OpenAI, Anthropic, or Amazon Bedrock) — or to a [custom backend](./backends#custom-backends) a component registers. Swapping providers is a configuration change, not a code change.
+Harper provides a unified API for calling AI models — text embeddings and text generation — from application code. Models are configured by an operator under logical names; application code requests a model by its logical name and Harper routes the call to the configured backend (Ollama, OpenAI, Anthropic, or Amazon Bedrock) — or to a [custom backend](./backends#custom-backends) a component registers. Swapping providers is a configuration change, not a code change. A logical name can also name an ordered group of backends to try, and calls can require specific capabilities — see [Routing & Fallback](./routing).
 
 The API is exposed as a single process-wide `models` object:
 
diff --git a/reference/models/routing.md b/reference/models/routing.md
new file mode 100644
index 00000000..93620625
--- /dev/null
+++ b/reference/models/routing.md
@@ -0,0 +1,100 @@
+---
+id: routing
+title: Routing & Fallback
+---
+
+<!-- Source: harper resources/models/routing.ts, resources/models/Models.ts, resources/models/bootstrap.ts, resources/models/types.ts (v5.2) -->
+
+<VersionBadge version="v5.2.0" />
+
+Every `models` call resolves through a **router** that returns an ordered list of candidate backends for the requested logical name. The facade uses the first candidate whose capabilities satisfy the call and falls through to the next on failure — capability-aware selection and fallback are the same mechanism.
+
+By default the router is a name-lookup + capability filter over a logical name's [fallback group](#fallback-groups). A component can replace it with [`models.registerRouter()`](#custom-routers) for cost-, latency-, or tenant-aware policies.
+
+## Fallback groups
+
+A model entry can name other logical models to try, in order, after itself. Configure the group with `fallback` in the [models configuration](./overview#configuration):
+
+```yaml
+models:
+  generative:
+    default:
+      backend: openai
+      model: gpt-4o-mini
+      fallback: [local-llama] # try `default`, then `local-llama`
+    local-llama:
+      backend: ollama
+      model: llama3.2
+```
+
+A call to the `default` model tries `openai` first; if it fails, the call falls through to `local-llama`. The group is `[default, ...fallback]`, de-duplicated and in order. Groups are rebuilt on every configuration reload, so removing a `fallback` takes effect on the next reload.
+
+## Capability routing
+
+A call can require capabilities of the backend it lands on. The router keeps only the candidates whose `capabilities()` satisfy the requirement, in group order.
+
+- **`opts.requires`** — an explicit list of capabilities (`embed`, `generate`, `stream`, `tools`, `adapters`).
+- **Tools auto-require `tools`** — a `generate()` call whose input carries a `tools` array routes to a tools-capable candidate in the group instead of erroring on a backend that can't do tools.
+
+```javascript
+// Tools in the input auto-route to a tools-capable candidate in the group:
+const reply = await models.generate(
+	{ messages: [{ role: 'user', content: 'What is the weather?' }], tools: [weatherTool] },
+	{ model: 'default' }
+);
+
+// Or require a capability explicitly:
+const explicit = await models.generate('…', { model: 'default', requires: ['tools'] });
+```
+
+If no candidate in the group satisfies the required capabilities, the call throws a capability error naming the primary backend — no request is made.
+
+## Fallback on error
+
+When a candidate fails, `embed` / `generate` record the attempt and try the next candidate. Every attempt — success or failure — is written to [model-call analytics](./analytics), so a fallthrough is observable.
+
+- **Any backend error falls through** to the next candidate. Candidates are heterogeneous — a limit or input error on one backend may succeed on another with different constraints — so whether an error is "worth a fallback" is a router or caller policy, not a facade default.
+- **A caller abort short-circuits.** If the call's `signal` is already aborted, the loop stops and surfaces the abort rather than spending another backend call.
+- **The primary error is surfaced.** If every candidate fails, the error thrown is the **first (primary)** candidate's — the model you asked for, and usually the most diagnostic — not the last fallback's.
+
+`generateStream` resolves to the **first** candidate only. There is no mid-stream fallback: once chunks have been yielded, switching backends would mean replaying already-delivered output.
+
+## Custom routers
+
+Replace the default policy with `models.registerRouter()`. A router is a single **synchronous** `route()` method that returns the ordered candidate backends for a request; an empty array means "no candidate."
+
+```typescript
+models.registerRouter(router: ModelRouter): void
+
+interface ModelRouter {
+	route(req: RouteRequest): ModelBackend[];
+}
+
+interface RouteRequest {
+	kind: 'embedding' | 'generative';
+	logicalName: string; // from opts.model; defaults to 'default'
+	requires: Capability[]; // capabilities the chosen backend must satisfy
+	hints?: Record<string, unknown>; // free-form (tenant, prompt size, …) for custom policies
+}
+```
+
+`route()` is synchronous so the facade keeps its up-front resolution errors (notably `generateStream`'s synchronous throw for an unknown model). A registered router **fully replaces** the default — it is responsible for every `kind` and logical name it should serve. Register during component initialization; the override is process-wide and one router serves the whole process (last registration wins), so it is a deployment/application control rather than something independent components should each set.
+
+```javascript
+import { models } from 'harper';
+
+// Backends this policy chooses among (or capture already-registered ones):
+const fast = models.defineBackend({ name: 'fast', generate: generateFast });
+const cheap = models.defineBackend({ name: 'cheap', generate: generateCheap });
+
+models.registerRouter({
+	route({ kind, requires, hints }) {
+		if (kind !== 'generative') return []; // this policy only routes generation
+		// e.g. prefer the cheap backend for small prompts, the fast one otherwise:
+		const order = hints?.small ? [cheap, fast] : [fast, cheap];
+		return order.filter((b) => requires.every((cap) => b.capabilities()[cap]));
+	},
+});
+```
+
+A custom router that returns no candidates for a backend that _does_ satisfy the requirement surfaces a plain "no routing candidates available" error — not a misleading capability error against a backend that actually supports the call.
diff --git a/release-notes/v5-lincoln/5.2.md b/release-notes/v5-lincoln/5.2.md
new file mode 100644
index 00000000..bbc7e24c
--- /dev/null
+++ b/release-notes/v5-lincoln/5.2.md
@@ -0,0 +1,53 @@
+---
+title: '5.2'
+---
+
+# 5.2 Release Notes
+
+### Patch Releases
+
+All patch release notes for 5.2.x are available on the [releases page](https://github.com/HarperFast/harper/releases?q=v5.2&expanded=true).
+
+## Models
+
+### Breaking: backend registration is namespaced under `models`
+
+`registerBackend` and `defineBackend` — introduced in 5.1.15 as standalone `harper` exports and globals — are now **methods on `models`**, and the standalone exports/globals have been removed.
+
+The migration is mechanical: prefix the calls with `models.` (also reachable as `scope.models` or the `models` global).
+
+```js
+// Before (5.1.x)
+import { registerBackend, defineBackend } from 'harper';
+registerBackend(
+	'embedding',
+	'local:x',
+	defineBackend({
+		/* … */
+	})
+);
+
+// After (5.2)
+import { models } from 'harper';
+models.registerBackend(
+	'embedding',
+	'local:x',
+	models.defineBackend({
+		/* … */
+	})
+);
+```
+
+See [Custom backends](/reference/models/backends#custom-backends).
+
+### Pluggable routing and fallback
+
+Model calls now resolve through a router that returns an ordered list of candidate backends, using the first that satisfies the call's required capabilities and falling through to the next on failure.
+
+- Configure a `fallback` group on a model entry to try alternates, in order, after the primary.
+- Require capabilities per call with `opts.requires`; a `generate()` call that declares tools automatically routes to a tools-capable candidate in the group.
+- Replace the selection policy entirely with `models.registerRouter(...)` for cost-, latency-, or tenant-aware routing.
+
+When a whole fallback chain fails, the error surfaced is the primary backend's (the model you asked for), not the last fallback's.
+
+See [Routing & Fallback](/reference/models/routing).
diff --git a/sidebarsReference.ts b/sidebarsReference.ts
index 08ecfdc5..63957800 100644
--- a/sidebarsReference.ts
+++ b/sidebarsReference.ts
@@ -115,6 +115,11 @@ const sidebars: SidebarsConfig = {
 					id: 'models/backends',
 					label: 'Backends',
 				},
+				{
+					type: 'doc',
+					id: 'models/routing',
+					label: 'Routing & Fallback',
+				},
 				{
 					type: 'doc',
 					id: 'models/analytics',

From 055ae36dc8602884e2c6843657d648d097fa4afd Mon Sep 17 00:00:00 2001
From: Nathan Heskew <nathan@harperdb.io>
Date: Wed, 1 Jul 2026 08:22:00 -0600
Subject: [PATCH 2/5] docs(models): fix release-note cross-links to reference
 (add /v5/ segment)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The absolute links to the reference docs were missing the version segment
(/reference/models/... → /reference/v5/models/...), which failed the
Docusaurus broken-links build check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 release-notes/v5-lincoln/5.2.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/release-notes/v5-lincoln/5.2.md b/release-notes/v5-lincoln/5.2.md
index bbc7e24c..0eef179b 100644
--- a/release-notes/v5-lincoln/5.2.md
+++ b/release-notes/v5-lincoln/5.2.md
@@ -38,7 +38,7 @@ models.registerBackend(
 );
 ```
 
-See [Custom backends](/reference/models/backends#custom-backends).
+See [Custom backends](/reference/v5/models/backends#custom-backends).
 
 ### Pluggable routing and fallback
 
@@ -50,4 +50,4 @@ Model calls now resolve through a router that returns an ordered list of candida
 
 When a whole fallback chain fails, the error surfaced is the primary backend's (the model you asked for), not the last fallback's.
 
-See [Routing & Fallback](/reference/models/routing).
+See [Routing & Fallback](/reference/v5/models/routing).

From 774fee327710e1ec1d1f1ffae6695d15d1da3b84 Mon Sep 17 00:00:00 2001
From: Nathan Heskew <nathan@harperdb.io>
Date: Wed, 1 Jul 2026 08:41:42 -0600
Subject: [PATCH 3/5] docs(models): routing shipped in v5.1.15; drop
 breaking-change release note
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Our model changes (routing #1326, namespaced registration #1534, fallback #1537)
all landed in v5.1.15 — not a future 5.2 — and the free-global registration form
was never in a released version (added by #1405 and removed by #1534 within the
same release), so there is no user-facing breaking change to note.

- routing.md: version badge v5.2.0 -> v5.1.15 (+ source comment)
- remove release-notes/v5-lincoln/5.2.md (wrong version + moot breaking framing)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 reference/models/routing.md     |  4 +--
 release-notes/v5-lincoln/5.2.md | 53 ---------------------------------
 2 files changed, 2 insertions(+), 55 deletions(-)
 delete mode 100644 release-notes/v5-lincoln/5.2.md

diff --git a/reference/models/routing.md b/reference/models/routing.md
index 93620625..f62f8e5b 100644
--- a/reference/models/routing.md
+++ b/reference/models/routing.md
@@ -3,9 +3,9 @@ id: routing
 title: Routing & Fallback
 ---
 
-<!-- Source: harper resources/models/routing.ts, resources/models/Models.ts, resources/models/bootstrap.ts, resources/models/types.ts (v5.2) -->
+<!-- Source: harper resources/models/routing.ts, resources/models/Models.ts, resources/models/bootstrap.ts, resources/models/types.ts (v5.1.15) -->
 
-<VersionBadge version="v5.2.0" />
+<VersionBadge version="v5.1.15" />
 
 Every `models` call resolves through a **router** that returns an ordered list of candidate backends for the requested logical name. The facade uses the first candidate whose capabilities satisfy the call and falls through to the next on failure — capability-aware selection and fallback are the same mechanism.
 
diff --git a/release-notes/v5-lincoln/5.2.md b/release-notes/v5-lincoln/5.2.md
deleted file mode 100644
index 0eef179b..00000000
--- a/release-notes/v5-lincoln/5.2.md
+++ /dev/null
@@ -1,53 +0,0 @@
----
-title: '5.2'
----
-
-# 5.2 Release Notes
-
-### Patch Releases
-
-All patch release notes for 5.2.x are available on the [releases page](https://github.com/HarperFast/harper/releases?q=v5.2&expanded=true).
-
-## Models
-
-### Breaking: backend registration is namespaced under `models`
-
-`registerBackend` and `defineBackend` — introduced in 5.1.15 as standalone `harper` exports and globals — are now **methods on `models`**, and the standalone exports/globals have been removed.
-
-The migration is mechanical: prefix the calls with `models.` (also reachable as `scope.models` or the `models` global).
-
-```js
-// Before (5.1.x)
-import { registerBackend, defineBackend } from 'harper';
-registerBackend(
-	'embedding',
-	'local:x',
-	defineBackend({
-		/* … */
-	})
-);
-
-// After (5.2)
-import { models } from 'harper';
-models.registerBackend(
-	'embedding',
-	'local:x',
-	models.defineBackend({
-		/* … */
-	})
-);
-```
-
-See [Custom backends](/reference/v5/models/backends#custom-backends).
-
-### Pluggable routing and fallback
-
-Model calls now resolve through a router that returns an ordered list of candidate backends, using the first that satisfies the call's required capabilities and falling through to the next on failure.
-
-- Configure a `fallback` group on a model entry to try alternates, in order, after the primary.
-- Require capabilities per call with `opts.requires`; a `generate()` call that declares tools automatically routes to a tools-capable candidate in the group.
-- Replace the selection policy entirely with `models.registerRouter(...)` for cost-, latency-, or tenant-aware routing.
-
-When a whole fallback chain fails, the error surfaced is the primary backend's (the model you asked for), not the last fallback's.
-
-See [Routing & Fallback](/reference/v5/models/routing).

From 2a761eee62ede409c4768f40228dd2cee889b438 Mon Sep 17 00:00:00 2001
From: Nathan Heskew <nathan@harperdb.io>
Date: Wed, 1 Jul 2026 08:51:40 -0600
Subject: [PATCH 4/5] docs(models): clarify fallback-policy and no-candidates
 wording (review)

Address gemini-code-assist clarity notes on routing.md: the facade default IS
to fall back on any error (filtering which errors skip the fallback is the
custom policy), and the no-candidates sentence reads more clearly as "when a
backend does satisfy the requirement" than as "for a backend that does".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 reference/models/routing.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/reference/models/routing.md b/reference/models/routing.md
index f62f8e5b..b04f9b41 100644
--- a/reference/models/routing.md
+++ b/reference/models/routing.md
@@ -53,7 +53,7 @@ If no candidate in the group satisfies the required capabilities, the call throw
 
 When a candidate fails, `embed` / `generate` record the attempt and try the next candidate. Every attempt — success or failure — is written to [model-call analytics](./analytics), so a fallthrough is observable.
 
-- **Any backend error falls through** to the next candidate. Candidates are heterogeneous — a limit or input error on one backend may succeed on another with different constraints — so whether an error is "worth a fallback" is a router or caller policy, not a facade default.
+- **Any backend error falls through** to the next candidate — the facade's default is to fall back on any error. Candidates are heterogeneous (a limit or input error on one backend may succeed on another with different constraints), so _filtering_ which errors should skip the fallback is a router or caller policy, not something the facade decides.
 - **A caller abort short-circuits.** If the call's `signal` is already aborted, the loop stops and surfaces the abort rather than spending another backend call.
 - **The primary error is surfaced.** If every candidate fails, the error thrown is the **first (primary)** candidate's — the model you asked for, and usually the most diagnostic — not the last fallback's.
 
@@ -97,4 +97,4 @@ models.registerRouter({
 });
 ```
 
-A custom router that returns no candidates for a backend that _does_ satisfy the requirement surfaces a plain "no routing candidates available" error — not a misleading capability error against a backend that actually supports the call.
+A custom router that returns no candidates when a backend _does_ satisfy the requirement surfaces a plain "no routing candidates available" error — not a misleading capability error against a backend that actually supports the call.

From 0011798f4c581e2f3b4315136038df0c60a9178a Mon Sep 17 00:00:00 2001
From: Nathan Heskew <nathan@harperdb.io>
Date: Wed, 1 Jul 2026 09:10:00 -0600
Subject: [PATCH 5/5] docs(models): document config-selectable backends
 (backend: <module>)

Fold the config-selectable backend docs from #556 into this PR (superseding it):
a `backend` value that is not a built-in is resolved as a module specifier
(bare package / instance-root-relative / absolute) and imported at startup.
Shipped in v5.1.15 (registerFromModule), so it sits under the existing v5.1.15
Custom backends heading; corrected #556's stray v5.1.16 badge and namespaced the
registerBackend reference.

Refs harper #1471. Supersedes documentation #556.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 reference/models/backends.md | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
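
Note on the factory contract documented below: a minimal sketch of what a
config-selected backend module might look like, assuming the
`({ logicalName, kind, config })` factory signature from the new section. The
local `models` object is a stand-in so the sketch runs on its own (a real
module would use `import { models } from 'harper'`); the `dimensions` config
key, the toy embedding, and the array-of-vectors return shape are all invented
for illustration.

```javascript
// Stand-in for Harper's `models` object; only the two methods the sketch needs.
const registered = new Map();
const models = {
	// Assumed shape: capabilities derived from which methods the spec supplies.
	defineBackend(spec) {
		return {
			...spec,
			capabilities: () => ({
				embed: typeof spec.embed === 'function',
				generate: typeof spec.generate === 'function',
			}),
		};
	},
	registerBackend(kind, id, backend) {
		registered.set(`${kind}/${id}`, backend);
	},
};

// Hypothetical factory; a real module would make this its default export
// (or export it as `register`).
async function register({ logicalName, kind, config }) {
	const dimensions = config.dimensions ?? 4; // invented config key
	models.registerBackend(
		kind,
		logicalName,
		models.defineBackend({
			name: logicalName,
			async embed(input) {
				const texts = Array.isArray(input) ? input : [input];
				// Toy "embedding": a fixed-length vector derived from the text length.
				return texts.map((text) => Array.from({ length: dimensions }, (_, i) => (text.length + i) % 7));
			},
		})
	);
}
```

The factory registers under the logical name it is handed, so the same module
can back several config entries.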

diff --git a/reference/models/backends.md b/reference/models/backends.md
index d419f17c..221b0c45 100644
--- a/reference/models/backends.md
+++ b/reference/models/backends.md
@@ -141,7 +141,7 @@ The model identifier's vendor prefix (`anthropic.`, `meta.`, `amazon.titan-`, `c
 
 Beyond the four built-ins, a component or application can register its own backend — including an in-process one that runs inference locally instead of calling an HTTP service. A registered backend is selected by its logical name through the same `model` option as a configured backend.
 
-Custom backends are registered programmatically; the `backend` field in the [`models` configuration](./overview#configuration) selects only the built-ins above.
+Custom backends can be added two ways: **registered programmatically** (below), or **selected in config** by pointing the `backend` field at a module — see [Config-selectable backends](#config-selectable-backends).
 
 ### defineBackend()
 
@@ -196,3 +196,23 @@ const [vector] = await models.embed('What is Harper?', { model: 'local:bge-small
 ```
 
 A registered backend takes precedence over a configuration entry with the same logical name, because registration runs after the configuration is loaded. A backend whose `capabilities()` disagrees with the methods it actually implements is registered as-is and fails at call time — `defineBackend()` keeps the two consistent.
+
+### Config-selectable backends
+
+A `backend` value in the [`models` configuration](./overview#configuration) that isn't a built-in name is resolved as a **module specifier** and imported at startup; the module's default export — or a `register` export — is a factory that registers the backend. This lets an operator select a custom backend entirely from config, the same way the built-ins are selected.
+
+```yaml
+models:
+  embedding:
+    default:
+      backend: '@acme/embedder' # an installed package
+      model: bge-small
+```
+
+The `backend` specifier is resolved as:
+
+- a **bare package** (`@acme/embedder`) — resolved from the Harper instance's `node_modules`; install the backend as a dependency. Preferred, since it carries no filesystem path and travels with the deployment.
+- an **instance-root-relative path** (`./backends/local.js`) — resolved against the Harper instance root.
+- an **absolute path**.
+
+The factory has the signature `({ logicalName, kind, config }) => void | Promise<void>` and registers via [`models.registerBackend`](#registerbackend); it receives the config entry with `${VAR}` placeholders already resolved. A `backend` that is neither a built-in nor an importable module is logged and skipped at startup, leaving other entries unaffected.