diff --git a/reference/models/api.md b/reference/models/api.md index d9355d6e..0a5e7bb1 100644 --- a/reference/models/api.md +++ b/reference/models/api.md @@ -99,6 +99,16 @@ Each chunk may carry: Errors detected before the call starts (unknown model name, missing capability) throw synchronously; errors during generation propagate through the iterable. +## registerBackend() + + + +```typescript +models.registerBackend(kind: 'embedding' | 'generative', id: string, backend: ModelBackend): void +``` + +Registers a custom backend under a logical name, selectable by the `model` option on later calls. This is the programmatic path for in-process or third-party backends; pair it with `defineBackend()` to build the backend from a few methods. The same functions are exported from `harper` as `registerBackend` / `defineBackend`. See [Custom backends](./backends#custom-backends) for the full guide. + ## Errors and timeouts - An unconfigured logical model name throws a not-found error. The error names the missing logical name only — it does not enumerate configured names. diff --git a/reference/models/backends.md b/reference/models/backends.md index 4fe4b7d3..e01baf42 100644 --- a/reference/models/backends.md +++ b/reference/models/backends.md @@ -134,3 +134,87 @@ models: | `model` | — | Bedrock model identifier; the vendor prefix selects the request format | The model identifier's vendor prefix (`anthropic.`, `meta.`, `amazon.titan-`, `cohere.`, `mistral.`) determines the request/response format Harper uses; an unrecognized prefix is rejected with an error. Tool support depends on the underlying model family. Bedrock embedding APIs accept one text per request, so batch `embed()` calls are issued sequentially. + +## Custom backends + + + +Beyond the four built-ins, a component or application can register its own backend — including an in-process one that runs inference locally instead of calling an HTTP service. 
A registered backend is selected by its logical name through the same `model` option as a configured backend. + +Custom backends can be added two ways: **registered programmatically** (below), or **selected in config** by pointing the `backend` field at a module — see [Config-selectable backends](#config-selectable-backends). + +### defineBackend() + +```typescript +defineBackend(spec: DefineBackendSpec): ModelBackend +``` + +Builds a `ModelBackend` from the methods it implements. `capabilities()` is derived from which of `embed` / `generate` / `generateStream` are supplied; `tools` and `adapters` cannot be inferred from method presence, so declare them explicitly. + +| Field | Type | Default | Description | +| ---------------- | ---------- | ------- | -------------------------------------------------------------------- | +| `name` | `string` | — | Backend name, used in analytics and error messages (required) | +| `embed` | `function` | — | `embed(input, opts)` implementation, if the backend embeds | +| `generate` | `function` | — | `generate(input, opts)` implementation, if the backend generates | +| `generateStream` | `function` | — | `generateStream(input, opts)` implementation, if the backend streams | +| `tools` | `boolean` | `false` | Whether `generate` supports tool calls | +| `adapters` | `boolean` | `false` | Whether the backend supports per-call adapter selection | + +`embed` and `generate` return the shape the built-in backends return: `{ status: 'completed', output, usage? }`, where `output` is `Float32Array[]` for `embed` and `{ content, finishReason }` for `generate`. `generateStream` is an async generator yielding incremental `{ deltaContent?, deltaToolCalls?, finishReason? }` chunks — the same [`generateStream()`](./api#generatestream) shape, not a wrapped result. At least one method must be supplied. A backend that supplies only `generateStream` still satisfies `generate()`: Harper drains the stream into a single result. 
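The drain behavior described above can be pictured with a small sketch. This is an illustration only, not Harper's implementation: the `StreamChunk` and `GenerateResult` types, the `drainStream` helper, and the default `'stop'` finish reason are all hypothetical names and assumptions, modeled on the chunk and result shapes this section describes. Tool-call deltas are left out for brevity.

```typescript
// Hypothetical types mirroring the shapes described in this section;
// they are not Harper's actual declarations.
interface StreamChunk {
  deltaContent?: string;
  finishReason?: string;
}

interface GenerateResult {
  status: 'completed';
  output: { content: string; finishReason: string };
}

// A stream-only backend method: yields text piece by piece, then a finish reason.
async function* generateStream(input: string): AsyncGenerator<StreamChunk> {
  for (const word of input.split(' ')) {
    yield { deltaContent: word + ' ' };
  }
  yield { finishReason: 'stop' };
}

// Illustrative drain: concatenate the deltas and keep the last finishReason seen.
async function drainStream(stream: AsyncIterable<StreamChunk>): Promise<GenerateResult> {
  let content = '';
  let finishReason = 'stop'; // assumed default if the stream never reports one
  for await (const chunk of stream) {
    if (chunk.deltaContent) content += chunk.deltaContent;
    if (chunk.finishReason) finishReason = chunk.finishReason;
  }
  return { status: 'completed', output: { content, finishReason } };
}
```

A backend that passes only a generator like this one to `defineBackend()` would, under these assumptions, still answer a non-streaming call with a single `{ status: 'completed', output }` result.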
+ +### registerBackend() + +```typescript +registerBackend(kind: 'embedding' | 'generative', id: string, backend: ModelBackend): void +``` + +Registers `backend` under the logical name `id` for the given `kind`. Also available as `models.registerBackend(...)` / `scope.models.registerBackend(...)`. Register during component initialization (for example, in `handleApplication`) so the backend is in place before requests arrive; the registry is process-wide, so each worker thread that loads the component registers its own instance. + +Use a provider-namespaced `id` (e.g. `local:bge-small`) to avoid collisions when more than one component registers backends. + +```javascript +import { models, registerBackend, defineBackend } from 'harper'; +import { init, embed } from 'some-local-embedding-library'; + +await init(); + +registerBackend( + 'embedding', + 'local:bge-small', + defineBackend({ + name: 'local:bge-small', + async embed(input) { + const texts = Array.isArray(input) ? input : [input]; + const vectors = await embed(texts); + return { status: 'completed', output: vectors.map((v) => Float32Array.from(v)) }; + }, + }) +); + +// Selected like any other model: +const [vector] = await models.embed('What is Harper?', { model: 'local:bge-small' }); +``` + +A registered backend takes precedence over a configuration entry with the same logical name, because registration runs after the configuration is loaded. A backend whose `capabilities()` disagrees with the methods it actually implements is registered as-is and fails at call time — `defineBackend()` keeps the two consistent. + +### Config-selectable backends + + + +A `backend` value in the [`models` configuration](./overview#configuration) that isn't a built-in name is resolved as a **module specifier** and imported at startup; the module's default export — or a `register` export — is a factory that registers the backend. 
This lets an operator select a custom backend entirely from config, the same way the built-ins are selected. + +```yaml +models: + embedding: + default: + backend: '@acme/embedder' # an installed package + model: bge-small +``` + +The `backend` specifier is resolved as: + +- a **bare package** (`@acme/embedder`) — resolved from the Harper instance's `node_modules`; install the backend as a dependency. Preferred, since it carries no filesystem path and travels with the deployment. +- an **instance-root-relative path** (`./backends/local.js`) — resolved against the Harper instance root. +- an **absolute path**. + +The factory has the signature `({ logicalName, kind, config }) => void | Promise<void>` and registers via [`registerBackend`](#registerbackend); it receives the config entry with `${VAR}` placeholders already resolved. A `backend` that is neither a built-in nor an importable module is logged and skipped at startup, leaving other entries unaffected. diff --git a/reference/models/overview.md b/reference/models/overview.md index d47cdc89..eb6e6cc7 100644 --- a/reference/models/overview.md +++ b/reference/models/overview.md @@ -7,7 +7,7 @@ title: Models -Harper provides a unified API for calling AI models — text embeddings and text generation — from application code. Models are configured by an operator under logical names; application code requests a model by its logical name and Harper routes the call to the configured backend (Ollama, OpenAI, Anthropic, or Amazon Bedrock). Swapping providers is a configuration change, not a code change. +Harper provides a unified API for calling AI models — text embeddings and text generation — from application code. Models are configured by an operator under logical names; application code requests a model by its logical name and Harper routes the call to the configured backend (Ollama, OpenAI, Anthropic, or Amazon Bedrock) — or to a [custom backend](./backends#custom-backends) a component registers. 
Swapping providers is a configuration change, not a code change. The API is exposed as a single process-wide `models` object: