chore(ai-gateway): remove unused MORPH provider and morph warp grep model #4004
Merged
…odel Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
Contributor
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Executive Summary
Clean removal of the unused MORPH gateway provider and
Files Reviewed (5 files)
Reviewed by claude-4.6-sonnet-20260217 · 504,825 tokens
Review guidance: REVIEW.md from base branch
Contributor
Did we get rid of the experimental option already client side?
lambertjosh approved these changes on Jun 12, 2026
Contributor
Author
I don't know, but this stopped working a long time ago.
Contributor
Author
I'll make a PR
iscekic added a commit that referenced this pull request on Jun 13, 2026
…ests

Main merged PR #4004 which deleted the morph provider. The two test files that exercised the rejection branch of modelServesAllGatewayChatApis used morph as the only available Kilo-exclusive model on a chat_completions-only gateway. With morph gone, no real catalog entry satisfies that condition.

Both test files now stub findKiloExclusiveModel via jest.mock/requireActual so that the marker id 'test-exclusive/alibaba-only' returns a KiloExclusiveModel with gateway: 'alibaba'. The real PROVIDERS.ALIBABA definition supports only chat_completions, so the rejection path is exercised without relying on any specific provider file being present in the catalog.
iscekic added a commit that referenced this pull request on Jun 15, 2026
…ficient (#3982) * refactor(auto-routing): move classifier core into contracts package * feat(auto-routing): add tier, routing-table, decision and benchmark contracts * feat(auto-routing): add benchmark-driven decision engine and KV routing table * feat(auto-routing): return routing decisions from /decide * fix(auto-routing): log unparseable routing table JSON before falling back * feat(auto-routing-benchmark): scaffold benchmark worker with D1 schema * feat(auto-routing-benchmark): classifier golden dataset and grading * style(auto-routing-benchmark): apply oxfmt formatting * feat(auto-routing-benchmark): decider golden dataset with deterministic checkers * fix(auto-routing-benchmark): unambiguous whitespace instruction in off-by-one case * feat(auto-routing-benchmark): queue-driven benchmark runs with aggregation and table publish * feat(auto-routing-benchmark): admin config, runs and routing-table endpoints * feat(admin): proxy routes for auto-routing benchmark service * feat(admin): benchmark config, runs and routing table panel * fix(admin): stabilize benchmark runs polling interval dependencies * feat(web): internal token mint endpoint for auto-routing benchmark Mints a short-lived (6h) user API token for a given userId, guarded by the shared internal secret over Authorization: Bearer. The decider benchmark uses this to authenticate the kilo CLI against the gateway under a real user's identity. * feat(auto-routing-benchmark): run decider cases through kilo CLI in a container The decider benchmark now executes each case through the stable kilo CLI (@kilocode/cli) running in a Cloudflare Container, instead of bare OpenRouter chat completions, so it measures the real agent harness. - Container (Dockerfile + dependency-free server.mjs) spawns `kilo run --format json --auto` per case; the kilo user token is injected only as a child-process env var, never logged or written to disk. - BenchRunnerContainer DO + wrangler containers/durable_objects/migrations. 
- kilo-events.ts: pure parser for the CLI JSON event stream (text + cost), tolerant of both part.* and flattened event shapes. - cli-runner.ts: proxies a case to the container and parses the result. - run.ts: chunks decider cases (10/chunk) into per-(model,chunk) queue messages; fetches a short-lived user token once per message; fails fast when benchmarkUserId is unset (plus a defensive per-case guard). Classifier path unchanged. - New benchmarkUserId config field (nullable) on BenchmarkConfig. - vitest aliases @cloudflare/containers to a node-safe stub so unit tests can import the worker entry without the cloudflare:workers chain. * feat(admin): benchmark user id config field Adds a Benchmark user id input to the benchmark config editor (empty -> null), with help text noting decider runs fail until it is set. Round-trips through configToFormState/formStateToConfig. * feat(gateway): add kilo-auto/efficient with blocking auto-routing decisions * chore(auto-routing): drop unused import in routing-table contracts * fix(auto-routing-benchmark): harden decider CLI parsing, grading and retries - accept step_finish (underscore) events so per-case cost is summed - retry once when a CLI session ends with no assistant text - exact checks also accept the last non-empty output line - uniform final-answer suffix on decider prompts - /admin/debug-cli endpoint returning raw CLI events for diagnosis * fix(auto-routing-benchmark): warm up CLI container before concurrent decider cases * fix(auto-routing-benchmark): faster container turnover to avoid instance exhaustion * fix(auto-routing-benchmark): address review findings - serialize CLI runs per container and run decider cases sequentially (the CLI sqlite migration is unsafe under concurrent sessions) - add dead-letter queue and raise container instance ceiling - redact the kilo token from captured stderr before it leaves the container - timing-safe secret comparison and tokenSource audit field on minted tokens - validate 
persisted routing tables before serving them from the admin API - regenerate worker types with the production web base URL - dedupe the routing-table response schema; tier boundary tests * style(auto-routing-benchmark): format wrangler.jsonc * fix(auto-routing-benchmark): guard against double finish on spawn failure Also documents the queue handler's throw-to-retry contract. * fix(auto-routing): break contracts module cycle and keep response schema client-safe madge flagged tiers.ts -> index.ts (type-only but counted); tier derivation now takes a structural subset of ClassifierOutput. The routing-table response schema moves into contracts so the client component no longer pulls config.server (server-only) through the admin client re-export. * chore(admin): drop unused import after schema move * feat(auto-routing): classifier model becomes an admin override over the benchmark winner * feat(auto-routing): manual benchmark runs, classifier override, decider reasoning effort - benchmark runs start only from the admin panel; models with existing results are skipped (latest summaries carried forward) unless forced - classifier benchmark publishes a winner; the admin-set classifier model becomes an override on top of it (clearable from the panel) - decider models accept a reasoning effort, forwarded to the kilo CLI as --variant and mirrored in the routing table and live decisions * refactor(auto-routing): simplification pass - benchmark worker: single run-state read per queue message; decider chunks require caseIds (legacy fallback removed); dead defensive branch and unused DeciderCase.maxTokens dropped; container owns CLI warmup via /warmup instead of a synthetic benchmark case; admin routes use zodJsonValidator like sibling services - apps/web: parseAdminResponse and the worker-admin fetch wrapper are shared modules instead of per-file copies; BenchmarksSection.types re-export shim deleted; dead prevConfigRef guard removed; classifier-model sync effect keyed on stable 
primitives; tier sort order hoisted to module scope * refactor(auto-routing-benchmark): use drizzle for all D1 access * refactor(auto-routing-benchmark): normalize D1 schema and adopt drizzle-kit migrations Eliminate all JSON blob columns from the benchmark worker's D1 database: - Add drizzle-kit, drizzle.config.ts, and pnpm db:generate script - Replace config_json/runtime_json blobs with dedicated tables (config_classifier_models, config_decider_models) and snapshot columns on benchmark_runs (min_accuracy, max_concurrency, benchmark_user_id) - Replace detail_json blob in case_results with explicit diagnostic columns (fallback_reason, retried, exit_code, output_prefix, event_count, last_event_types) - Add run_models table for per-run model config snapshots (enqueued flag, api kind flags, reasoning_effort) - Add carried flag to model_summaries (true = prior-run summary copied in at startRun for skipped models) - Explode routing_tables.table_json into routing_table_candidates rows - Squash old migrations into a single baseline 0000 migration Rewrite storage layer accordingly: apiKindsToFlags/flagsToApiKinds helpers, getConfigRows/replaceConfig, insertRun(run, models, carried), getRunWithModels, saveRoutingTable(table, publishedAt), getLatestRoutingTable returning RoutingTable with safeParse, getClassifierWinner from D1 directly. Move pickClassifierWinner to src/winner.ts (pure, no D1 dep). Add GET /admin/classifier-winner endpoint. Add ClassifierWinnerResponseSchema to contracts. KV puts removed; finalizeRunIfComplete now only deletes KV keys so the auto-routing worker repopulates as a read-through cache. * fix(auto-routing-benchmark): preserve null candidate cost and type drizzle batches Replace `avg_cost_usd ?? 0` with a transparent pass-through cast so a stored NULL is not silently promoted to 0 (cheapest) in the ranking; the downstream RoutingTableSchema.safeParse in getLatestRoutingTable will reject a corrupted table rather than serve it with wrong costs. 
Add a round-trip test confirming null is preserved through routingTableToRows → rowsToRoutingTable. Replace the three `any[]` + `as unknown as Parameters<typeof orm.batch>[0]` patterns in replaceConfig, insertRun, and saveRoutingTable with the typed `BatchItem<'sqlite'>` tuple form from drizzle-orm/batch, removing the eslint-disable suppressions. * refactor(auto-routing-benchmark): make candidate cost non-null to match the contract * feat(auto-routing): read-through KV cache backed by the benchmark service On a KV miss (or corrupt value), fetch routing-table and classifier-winner from the benchmark worker via a service binding, write the result back with a 1h TTL, and return it. Corrupt cached values are treated as misses. The existing 60s isolate-level ttlCached wrappers and fail-closed defaults are unchanged. * fix(auto-routing): await read-through cache writes and surface origin error bodies * ci(workers): run worker predeploy scripts (D1 migrations) before deploy * fix(auto-routing-benchmark): reuse loaded run state in finalize and build tables from the run snapshot * refactor(auto-routing): share ttl cache, single-source schemas and drop dead exports - Move TtlCache/ttlCached to @kilocode/worker-utils; delete the two identical service-local copies and update all import sites - Single-source ReasoningEffortSchema in packages/auto-routing-contracts/tiers.ts; routing-table.ts and index.ts use it; benchmark.ts re-exports for compatibility - Add BenchmarkRunStatus type to contracts; db-schema.ts uses it instead of the inline literal union - Replace local ApiKind in benchmark db.ts with ClassifierApiKind from contracts - Extract DecideBaseParams / buildDecidePayload shared helper from mirror into auto-routing-mirror.ts; auto-routing-decision.ts consumes it - Delete AutoRoutingAdminResult<T> type alias from both admin client files (zero consumers); delete BenchmarkRoutingTableResponseSchema re-export from benchmark admin client (consumers import from contracts 
directly) - Replace route.ts timingSafeStringEqual with timingSafeEqual from @kilocode/encryption; keep extractBearerToken local (jose/jest constraint) - Replace inline 'classifier'|'decider' and api-kind array types in BenchmarksSection.tsx with BenchmarkKind and ClassifierApiKind from contracts * docs(gateway): drop stale keep-in-sync comment on DecideBaseParams * feat(gateway): bill classifier cost to the user for kilo-auto/efficient * fix(gateway): fix type error and remove dead guard in classifier billing * fix(auto-routing): apply decision reasoningEffort to efficient routing * feat(auto-routing): align kilo-auto/efficient catalog with balanced, hide from listing * fix(admin): correct run-summaries colspan in benchmarks section * feat(admin): derive decider model API kinds from gateway provider definitions * feat(auto-routing): drop default routing table; no table means no decision * fix(auto-routing): keep classifier override when benchmark origin is unavailable * docs(contracts): fix stale classifier-winner comment * fix(benchmark): exclude no-cost-signal summaries from routing table ranking * test(benchmark): fix expected ranking order in no-cost-signal test * feat(benchmark): remove fabricated default config; runs require a saved config * chore(benchmark): drop redundant case_results index, regenerate baseline migration * docs(benchmark): fix stale KV comment in wrangler config * feat(auto-routing-benchmark): grade subtaskType and riskLevel, expand classifier dataset to per-pair coverage * feat(auto-routing-benchmark): expand decider dataset to per-pair taxonomy coverage Grow the decider benchmark from 30 to 76 cases so every (taskType, subtaskType) pair in the classifier taxonomy has at least 4 mechanically-checkable cases, with at least 20 cases per difficulty tier (23 low / 31 medium / 22 high). 
- DeciderCase gains subtaskType; ids follow the <taskType>-<subtype>-<topic> scheme used by the classifier dataset - Existing cases retagged with subtypes where they genuinely fit (three system-behavior investigation cases moved to planning_design/system_design, the HTTP 201 lookup to investigation/external_research, and the let-closure case reframed as refactoring/migration) - New agentic_execution cases are self-contained file/terminal tasks deterministic in the node:22-slim container - Tests now enforce per-pair and per-tier quotas from the classifierTaxonomy export, subtype/taskType consistency, regex compilability, and json_equal round-tripping * feat(auto-routing): session-sticky decisions with switch-cost factor Remember the last served model per conversation in the decision-cache DO and keep it while it meets the current tier's accuracy threshold, unless the fresh pick is cheaper by more than the routing table's new switchCostFactor. Switching models discards provider prompt caches, so a session whose difficulty tier oscillates no longer ping-pongs between models. Decisions report a sticky flag in the response and the auto_routing_decision log line. * feat(auto-routing-benchmark): plumb switchCostFactor through config, runs, and routing table Store the new BenchmarkConfig.switchCostFactor in the benchmark_config singleton, snapshot it into benchmark_runs at startRun, and carry the run's snapshotted value into published routing tables so the schema's required RoutingTableSchema.switchCostFactor parses on read. Regenerate the squashed D1 baseline migration, add a Switch cost factor field to the admin config form, and update test fixtures (including the apps/web decision fixtures missing the new required sticky flag). 
* fix(ai-gateway): align efficient fallback with Qwen-for-all-APIs after main merge * refactor(auto-routing): drop per-candidate API-kind plumbing, validate at config save All decider candidates are served via providers that speak every gateway chat API (in practice OpenRouter), so per-candidate supportedApiKinds was dead weight in the contracts, decision engine, D1 schema, and routing table. The one real failure mode - an admin configuring a model whose serving provider is chat-completions-only - is now rejected at config save time instead. * fix(auto-routing): review-pass fixes - never let a heuristic fallback classification re-anchor the session's sticky model (same trust rule as the classification cache) - drop the dead ClassifierApiKindSchema export - rename the decider pages-helper case so its id no longer collides with the classifier dataset's debug-fix-pagination-slice in shared telemetry - trim a stale JSDoc in model-api-kinds.ts * test(ai-gateway): add sticky field to decision fixture * feat(dev): move auto-routing workers into their own opt-in dev group * fix(auto-routing): make the decider benchmark runnable in local dev - Inject KILO_API_URL into the benchmark container via a new KILO_CLI_API_URL worker var so the kilo CLI targets the same gateway the worker mints tokens against (prod default: api.kilo.ai). - Add .dev.vars.example mapping both URLs to the local apps/web dev server (worker-side localhost, container-side host.docker.internal). - Add AUTO_ROUTING_BENCHMARK_WORKER_URL to the apps/web env example so the admin panel proxies to the local benchmark worker instead of prod. - Work around wrangler force-pulling the amd64 container egress proxy on Apple Silicon (its transparent-proxy setsockopt crashes under emulation, failing every local container start) by pinning the arm64 manifest digest via MINIFLARE_CONTAINER_EGRESS_IMAGE in the dev runner. 
* fix(auto-routing): kill the whole CLI process tree on decider case timeout The kilo bin is a Node wrapper that spawns the real CLI binary as a grandchild. SIGKILLing only the wrapper orphaned the grandchild on timeout: it kept running (and spending) and held the stdout/stderr pipes open, so 'close' never fired, the case promise never resolved, and the chunk's queue message hung until the runtime cut it — then retried from case 0 and eventually dead-lettered. Observed live: a runaway agentic case ran 20+ minutes past the 180s cap and wedged the whole run. Spawn the CLI detached so it leads its own process group, kill the group on timeout, and add an after-exit grace backstop so a stray pipe-holder can never hang a case again. * feat(auto-routing): benchmark repetitions, p95 latency, and classifier latency gate - Config gains classifierRepetitions, deciderRepetitions (1-5), and classifierMaxP95LatencyMs (null = no constraint); run rows snapshot the active repetition count and latency budget at start time. - case_results PK extended with rep column; timed_out column added. - model_summaries gains p95_latency_ms (nearest-rank p95 over all rows) and timeouts count. - pickClassifierWinner enforces an optional p95 latency budget: candidates meeting both accuracy and latency are ranked by cost; when none meet the budget, falls back to lowest-p95 among accuracy-meeting models. - classifier_winner contract surfaces the winner's p95LatencyMs. - DECIDER_CHUNK_SIZE reduced from 10 to 5 to stay well within queue consumer wall-clock limits. - Container server propagates timedOut flag through ContainerRunResponse and CliRunResult so timed-out cases are recorded in D1. * fix(auto-routing): correct case_results migration backfill and close test gaps - Migration 0001: replace "rep"/"timed_out" column refs in INSERT...SELECT with literal 0,0 — old table lacks those columns; D1 silently degrades double-quoted unknowns to string literals, corrupting NOT NULL integer rows. 
- Contracts: add BenchmarkConfigSchema defaults test (classifierRepetitions=1, deciderRepetitions=1, classifierMaxP95LatencyMs=1000 when omitted). - Benchmark: extract buildDeciderMessages() pure function; add fan-out test asserting models × reps × ceil(76/5) messages each carrying the correct rep. * feat(admin): benchmark repetitions, latency budget, and p95/timeout columns Add classifier/decider repetitions (1–5) and classifierMaxP95LatencyMs inputs to the Benchmark Config card; add p95 latency and Timeouts columns to the run summaries table; update test fixtures with new fields. * fix(admin): correct runs-table colSpan and cover config form round-trip Set both RunSummariesTable colSpan values back to 6 to match the outer BenchmarkRunsTable's 6-column header (chevron, Kind, Status, Started, Completed, Error). Export configToFormState and formStateToConfig for unit testing and add focused tests covering null-config defaults, round-trip preservation of repetitions/latency fields, and empty-string classifierMaxP95LatencyMs coercing to null. * chore(auto-routing): squash benchmark D1 migrations into one baseline * test(ai-gateway): stop depending on removed morph model in API-kind tests Main merged PR #4004 which deleted the morph provider. The two test files that exercised the rejection branch of modelServesAllGatewayChatApis used morph as the only available Kilo-exclusive model on a chat_completions-only gateway. With morph gone, no real catalog entry satisfies that condition. Both test files now stub findKiloExclusiveModel via jest.mock/requireActual so that the marker id 'test-exclusive/alibaba-only' returns a KiloExclusiveModel with gateway: 'alibaba'. The real PROVIDERS.ALIBABA definition supports only chat_completions, so the rejection path is exercised without relying on any specific provider file being present in the catalog. 
* fix(auto-routing-benchmark): return 400 when starting a run without config The POST /admin/runs handler let startRun's "config not set" precondition error propagate to the global error handler, surfacing a client-side precondition as HTTP 500. Guard the null config in the route handler, mirroring the /admin/debug-cli pattern, and return 400 instead. * fix(auto-routing-benchmark): slice queue fan-out under sendBatch limit Cloudflare Queues caps sendBatch at 100 messages; a decider fan-out is models × reps × ceil(76/5) messages, which clears 100 with as few as two models, so the dispatch is now sliced into <=100-message batches. A mid-dispatch enqueue failure marks the run failed (surfacing in the admin panel) instead of leaving a partially-enqueued run wedged in 'running' until the stale sweep. * fix(ai-gateway): suppress first-usage events for classifier overhead row The internal auto-routing/classifier microdollar row reused the primary request's posthog_distinct_id, so it could emit the generic first_usage / first_microdollar_usage lifecycle events and race the primary usage row — mis-attributing auto-routing/classifier as the user's first model. Drop the distinct id on that row so the events stay gated to the primary usage; DB billing is unaffected (it keys on kiloUserId). * fix(ai-gateway): bill classifier cost regardless of final-provider BYOK The auto-routing classifier always runs on Kilo's own OpenRouter credential, so its cost is owed whether or not the final inference is served via the user's BYOK key. The billing guard skipped the classifier usage row whenever the final provider was BYOK, letting BYOK users incur repeated Kilo-funded classification with no attribution. Bill on positive classifier cost alone; the row stays is_byok:false / user_byok:false. 
* fix(ai-gateway): make efficient classifier spend authenticated + exit-safe Two leaks in the kilo-auto/efficient classifier-billing path: - The paid /decide classifier ran before any access check, including for unauthenticated requests — which are then rejected (efficient resolves to a paid model), spending Kilo-funded inference with no user to attribute. The classifier is now skipped when the request has no authenticated user. - Classifier billing was scheduled only at the end of the successful upstream path, so any intervening early return (abuse block, provider/api-kind rejection, balance/org checks, upstream 4xx) dropped the already-incurred cost. Billing is now registered via after() right after auth resolves, so the row persists regardless of how the request ends. Adds tests for the unauthenticated-skip and downstream-rejection (abuse block) paths. * fix(auto-routing): reject duplicate benchmark model ids at validation config_classifier_models.model and config_decider_models.model are D1 primary keys, but BenchmarkConfigSchema only validated per-entry shape and minimum length. Duplicate ids passed validation and surfaced as an opaque D1 constraint violation (HTTP 500) at replaceConfig. Add a superRefine that flags duplicate (trim-normalized) ids with field-specific issues, so a duplicate save returns an actionable 400. Adds contract + route tests. * fix(auto-routing): reject model-experiment ids as decider candidates Per .specs/model-experiments.md, an experimented public_model_id is a dedicated preview id users must explicitly select and MUST NOT enter kilo-auto candidate sets. Benchmark-config save only validated gateway chat-API support, so an experiment public id could be saved as a decider candidate and then automatically selected for kilo-auto/efficient. Add a status-independent ownership check (findExperimentReservedModelIds queries all experiment statuses, not just the active|paused Redis membership) and reject such ids with a 400. Adds a route test. 
* fix(auto-routing-benchmark): invalidate carried summaries on identity change A model's prior summaries were carried into a new run on model-id match alone, so a run that changed reasoning effort, repetitions, the dataset, or grading/CLI would publish a routing table pairing the current run_models.reasoning_effort with measurements taken under different conditions. Persist an engine_identity (dataset content hash + engine version) per run and carry a prior result only when engine identity, repetitions, AND the model's reasoning_effort all match; otherwise the model is re-benchmarked. Adds the column (migration 0001), identity computation, and carry/invalidation tests. * fix(auto-routing-benchmark): one active run per kind + stale recovery Adds a coherent server-side run-admission state machine: - A partial unique index (one running run per kind) is the atomic backstop; startRun pre-checks for an active run and throws RunAlreadyActiveError, which the admin route maps to 409 instead of creating overlapping runs. - Stale runs are swept on GET /admin/runs (not only when starting a run), so a dead/wedged run is recovered without the UI deadlock where Start is disabled while a run shows 'running'. - finalizeRunIfComplete skips publishing the routing table / classifier winner when a newer run of the same kind has already completed, so a slow older run can't overwrite newer published results. Squashes the branch's D1 migrations into a single baseline now that this schema isn't deployed to a used database. * fix(auto-routing): harden benchmarks admin panel (a11y, overflow, dirty state) Addresses the admin-panel review findings: - Dirty-state tracking in the config editor: a background config refetch (poll / focus) no longer overwrites unsaved edits; the form syncs from server only while pristine, with an explicit "Discard & reload". 
- Invalidate the routing-table / config queries on the running→terminal run transition so published output refreshes instead of showing stale data. - Expandable run rows now expose a keyboard-accessible button with aria-expanded / aria-controls (row click kept as a mouse convenience). - Wide nested summary + routing tables wrapped in overflow-x-auto. - Full run error shown in the expanded row (plus a title tooltip on the truncated cell) instead of being permanently clipped. * docs(auto-routing): add ADR and benchmark service README
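The sendBatch fix described in the commit list above (fan-out of models × reps × ceil(76/5) messages, sliced into batches of at most 100) can be illustrated with a minimal sketch. The function names and the message shape below are hypothetical and are not taken from the benchmark worker; only the 100-message cap, the 76 cases, and the chunk size of 5 come from the commit message.

```typescript
// Sketch only: names are hypothetical; the numbers come from the commit message.
const SEND_BATCH_LIMIT = 100; // per-call sendBatch cap stated in the commit message
const DECIDER_CASES = 76; // size of the decider dataset
const DECIDER_CHUNK_SIZE = 5; // cases per queue message

// Number of queue messages one decider run fans out to.
function deciderMessageCount(models: number, reps: number): number {
  return models * reps * Math.ceil(DECIDER_CASES / DECIDER_CHUNK_SIZE);
}

// Split a list into consecutive slices of at most `limit` items each.
function sliceIntoBatches<T>(items: readonly T[], limit: number = SEND_BATCH_LIMIT): T[][] {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += limit) {
    batches.push(items.slice(i, i + limit));
  }
  return batches;
}

// Two models at five repetitions each: 2 * 5 * 16 = 160 messages.
const messages = Array.from({ length: deciderMessageCount(2, 5) }, (_, i) => ({
  body: { chunk: i }, // placeholder payload, not the real message schema
}));
const batches = sliceIntoBatches(messages);
console.log(batches.map((b) => b.length)); // [ 100, 60 ]
```

Each slice would then be handed to a separate sendBatch call; per the commit message, a failure partway through the dispatch marks the run as failed rather than leaving it in 'running'.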
Summary
- Removed the morph_warp_grep_free_model (morph-warp-grep-v2) Kilo-exclusive model and deleted providers/morph.ts.
- Removed the MORPH gateway provider from provider-definitions.ts and dropped 'morph' from the ProviderId union; the only model routed through this gateway was the one removed above, so it is no longer used.
- Added 'morph-warp-grep-v2' to forbiddenFreeModelIds so stale clients receive an appropriate error (per the AI gateway policy for removed free models).
- … gemma_4_26b_a4b_it_free_model instead of the removed morph model.
Verification
Visual Changes
N/A
Reviewer Notes
The 'morph' entries in openrouter/inference-provider-id.ts (OpenRouterInferenceProviderIdSchema and VercelNonUserByokInferenceProviderIdSchema) are intentionally left in place: those are external upstream inference provider IDs for routing/BYOK on OpenRouter and Vercel, distinct from our internal MORPH gateway provider.
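
For readers unfamiliar with the forbiddenFreeModelIds mechanism mentioned in the Summary, here is a minimal sketch of how a removed free model id might be rejected for stale clients. Only the forbiddenFreeModelIds name and the 'morph-warp-grep-v2' id come from this PR; the Set representation, the checkRequestedModel helper, the result type, the status code, and the message text are all assumptions made for illustration and may differ from the gateway's actual code.

```typescript
// Sketch only: apart from forbiddenFreeModelIds and the model id, every name,
// the status code, and the message text are assumptions.
const forbiddenFreeModelIds: ReadonlySet<string> = new Set(['morph-warp-grep-v2']);

type ModelCheck =
  | { ok: true }
  | { ok: false; status: number; message: string };

// Reject requests for free models that were removed from the catalog, so a
// stale client gets an explicit error instead of a generic lookup failure.
function checkRequestedModel(modelId: string): ModelCheck {
  if (forbiddenFreeModelIds.has(modelId)) {
    return {
      ok: false,
      status: 404, // assumed status code; the real gateway may use another
      message: `The free model "${modelId}" has been removed. Please select another model.`,
    };
  }
  return { ok: true };
}

console.log(checkRequestedModel('morph-warp-grep-v2').ok); // false
```

Keeping the removed id in an explicit deny list, rather than only deleting the catalog entry, is what lets older clients that still send the id receive a targeted message.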