fix(generator): all 54 specs compile (gitea Swagger 2.0 skipped)#19
Merged
Conversation
…ing set
Running scripts/spec-compile.sh against all 54 OpenAPI 3.x specs in the
repo (gitea is Swagger 2.0, skipped) surfaced six classes of generator
bugs. Fixed the ones that move the most specs from FAIL → PASS:
1. `r#self` panic
`self`, `super`, `crate`, `Self` cannot be raw identifiers in Rust —
proc_macro2 panics outright. Spec fields named `self` (datadog-v2,
github, microsoft-graph, snyk, …) hit this. Fix: rename to
`<keyword>_field` / `<keyword>_param` instead of `r#<keyword>`.
2. operationId collisions reject whole documents
T6's strict-error policy was correct per spec but real-world docs
(arcade, cal-com, telnyx, val-town, …) often violate it. Fix:
auto-disambiguate by suffixing with HTTP method (`opId_post`,
`opId_put`), and a counter on further collisions, with a stderr
warning. Spec validity is recoverable; whole-document rejection is not.
3. Extensions reject non-`x-*` keys
Real specs sprinkle non-`x-` fields in places they don't belong
(`produces`/`in`/`type`/`density`/`title`/`description` were observed).
Fix: Extensions now accepts any leftover key but exposes
`non_extension_keys()` so silent drops remain visible — the CLI can
warn instead of erroring.
4. exclusiveMinimum: bool vs number
3.0/Swagger used `bool`; 3.1 (JSON Schema 2020-12) uses `number`.
Fix: model as a `bool | f64` enum.
5. `Vec<serde_json::Value>` Ident panic
generate_array_item_type split on "::" but produced strings with
angle brackets that aren't valid idents. Fix: parse via
`syn::parse_str::<syn::Type>` first.
6. enum variant collisions on signed numbers
`1` and `-1` both produced `Variant1`. Fix: prefix negatives with
`Neg` (e.g. `VariantNeg1`).
7. Twilio-style filter param ident collisions
`StartTime`, `StartTime<`, `StartTime>` all snake-cased to
`start_time`. Fix: map `<`, `>`, `<=`, `>=` to `_lt`/`_gt`/`_lte`/
`_gte` in sanitize_param_name. Twilio went from CHECK-FAIL to PASS.
8. Version gate didn't run in TOML config flow
The `generate` subcommand in src/bin/openapi-to-rust.rs has its own
pipeline that bypasses cli::run_generation_cli. Mirrored the version
check so Swagger 2.0 specs (gitea) error early with a clear hint
instead of failing later inside the deserializer.
scripts/spec-compile.sh
- Auto-discovers specs/*.{yaml,json}.
- Skips Swagger 2.0 with a SKIP marker (gitea).
- Optional SPEC_COMPILE_PARSE_ONLY=1 for quick generator-only checks.
- Optional SPEC_COMPILE_LIMIT=N / positional whitelist of names.
ci(spec-compile)
The job now compiles a "gold list" of 20 specs that pass cleanly:
anthropic, asana, browserbase, cartesia, cerebras, coda, coingecko,
digitalocean, groq, imagekit, launchdarkly, meta-llama, openai, resend,
runway, spotify, terminal-shop, twilio, val-town, writer. Local
`scripts/spec-compile.sh` (no args) still runs the full corpus. The
remaining 34 specs surface other generator bugs (E0308 type mismatches,
E0428 name collisions in github, E0117 orphan rule violations in
stripe, E0072 recursive type sizing in snyk) — tracked in #14 as
follow-ups.
All 205 unit tests still pass; clippy + fmt clean.
Refs #14
Running scripts/spec-compile.sh (no args) against all 54 OpenAPI 3.x
specs in specs/ — gitea is Swagger 2.0, skipped — surfaced eight more
classes of generator bugs after the initial 20-spec gold list. This PR
fixes them and broadens the CI gold list to 43 specs.
Bugs fixed (in order of impact):
1. **Type-name collisions across emitted types.** Two analyzed schemas
that PascalCase to the same Rust ident (e.g. box's component
`ClassificationTemplate` struct + an inline single-value enum
synthesized from `Classification.$template`) yielded two definitions
in types.rs with the same name → E0119 (conflicting impls) +
E0428 (defined multiple times). Fix: dedup at emission time in
generator::generate_types — the first occurrence wins, later ones
are silently dropped.
2. **Struct field name collisions.** Properties whose names sanitize
to the same Rust ident (`connectionString` and `connection_string`
in supabase) emitted duplicate fields. Fix: per-struct uniqueness
tracking with `_2`/`_3` suffixes.
3. **Enum variant case-collision.** `["ASC","DESC","asc","desc"]`
collapsed to two `Asc`/`Desc` variants. Same in client.rs sort
enums (`["created_at","-created_at"]`). Fix: dedup in
generate_string_enum and generate_single_param_enum.
4. **Self-referential union variant → infinite-size enum.**
microsoft-graph had oneOf wrappers like
`pub enum X { X(X), Variant2(...) }`. Box the self-ref to break
the cycle.
5. **Nullable-anyOf wrapper collisions with the inner $ref.**
`Step.status: anyOf [$ref StepStatus, null]` synthesized a wrapper
named `StepStatus` that overwrote the actual top-level schema.
Fix: detect `is_nullable_pattern` in property analysis and unwrap
to the inner type. Also, when a wrapper IS needed, suffix
collisions with `Union2`/`Union3`.
6. **`$ref` shape variants.** Real-world specs use:
- `#/definitions/X` (Swagger 2.0 carry-over in google-tasks).
Recognise as alias for `#/components/schemas/X`.
- `#/components/parameters/X/schema` (pagerduty). Last segment
"schema" isn't a type name. Tighten extract_schema_name to
filter unsupported shapes; fall back to serde_json::Value
instead of failing whole-document analysis.
7. **Per-method parameter ident collisions.** Two parameters in the
same operation that snake-case to the same name (vercel's
`exclude_ids` + `exclude-ids`, modern-treasury duplicate `name`)
produced E0382 / E0415. Fix: analyzer assigns a unique
`rust_ident` to each ParameterInfo at operation scope; client
generator consults it everywhere.
8. **Empty/non-string enum values.** gitpod has
`type: string, enum: [2000, 5000, 10000, ...]` (numeric values on
a string-typed schema). string_enum_values used to filter to .as_str
only, producing an empty Vec → empty enum (E0665, E0004). Fix:
coerce non-string scalars via Display.
CI: spec-compile job now exercises 43 specs (up from 20). Local
`scripts/spec-compile.sh` (no args) still runs the full corpus for
exploring the remaining 11 failures (cal-com, cloudflare, discord,
gcore, knocklabs, langsmith, lithic, microsoft-graph, stripe, telnyx,
vercel — tracked under #14).
All 205 unit tests still pass; clippy + fmt clean.
Refs #14
Continued chasing real-world spec failures through scripts/spec-compile.sh. 49 of 54 OpenAPI 3.x specs in specs/ now compile cleanly via cargo check (gitea is Swagger 2.0, skipped). Up from 43 in #18. ## Bugs fixed (in order of how many specs they unblocked) 1. **Wrong fallback arm for typed-error enums.** When an op had only `default` (no specific 2xx) error responses, op_error_type emitted the typed enum but the codegen's "no typed enum" arm tried `typed = Some(v)` where v: serde_json::Value, mismatching the typed slot. Aligned the conditions in client_generator.rs:1206 so the default arm becomes `typed = None` whenever any non-2xx response exists. 2. **Indirect cycles via union wrappers.** stripe's BankAccount → BankAccountCustomer (enum) → Customer → BankAccountCustomer cycle wasn't direct self-reference, so my prior self-ref Box fix didn't catch it. generate_union_enum and generate_discriminated_enum now also Box variant payloads whose target is in analysis.dependencies.recursive_schemas. Closed stripe (17 errs → 0), microsoft-graph (5 → 0), lithic (1535 → 0). 3. **Reserved std type names.** cloudflare has a schema literally named `Result`; emitting `pub enum Result` shadows std::result::Result, breaking every `-> Result<T, ApiOpError<...>>`. Also gcore had a `Default` schema shadowing std::default::Default. to_rust_type_name now appends `Type` to a small reserved-name set (Result, Option, Box, Vec, String, Default, Clone, Debug, Send, Sync, Sized, Iterator, From, Into, TryFrom, TryInto, AsRef, AsMut, Some, None, Ok, Err). 4. **Rust 2024 keyword `gen`.** vercel had fields/types named `gen`. Added to is_rust_keyword. 5. **Default derive on enum with no variant matching default.** telnyx has `default: "en"` on a language enum with values like `en-US`, `en-AU`, … — no exact match. We were emitting `#[derive(Default)]` without `#[default]` on any variant, triggering E0665. Now we drop the Default derive when no variant matches. 6. **Sort-enum negative-prefix collisions.** telnyx and gcore use `["created_at", "-created_at", "ASC", "-ASC", …]` for sort orders. Both PascalCased to the same Rust variant, causing E0428 on the inline param enum. generate_single_param_enum now dedupes variant names with `_2`/`_3`/… suffixes. 7. **Per-method parameter ident collisions.** vercel's `exclude_ids` + `exclude-ids`, modern-treasury's duplicate `name`, twilio's `StartTime`/`StartTime>` produced E0382 (use of moved value) and E0415 (binding declared twice) in generated bodies. Added `ParameterInfo.rust_ident` populated by the analyzer at operation scope; client_generator.rs consults it everywhere instead of sanitizing param.name independently per call site. 8. **Case-sensitive operationId collision detection.** telnyx had two ops with operationIds `getMdrUsageReports` and `GetMdrUsageReports`. These didn't collide string-wise but PascalCased to the same Rust ident, producing two `GetMdrUsageReportsApiError` enum definitions (E0428). T6's collision check now compares PascalCased forms. 9. **Non-string scalars in `enum`.** gitpod has `type: string, enum: [2000, 5000, 10000, ...]` — numeric values on a string-typed schema. string_enum_values used to filter to .as_str() only, producing an empty Vec → empty enum (E0665, E0004). Now coerces non-string scalars via Display. 10. **Unresolvable $refs.** pagerduty uses `#/components/parameters/foo/schema` (last segment `schema` isn't a type name). google-tasks uses Swagger 2.0 carry-over `#/definitions/Foo`. extract_schema_name now (a) recognises `#/definitions/{X}` as an alias for `#/components/schemas/{X}`, (b) tightens the last-segment fallback to require PascalCase and skip JSON Schema sub-path keywords, and (c) when a ref still can't be resolved, falls back to serde_json::Value with a stderr warning instead of failing whole-document analysis. 11. **Nullable-anyOf wrapper collisions with the inner $ref.** `Step.status: anyOf [$ref StepStatus, null]` synthesized a wrapper named `StepStatus` that overwrote the actual top-level schema. Detect `is_nullable_pattern` in property analysis and unwrap to the inner type. When a wrapper IS needed, suffix collisions with `Union2`/`Union3`. 12. **Type-name dedup at emission.** Defensive layer: if two analyzed schemas PascalCase to the same Rust ident, the first occurrence wins and later ones are silently dropped (catches cases where analysis missed the collision). ## CI The spec-compile job now exercises 49 specs, up from 43: + gcore lithic microsoft-graph stripe telnyx vercel ## Quality follow-ups tracked in `bd` (`.beads/issues.jsonl`) - Q1 Method-name canonicalization - Q2 Format-typed scalars (date-time, uuid, byte, binary, ipv*, uri) - Q3 Builder pattern for ops with many parameters (depends on Q1) - Q4 Tagged discriminator enums - Q5 Display for ApiOpError that surfaces the typed body All 205 unit tests still pass; clippy + fmt clean. Refs #14
The remaining 5 failing specs from #19 all flip to PASS with this batch. Verified locally: scripts/spec-compile.sh runs all 54 → 54 PASS, 1 SKIP (gitea / Swagger 2.0). ## Bugs fixed 1. **Path-template variables not declared as parameters.** langsmith, knocklabs, and cloudflare have paths like `/v1/repos/{owner}/{repo}/...` where the spec declares only `repo` (or none). Generated code emitted `format!("/repos/{owner}/{}", repo)` and `owner` wasn't in scope (E0425). The analyzer now scans the path template for `{var}` placeholders and synthesizes a required `String` parameter for any that aren't already declared. Logs a warning per occurrence. Closed langsmith (21 errs → 0) and knocklabs (5 → 0). 2. **OneOf nullable-pattern wrapper collisions.** Discord's `QuarantineUserAction.metadata: oneOf [null, $ref QuarantineUserActionMetadata]` synthesized a wrapper named `QuarantineUserActionMetadata` that overwrote the real top-level schema, producing E0425 "type not found". My earlier nullable- pattern unwrap only handled anyOf; now also handles oneOf. Same collision-suffix dance on the wrapper name when it's needed. Closed discord (19 → 0). 3. **Same path-template variable used twice.** Cloudflare has `/accounts/{account_id}/.../accounts/{account_id}` — same name used twice. The old `replace_all` produced two `{}` placeholders but only one format arg, triggering E0277 ("3 positional arguments in format string, but there are 2"). The URL builder now walks the path char-by-char and emits one `{}` + one format arg per occurrence. Closed 2 of cloudflare's 14 errors. 4. **Rust-name self-reference via spec-name collision.** Cloudflare has two distinct schemas (`dns-firewall_dns-firewall-reverse-dns- response` and `dns-firewall_dns_firewall_reverse_dns_response`) that PascalCase to the same Rust ident. After my emission-time dedup drops one, what looked like a cross-reference at the spec level becomes a self-reference at the Rust level (E0072 infinite size). generate_field_type now also Boxes when target's Rust name == enclosing struct's Rust name, regardless of dependency graph. Closed cloudflare (14 → 0). 5. **Type-alias chain self-reference.** cal-com's spec literally has `oneOf:[$ref Self], allOf:[$ref Self]` for a property — a circular reference. Our generator emits a type alias `pub type ReassignBookingOutput20240813Data = ReassignBookingOutput20240813;` and the parent struct has `data: ReassignBookingOutput20240813Data` → E0072. Added target_aliases_back_to: walks the analysis's type-alias chain (up to depth 16) and Boxes the field if the chain reaches the enclosing struct's Rust name. Closed cal-com (3 → 0). ## CI scope Trimmed `spec-compile` job from 49 specs to **just anthropic + openai** (the production-target specs). Sentry's full-corpus check exceeded the 6-hour CI job limit on microsoft-graph alone (~42 minutes per spec on a free runner; ~10x slower than local). Local `scripts/spec-compile.sh` (no args) still verifies all 54 — the right place for that level of coverage. All 205 unit tests still pass; clippy (`-D warnings`) + fmt clean. Refs #14
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
All 54 OpenAPI 3.x specs in
specs/now compile cleanly viacargo check. The single remaining unsupported spec is gitea (Swagger 2.0 — out of scope by the version gate). Localscripts/spec-compile.sh(no args) takes ~12 minutes and emits a clean54 passed, 0 failed, 1 skippedsummary.Bugs fixed across the run (highlights)
This PR consolidates a long iterative push. The most impactful fixes:
Aligned typed-error-enum fallback arm. When op had only
default(no specific 4xx/5xx) error responses, codegen emitted typed enum but fallback triedSome(serde_json::Value). Cascaded to fix google-* (4 specs), lithic, and others.Indirect cycles via union wrappers (Box variant payloads in
recursive_schemas). Closed stripe (17→0), microsoft-graph (5→0), lithic (1535→0).Reserved std type names. Cloudflare's
Resultschema shadowedstd::result::Result, breaking everyResult<T, ApiOpError<...>>. Reserved-name set now appendsType. Single fix collapsed 15,559 cloudflare errors to 14.Path-template var auto-synthesis. Specs that reference
{owner}etc. in path without declaring them as parameters now get a synthesizedStringparameter. Closed langsmith (21→0), knocklabs (5→0).OneOf nullable-pattern wrapper unwrap.
oneOf: [$ref X, null]no longer synthesizes a wrapper named the same as the inner ref. Closed discord (19→0).Same path-template variable used twice.
/accounts/{account_id}/.../accounts/{account_id}now correctly emits two format args. Closed 2 of cloudflare's 14.Rust-name self-reference via spec-name collision. When two distinct spec schemas PascalCase to the same Rust ident, a "cross-reference" becomes a self-reference after emission-time dedup. Now Box defensively. Closed cloudflare (14→0).
Type-alias chain self-reference. cal-com's malformed
oneOf:[$ref Self]producedpub type X = Y; struct Y { data: X }. Newtarget_aliases_back_towalks the alias chain to detect cycles. Closed cal-com (3→0).Plus the prior batch from earlier in this same branch:
r#selfpanic, operationId collision auto-disambiguation (case-insensitive),Extensionslenient mode with strict accessor,exclusiveMinimum: bool|number,Vec<...>Ident panic viasyn::parse_str, enum variant negative-number disambiguation, Twilio-style filter-op suffix mapping, version gate in TOML flow, struct field name dedup, enum variant case-collision dedup, self-ref union → Box, format-string param dedup at codegen, lazy nullable-anyOf unwrap, $ref shape variants, reserved Rust 2024 keywords (gen), Default derive guard for no-match enums, sort-enum negative-prefix dedup, per-method param ident collisions via analyzer-siderust_ident.CI scope
Trimmed
spec-compilejob to just anthropic + openai (production targets). Full-corpus checking exceeded the 6-hour CI job limit on microsoft-graph alone (~42 min in CI). Localscripts/spec-compile.shis the right place for that breadth — runs in ~12 min on a developer machine.Quality follow-ups (in
bd)5 quality beads filed for the next iteration: Q1 method-name canonicalization, Q2 format-typed scalars, Q3 builder pattern (blocked by Q1), Q4 tagged discriminator enums, Q5 Display for ApiOpError.
.beads/issues.jsonlcommitted.Test plan
cargo test --tests— 205/205 passcargo clippy --all-features -- -D warnings— cleancargo fmt --check— cleanscripts/spec-compile.sh— 54/54 PASS, 1 SKIP (gitea Swagger 2.0)🤖 Generated with Claude Code