fix(generator): compile 43 specs (was 20), broaden CI corpus #18
Closed
lightsofapollo wants to merge 2 commits into
…ing set
Running scripts/spec-compile.sh against all 54 OpenAPI 3.x specs in the
repo (gitea is Swagger 2.0, skipped) surfaced six classes of generator
bugs. Fixed the ones that move the most specs from FAIL → PASS:
1. `r#self` panic
`self`, `super`, `crate`, `Self` cannot be raw identifiers in Rust —
proc_macro2 panics outright. Spec fields named `self` (datadog-v2,
github, microsoft-graph, snyk, …) hit this. Fix: rename to
`<keyword>_field` / `<keyword>_param` instead of `r#<keyword>`.
2. operationId collisions reject whole documents
T6's strict-error policy was correct per spec but real-world docs
(arcade, cal-com, telnyx, val-town, …) often violate it. Fix:
auto-disambiguate by suffixing with HTTP method (`opId_post`,
`opId_put`), and a counter on further collisions, with a stderr
warning. Spec validity is recoverable; whole-document rejection is not.
3. Extensions reject non-`x-*` keys
Real specs sprinkle non-`x-` fields in places they don't belong
(`produces`/`in`/`type`/`density`/`title`/`description` were observed).
Fix: Extensions now accepts any leftover key but exposes
`non_extension_keys()` so silent drops remain visible — the CLI can
warn instead of erroring.
4. exclusiveMinimum: bool vs number
3.0/Swagger used `bool`; 3.1 (JSON Schema 2020-12) uses `number`.
Fix: model as a `bool | f64` enum.
5. `Vec<serde_json::Value>` Ident panic
generate_array_item_type split on "::" but produced strings with
angle brackets that aren't valid idents. Fix: parse via
`syn::parse_str::<syn::Type>` first.
6. enum variant collisions on signed numbers
`1` and `-1` both produced `Variant1`. Fix: prefix negatives with
`Neg` (e.g. `VariantNeg1`).
7. Twilio-style filter param ident collisions
`StartTime`, `StartTime<`, `StartTime>` all snake-cased to
`start_time`. Fix: map `<`, `>`, `<=`, `>=` to `_lt`/`_gt`/`_lte`/
`_gte` in sanitize_param_name. Twilio went from CHECK-FAIL to PASS.
8. Version gate didn't run in TOML config flow
The `generate` subcommand in src/bin/openapi-to-rust.rs has its own
pipeline that bypasses cli::run_generation_cli. Mirrored the version
check so Swagger 2.0 specs (gitea) error early with a clear hint
instead of failing later inside the deserializer.
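
As a rough illustration of fix 2 (operationId collisions), the disambiguation could look like the sketch below. The function name `disambiguate_operation_id`, the `HashSet` bookkeeping, and the exact counter format (`_2`, `_3`) are assumptions made for this example, not the generator's actual code.

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns a unique operation name and records it in `seen`.
/// First use keeps the original id; a collision gets the lowercase HTTP method
/// as a suffix; further collisions get a numeric counter (format assumed here).
fn disambiguate_operation_id(seen: &mut HashSet<String>, op_id: &str, method: &str) -> String {
    // `insert` returns true when the name was not present yet.
    if seen.insert(op_id.to_string()) {
        return op_id.to_string();
    }
    let with_method = format!("{}_{}", op_id, method.to_ascii_lowercase());
    let mut candidate = with_method.clone();
    let mut counter = 2;
    while !seen.insert(candidate.clone()) {
        candidate = format!("{with_method}_{counter}");
        counter += 1;
    }
    // Warn instead of rejecting the whole document.
    eprintln!("warning: duplicate operationId `{op_id}` renamed to `{candidate}`");
    candidate
}

fn main() {
    let mut seen = HashSet::new();
    for (id, method) in [("listUsers", "GET"), ("listUsers", "POST"), ("listUsers", "POST")] {
        println!("{}", disambiguate_operation_id(&mut seen, id, method));
    }
}
```

The point of the design is that an invalid but recoverable spec still yields distinct Rust method names, and the warning keeps the rename visible.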
scripts/spec-compile.sh
- Auto-discovers specs/*.{yaml,json}.
- Skips Swagger 2.0 with a SKIP marker (gitea).
- Optional SPEC_COMPILE_PARSE_ONLY=1 for quick generator-only checks.
- Optional SPEC_COMPILE_LIMIT=N / positional whitelist of names.
ci(spec-compile)
The job now compiles a "gold list" of 20 specs that pass cleanly:
anthropic, asana, browserbase, cartesia, cerebras, coda, coingecko,
digitalocean, groq, imagekit, launchdarkly, meta-llama, openai, resend,
runway, spotify, terminal-shop, twilio, val-town, writer. Local
`scripts/spec-compile.sh` (no args) still runs the full corpus. The
remaining 34 specs surface other generator bugs (E0308 type mismatches,
E0428 name collisions in github, E0117 orphan rule violations in
stripe, E0072 recursive type sizing in snyk) — tracked in #14 as
follow-ups.
All 205 unit tests still pass; clippy + fmt clean.
Refs #14
Running scripts/spec-compile.sh (no args) against all 54 OpenAPI 3.x
specs in specs/ — gitea is Swagger 2.0, skipped — surfaced eight more
classes of generator bugs after the initial 20-spec gold list. This PR
fixes them and broadens the CI gold list to 43 specs.
Bugs fixed (in order of impact):
1. **Type-name collisions across emitted types.** Two analyzed schemas
that PascalCase to the same Rust ident (e.g. box's component
`ClassificationTemplate` struct + an inline single-value enum
synthesized from `Classification.$template`) yielded two definitions
in types.rs with the same name → E0119 (conflicting impls) +
E0428 (defined multiple times). Fix: dedup at emission time in
generator::generate_types — the first occurrence wins, later ones
are silently dropped.
2. **Struct field name collisions.** Properties whose names sanitize
to the same Rust ident (`connectionString` and `connection_string`
in supabase) emitted duplicate fields. Fix: per-struct uniqueness
tracking with `_2`/`_3` suffixes.
3. **Enum variant case-collision.** `["ASC","DESC","asc","desc"]`
collapsed to two `Asc`/`Desc` variants. Same in client.rs sort
enums (`["created_at","-created_at"]`). Fix: dedup in
generate_string_enum and generate_single_param_enum.
4. **Self-referential union variant → infinite-size enum.**
microsoft-graph had oneOf wrappers like
`pub enum X { X(X), Variant2(...) }`. Box the self-ref to break
the cycle.
5. **Nullable-anyOf wrapper collisions with the inner $ref.**
`Step.status: anyOf [$ref StepStatus, null]` synthesized a wrapper
named `StepStatus` that overwrote the actual top-level schema.
Fix: detect `is_nullable_pattern` in property analysis and unwrap
to the inner type. Also, when a wrapper IS needed, suffix
collisions with `Union2`/`Union3`.
6. **`$ref` shape variants.** Real-world specs use:
- `#/definitions/X` (Swagger 2.0 carry-over in google-tasks).
Recognise as alias for `#/components/schemas/X`.
- `#/components/parameters/X/schema` (pagerduty). Last segment
"schema" isn't a type name. Tighten extract_schema_name to
filter unsupported shapes; fall back to serde_json::Value
instead of failing whole-document analysis.
7. **Per-method parameter ident collisions.** Two parameters in the
same operation that snake-case to the same name (vercel's
`exclude_ids` + `exclude-ids`, modern-treasury duplicate `name`)
produced E0382 / E0415. Fix: analyzer assigns a unique
`rust_ident` to each ParameterInfo at operation scope; client
generator consults it everywhere.
8. **Empty/non-string enum values.** gitpod has
`type: string, enum: [2000, 5000, 10000, ...]` (numeric values on
a string-typed schema). string_enum_values used to filter to .as_str
only, producing an empty Vec → empty enum (E0665, E0004). Fix:
coerce non-string scalars via Display.
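
To make fix 4 (self-referential union variants) concrete, here is a minimal sketch of the boxed shape. The enum `X` mirrors the placeholder in the item above; the `String` payload and the `depth` helper are hypothetical, added only to show that the boxed type compiles and can be walked.

```rust
// Without the Box, `enum X { X(X), Variant2(String) }` is rejected with
// E0072 (recursive type has infinite size). Boxing the self-reference gives
// the variant a fixed, pointer-sized payload.
#[derive(Debug)]
enum X {
    X(Box<X>),
    Variant2(String), // payload type is an assumption for this example
}

/// Hypothetical helper: counts how many self-referential layers wrap the leaf.
fn depth(value: &X) -> usize {
    match value {
        X::X(inner) => 1 + depth(inner),
        X::Variant2(_) => 0,
    }
}

fn main() {
    let nested = X::X(Box::new(X::X(Box::new(X::Variant2("leaf".to_string())))));
    println!("{:?} has depth {}", nested, depth(&nested));
}
```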
CI: spec-compile job now exercises 43 specs (up from 20). Local
`scripts/spec-compile.sh` (no args) still runs the full corpus for
exploring the remaining 11 failures (cal-com, cloudflare, discord,
gcore, knocklabs, langsmith, lithic, microsoft-graph, stripe, telnyx,
vercel — tracked under #14).
All 205 unit tests still pass; clippy + fmt clean.
Refs #14
This was referenced May 8, 2026
Summary
After the previous PR's gold list of 20 specs, ran `scripts/spec-compile.sh` (no args) against all 54 OpenAPI 3.x specs in `specs/` (gitea is Swagger 2.0, skipped) and chased the failures. 43/54 now compile cleanly via `cargo check`. CI's `spec-compile` job now gates on the broader 43-spec list.

Bugs fixed
In order of how many specs they unblocked:
1. **Type-name collisions across emitted types.** Two analyzed schemas that PascalCase to the same Rust ident (e.g. box's `ClassificationTemplate` struct + a synthesized inline enum from `Classification.$template`) emitted two definitions → E0119/E0428. Dedup at emission time; first occurrence wins.
2. **Struct field name collisions.** `connectionString` + `connection_string` in supabase both → `connection_string`. Per-struct uniqueness tracking with `_2`/`_3` suffixes.
3. **Enum variant case collisions.** `[ASC, DESC, asc, desc]` collapsed to two `Asc`/`Desc` variants; same in client.rs sort enums (`[created_at, -created_at]`). Dedup in both `generate_string_enum` and `generate_single_param_enum`.
4. **Self-referential union variant → infinite-size enum.** microsoft-graph had `pub enum X { X(X), V2(...) }`. Box the self-ref.
5. **Nullable-anyOf wrapper collisions.** `Step.status: anyOf [$ref StepStatus, null]` synthesized a wrapper named `StepStatus` that overwrote the top-level schema. Detect `is_nullable_pattern` in property analysis and unwrap; suffix collisions with `Union2` when a wrapper IS needed.
6. **`$ref` shape variants.** `#/definitions/X` (Swagger 2.0 carry-over in google-tasks) recognized as alias. `#/components/parameters/X/schema` (pagerduty) falls back to `serde_json::Value` instead of failing whole-document analysis.
7. **Per-method parameter ident collisions.** vercel's `exclude_ids`/`exclude-ids`, modern-treasury's duplicate `name`, twilio's `StartTime`/`StartTime>`. Analyzer now assigns each `ParameterInfo.rust_ident` at operation scope, consulted by the client generator everywhere.
8. **Numeric enum values on string-typed schemas.** gitpod has `type: string, enum: [2000, 5000, ...]`. Coerce non-string scalars via Display instead of filtering them out (which produced an empty enum, E0665/E0004).

Conformance
Remaining 11 (tracked under #14): cal-com, cloudflare, discord, gcore, knocklabs, langsmith, lithic, microsoft-graph, stripe, telnyx, vercel. Top error classes: E0308 (cloudflare 7k+, lithic 662, stripe 4), E0428 (gcore 50, telnyx 39), E0425 (langsmith 20, knocklabs 3, discord 6).

Test plan
- `cargo test --tests`: 205/205 pass
- `cargo clippy --all-features -- -D warnings`: clean
- `cargo fmt --check`: clean
- `scripts/spec-compile.sh` against all 54 specs locally: 43 compile cleanly, 11 fail with categorized errors

🤖 Generated with Claude Code