fix(generator): compile 20 real-world specs, gate CI on the working set #17

Closed
lightsofapollo wants to merge 1 commit into main from feat/spec-compile-all

@lightsofapollo
Contributor

Summary

Ran scripts/spec-compile.sh against the 50+ OpenAPI specs in specs/ to surface generator bugs across real-world documents. Of the 54 OpenAPI 3.x specs (gitea is Swagger 2.0 and is skipped), 20 now compile cleanly via cargo check. The remaining 34 surface other generator bugs that need deeper work; this PR documents them and gates CI on the working set so we don't regress what works.

Bugs fixed (in order of impact)

  1. r#self panic: self/super/crate/Self cannot be raw identifiers in Rust. proc_macro2 panics outright when fed r#self. Specs that name a field self (datadog-v2, github, microsoft-graph, snyk, google-calendar, digitalocean, launchdarkly, …) all hit this. Fix: rename to <keyword>_field / <keyword>_param.

  2. operationId collisions blocked whole documents — T6's strict-error policy was correct per spec but real-world docs (arcade, cal-com, telnyx, val-town) often violate it. Auto-disambiguate by suffixing with HTTP method, with a stderr warning.

  3. Extensions rejected non-x-* keys: produces, in, type, density, title, description were observed on objects where they don't belong. Loosened Extensions to accept any leftover key and exposed non_extension_keys() so silent drops remain visible.

  4. exclusiveMinimum: bool vs number — 3.0/Swagger used bool; 3.1 uses number. Modeled as a bool | f64 enum.

  5. Vec<serde_json::Value> Ident panic: generate_array_item_type split on :: and produced strings with angle brackets that aren't valid Rust idents. Now parses via syn::parse_str::<syn::Type> first.

  6. enum variant collisions on signed numbers: 1 and -1 both produced Variant1. Prefix negatives with Neg (e.g. VariantNeg1).

  7. Twilio-style filter param ident collisions: StartTime, StartTime<, StartTime> all snake-cased to start_time. Map <, >, <=, >= to _lt/_gt/_lte/_gte in sanitize_param_name.

  8. Version gate bypassed by the TOML config flow: the generate subcommand has its own pipeline that bypassed cli::run_generation_cli. Mirrored the check so Swagger 2.0 specs error early with a clear hint instead of failing later inside the deserializer.
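
The three naming fixes above (items 1, 6 and 7) can be sketched with the standard library alone. This is an illustration under assumptions, not the generator's code: `safe_field_ident`, `int_variant_name` and `sanitize_param_sketch` are hypothetical names, the keyword lists are a small sample, and the exact snake-casing rules in the real `sanitize_param_name` may differ.

```rust
/// Keywords that Rust refuses to accept as raw identifiers (`r#self` is invalid).
const NON_RAW_KEYWORDS: [&str; 4] = ["self", "super", "crate", "Self"];

/// A small, non-exhaustive sample of keywords that can be escaped as `r#kw`.
const RAW_OK_KEYWORDS: [&str; 6] = ["type", "match", "ref", "fn", "move", "loop"];

/// Hypothetical helper: turn a spec field name into a usable Rust field ident.
fn safe_field_ident(name: &str) -> String {
    if NON_RAW_KEYWORDS.contains(&name) {
        // `r#self` would panic in proc_macro2, so rename instead of escaping.
        format!("{name}_field")
    } else if RAW_OK_KEYWORDS.contains(&name) {
        format!("r#{name}")
    } else {
        name.to_string()
    }
}

/// Hypothetical helper: name an enum variant for an integer value so that
/// `1` and `-1` no longer collide.
fn int_variant_name(value: i64) -> String {
    if value < 0 {
        format!("VariantNeg{}", value.unsigned_abs())
    } else {
        format!("Variant{value}")
    }
}

/// Hypothetical helper: snake-case a parameter name, turning a trailing
/// comparison operator into a suffix instead of dropping it.
fn sanitize_param_sketch(raw: &str) -> String {
    // Check two-character operators before their one-character prefixes.
    let (base, suffix) = if let Some(b) = raw.strip_suffix("<=") {
        (b, "_lte")
    } else if let Some(b) = raw.strip_suffix(">=") {
        (b, "_gte")
    } else if let Some(b) = raw.strip_suffix('<') {
        (b, "_lt")
    } else if let Some(b) = raw.strip_suffix('>') {
        (b, "_gt")
    } else {
        (raw, "")
    };
    let mut out = String::new();
    for (i, ch) in base.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            // Insert a word boundary before every capital except the first.
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else if ch.is_ascii_alphanumeric() {
            out.push(ch);
        } else {
            out.push('_');
        }
    }
    out + suffix
}
```

With these helpers, `StartTime`, `StartTime<` and `StartTime>` map to three distinct names (`start_time`, `start_time_lt`, `start_time_gt`), and a field named `self` becomes `self_field` rather than the invalid `r#self`.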

CI: spec-compile gold list

The spec-compile job now verifies a gold list of 20 specs that compile cleanly: anthropic, asana, browserbase, cartesia, cerebras, coda, coingecko, digitalocean, groq, imagekit, launchdarkly, meta-llama, openai, resend, runway, spotify, terminal-shop, twilio, val-town, writer.

Local scripts/spec-compile.sh (no args) still runs the full corpus for development. The script auto-discovers specs/*.{yaml,json}, skips Swagger 2.0 (gitea), and supports SPEC_COMPILE_PARSE_ONLY=1 for quick generator-only checks.

What's still failing (34 specs, tracked in #14)

| Error class | Affected | Root cause |
| --- | --- | --- |
| E0308 mismatched types | stripe (1632), box (99), vercel (111), … | Codegen emits wrong types in some property paths |
| E0428 name defined multiple times | github, gcore | Type/variant deduplication misses some collisions |
| E0117 orphan rule | stripe, gcore | Tries to impl foreign traits on foreign types |
| E0072 recursive type infinite size | snyk | Missing Box<T> for some recursive references |
| E0382 use of moved value | (now reduced) | String reused in generated method body |

Each is a focused follow-up. Once a fix lands, the affected spec moves from CHECK-FAIL to PASS and can be added to the gold list.
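
For the E0072 row, the shape of the eventual fix is well known even though the follow-up work is not done here. The sketch below uses a hypothetical `Node` type (not taken from the snyk spec) to show why a self-referential schema needs `Box<T>` on direct references, while references through `Vec` are already sized.

```rust
/// A self-referential schema, e.g. a tree node whose `parent` is another node.
/// Writing `parent: Option<Node>` would fail with E0072 because the compiler
/// cannot compute a finite size; `Box` stores the value behind a pointer.
#[derive(Debug, Default)]
struct Node {
    id: String,
    parent: Option<Box<Node>>,
    // `Vec` already stores its elements on the heap, so no `Box` is needed.
    children: Vec<Node>,
}

/// Count how many ancestors a node has by walking the boxed `parent` links.
fn depth(node: &Node) -> usize {
    let mut depth = 0;
    let mut current = node;
    while let Some(parent) = current.parent.as_deref() {
        depth += 1;
        current = parent;
    }
    depth
}
```

A generator would presumably need to detect reference cycles between schemas and emit `Box` only on the edges that close a cycle; how the project plans to do that is not stated in this PR.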

Test plan

  • cargo test --tests — 205/205 pass
  • cargo clippy --all-features -- -D warnings — clean
  • cargo fmt --check — clean
  • scripts/spec-compile.sh against all 54 specs — 20 compile, 34 fail with categorized errors
  • CI spec-compile job gates on the 20 working specs

🤖 Generated with Claude Code

…ing set

Running scripts/spec-compile.sh against all 54 OpenAPI 3.x specs in the
repo (gitea is Swagger 2.0, skipped) surfaced multiple classes of generator
bugs. Fixed the eight that move the most specs from FAIL → PASS:

1. `r#self` panic
   `self`, `super`, `crate`, `Self` cannot be raw identifiers in Rust —
   proc_macro2 panics outright. Spec fields named `self` (datadog-v2,
   github, microsoft-graph, snyk, …) hit this. Fix: rename to
   `<keyword>_field` / `<keyword>_param` instead of `r#<keyword>`.

2. operationId collisions reject whole documents
   T6's strict-error policy was correct per spec but real-world docs
   (arcade, cal-com, telnyx, val-town, …) often violate it. Fix:
   auto-disambiguate by suffixing with HTTP method (`opId_post`,
   `opId_put`), and a counter on further collisions, with a stderr
   warning. Spec validity is recoverable; whole-document rejection is not.

3. Extensions reject non-`x-*` keys
   Real specs sprinkle non-`x-` fields in places they don't belong
   (`produces`/`in`/`type`/`density`/`title`/`description` were observed).
   Fix: Extensions now accepts any leftover key but exposes
   `non_extension_keys()` so silent drops remain visible — the CLI can
   warn instead of erroring.

4. exclusiveMinimum: bool vs number
   3.0/Swagger used `bool`; 3.1 (JSON Schema 2020-12) uses `number`.
   Fix: model as a `bool | f64` enum.

5. `Vec<serde_json::Value>` Ident panic
   generate_array_item_type split on "::" but produced strings with
   angle brackets that aren't valid idents. Fix: parse via
   `syn::parse_str::<syn::Type>` first.

6. enum variant collisions on signed numbers
   `1` and `-1` both produced `Variant1`. Fix: prefix negatives with
   `Neg` (e.g. `VariantNeg1`).

7. Twilio-style filter param ident collisions
   `StartTime`, `StartTime<`, `StartTime>` all snake-cased to
   `start_time`. Fix: map `<`, `>`, `<=`, `>=` to `_lt`/`_gt`/`_lte`/
   `_gte` in sanitize_param_name. Twilio went from CHECK-FAIL to PASS.

8. Version gate didn't run in TOML config flow
   The `generate` subcommand in src/bin/openapi-to-rust.rs has its own
   pipeline that bypasses cli::run_generation_cli. Mirrored the version
   check so Swagger 2.0 specs (gitea) error early with a clear hint
   instead of failing later inside the deserializer.
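
The operationId policy from item 2 can be sketched as follows. This is an illustration under assumptions: `disambiguate_operation_ids` is a hypothetical name, the first occurrence is assumed to keep its original id, and the counter format (`_2`, `_3`, ...) is a guess, since the message above only says "a counter on further collisions".

```rust
use std::collections::HashSet;

/// Hypothetical helper: return a unique operation name for each
/// (operationId, HTTP method) pair, in input order.
fn disambiguate_operation_ids(ops: &[(&str, &str)]) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut out = Vec::with_capacity(ops.len());
    for (op_id, method) in ops {
        let mut candidate = op_id.to_string();
        if seen.contains(&candidate) {
            // First fallback: suffix with the lowercased HTTP method.
            let with_method = format!("{op_id}_{}", method.to_lowercase());
            candidate = with_method.clone();
            // Second fallback: append a counter until the name is free.
            let mut n = 2;
            while seen.contains(&candidate) {
                candidate = format!("{with_method}_{n}");
                n += 1;
            }
            // Warn on stderr instead of rejecting the whole document.
            eprintln!("warning: duplicate operationId `{op_id}`, using `{candidate}`");
        }
        seen.insert(candidate.clone());
        out.push(candidate);
    }
    out
}
```

Given `getUser` declared on GET, then twice on POST, this sketch yields `getUser`, `getUser_post` and `getUser_post_2`, with a warning for each renamed operation.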

scripts/spec-compile.sh
- Auto-discovers specs/*.{yaml,json}.
- Skips Swagger 2.0 with a SKIP marker (gitea).
- Optional SPEC_COMPILE_PARSE_ONLY=1 for quick generator-only checks.
- Optional SPEC_COMPILE_LIMIT=N / positional whitelist of names.

ci(spec-compile)
The job now compiles a "gold list" of 20 specs that pass cleanly:
anthropic, asana, browserbase, cartesia, cerebras, coda, coingecko,
digitalocean, groq, imagekit, launchdarkly, meta-llama, openai, resend,
runway, spotify, terminal-shop, twilio, val-town, writer. Local
`scripts/spec-compile.sh` (no args) still runs the full corpus. The
remaining 34 specs surface other generator bugs (E0308 type mismatches,
E0428 name collisions in github, E0117 orphan rule violations in
stripe, E0072 recursive type sizing in snyk) — tracked in #14 as
follow-ups.

All 205 unit tests still pass; clippy + fmt clean.

Refs #14