feat(generator): Q2 typed-scalar formats with opt-out config #20
Merged
… (Q2.0)
Introduce src/type_mapping.rs as the single chokepoint for every (openapi_type, format) → Rust-type decision. Pre-refactor the same logic lived in two places (openapi_type_to_rust_type and get_number_rust_type) plus inline "String".to_string() literals in the Typed/TypedMulti arm. Adding format-aware mappings (chrono, uuid, url, …) without a chokepoint would mean touching every site for every format; with TypeMapper each future Q2.* issue edits one method.
Wiring:
- TypeMapper holds TypeMappingConfig + UsedFeatures; defaults preserve pre-refactor behavior bit-for-bit.
- GeneratorConfig.types carries the config; ConfigFile parses [generator.types] from TOML and threads it through into_generator_config().
- SchemaAnalyzer gains a type_mapper field; new() defaults it, with_type_mapper() takes a caller-built mapper. The TOML-config path in src/bin/openapi-to-rust.rs uses with_type_mapper so user config drives type generation.
- openapi_type_to_rust_type, get_number_rust_type, and the Typed/TypedMulti arm at analysis.rs:1151 now delegate to TypeMapper.
Verification:
- 18 lib unit tests pass (incl. 5 new TypeMapper tests).
- Full integration suite: zero snapshot diffs.
- scripts/spec-compile.sh: 54 passed, 0 failed, 1 skipped (gitea, baseline).
Closes openapi-generator-r36 (Q2.0). Unblocks Q2 (quq) and Q2.1–Q2.8.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`format: date-time` / `uuid` / `uri` / `binary` / `byte` / `ipv4` / `ipv6` on a `string` property now produces typed Rust scalars by default instead of bare `String`. Opt out per format in `[generator.types]` or globally via `--types-conservative`.
## Defaults
| format      | rust type                                 | crate        |
|-------------|-------------------------------------------|--------------|
| date-time   | chrono::DateTime<Utc>                     | chrono+serde |
| date        | chrono::NaiveDate                         | chrono+serde |
| time        | chrono::NaiveTime                         | chrono+serde |
| uuid        | uuid::Uuid                                | uuid+serde   |
| uri / url   | url::Url                                  | url+serde    |
| ipv4 / ipv6 | std::net::Ipv*Addr                        | std (no dep) |
| binary      | bytes::Bytes                              | bytes+serde  |
| byte        | Vec<u8> + #[serde(with = "base64_serde")] | base64       |
`email` and `duration` stay as String for now (less universal / needs ISO 8601 codec; both follow-ups).
## Wiring
- `TypeMappingConfig` switched from `Option<String>` placeholders to proper `DateStrategy`/`UuidStrategy`/`ByteStrategy`/etc enums; each defaults to its typed strategy.
- `TypeMapper.string_format()` dispatches on the normalized format and records used crates in `UsedFeatures` (consumed by Q2.8 later).
- `SchemaType::Primitive` gained a `serde_with: Option<String>` field carrying the codec hint; threaded from `MappedType` through analysis to the generator's field-attr emission.
- `analyze_property_schema_with_context`'s String non-enum arm now routes through `TypeMapper` (Q2.0 only got the top-level Typed arm).
- `SchemaAnalysis.used_type_features` snapshots the mapper's used crates after analysis; the generator emits `mod base64_serde` only when `format: byte` was actually referenced.
- `base64_serde` includes an `option` submodule so nullable `Option<Vec<u8>>` fields use `with = "base64_serde::option"` — serde dispatches on field type and the base codec only handles `Vec<u8>`.
- `type_lacks_default()` extended for chrono / url / time / iso8601 / email_address types so `#[serde(default)]` is suppressed where the scalar has no `Default` impl.
- `type_name_to_variant_name` + `generate_union_enum` handle qualified / generic Rust paths in primitive oneOf variants (`bytes::Bytes`, `chrono::DateTime<chrono::Utc>`, …) — without these, oneOf variants like `Vec<u8>+VideoReferenceInputParam` produced `BytesBytes(BytesBytes)` and refused to compile.
- `generate_type_alias` and `generate_field_type` now use a single `parse_rust_type()` helper backed by `syn::parse_str` instead of ad-hoc `::`-splitting that choked on generics.
## CLI
- `openapi-to-rust generate --types-conservative` — overrides `[generator.types]` to set every format back to "string", useful for bisecting regressions caused by typed-scalar adoption.
## Verification
- 33 lib + integration unit tests (10 new typed-scalar end-to-end + 7 new TypeMapper tests).
- spec-compile gate: 54 passed, 0 failed, 1 skipped (gitea, baseline). `--types-conservative` not directly gated yet — the conservative mapper is exercised by the dedicated unit/integration tests.
- Bumps test-rig Cargo.toml templates (spec-compile, test_helpers, fixture_tests, multi_response_client_test) with the new optional deps so the gates exercise the typed-scalar path.
Closes openapi-generator-quq (Q2).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
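
The format dispatch described in this commit could look roughly like the sketch below. It is a std-only illustration, not the code from `src/type_mapping.rs`: the names `TypeMapper`, `MappedType`, `TypeFeature`, and `string_format` come from the commit message, but the field layout, the single `conservative` flag (standing in for the per-format strategy enums), and the constructor are assumptions made for the example.

```rust
use std::collections::BTreeSet;

/// Optional crates the generated code may end up referencing.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TypeFeature {
    Chrono,
    Uuid,
    Url,
    Bytes,
    Base64,
}

/// Result of mapping one string format (field names are hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MappedType {
    pub rust_type: String,
    /// Codec hint for `#[serde(with = "...")]`, if the type needs one.
    pub serde_with: Option<String>,
}

pub struct TypeMapper {
    /// Assumption: one flag stands in for the per-format strategy enums.
    pub conservative: bool,
    /// Features recorded while mapping, in deterministic order.
    pub used: BTreeSet<TypeFeature>,
}

impl TypeMapper {
    pub fn new(conservative: bool) -> Self {
        Self { conservative, used: BTreeSet::new() }
    }

    /// Map a `string` schema's `format` to a Rust type and record the crate it needs.
    pub fn string_format(&mut self, format: Option<&str>) -> MappedType {
        let plain = |ty: &str| MappedType { rust_type: ty.to_string(), serde_with: None };
        if self.conservative {
            return plain("String");
        }
        let (ty, feature, with) = match format {
            Some("date-time") => ("chrono::DateTime<chrono::Utc>", Some(TypeFeature::Chrono), None),
            Some("date") => ("chrono::NaiveDate", Some(TypeFeature::Chrono), None),
            Some("time") => ("chrono::NaiveTime", Some(TypeFeature::Chrono), None),
            Some("uuid") => ("uuid::Uuid", Some(TypeFeature::Uuid), None),
            Some("uri") | Some("url") => ("url::Url", Some(TypeFeature::Url), None),
            Some("ipv4") => ("std::net::Ipv4Addr", None, None),
            Some("ipv6") => ("std::net::Ipv6Addr", None, None),
            Some("binary") => ("bytes::Bytes", Some(TypeFeature::Bytes), None),
            Some("byte") => ("Vec<u8>", Some(TypeFeature::Base64), Some("base64_serde")),
            // `email`, `duration`, and unknown formats stay as plain strings.
            _ => return plain("String"),
        };
        if let Some(f) = feature {
            self.used.insert(f);
        }
        MappedType { rust_type: ty.to_string(), serde_with: with.map(str::to_string) }
    }
}
```

Keeping the used-feature set inside the mapper means any later consumer (such as the dependency advisory) only has to read one collection after analysis finishes.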
Q2 turned typed-scalar formats on by default, which means generated code now references chrono/uuid/url/bytes/base64 even though the generator doesn't own the consuming crate's Cargo.toml. Without an advisory, users hit "use of unresolved module `chrono`" on first build with no clear pointer to the fix.
This change surfaces required deps via three mechanisms:
1. `GenerationResult.required_deps: Vec<DepRequirement>` — programmatic access for library consumers.
2. `<output_dir>/REQUIRED_DEPS.toml` — copy-pasteable file with a `[dependencies]` block, written by `write_files()` only when the generated code references at least one optional crate.
3. CLI `openapi-to-rust generate` prints the same summary to stderr and ends with the file path so the artifact is discoverable.
## Wiring
- `TypeFeature::dep_requirement()` — canonical (crate, version, features) per feature; single source of truth so the spec-compile gate, test harnesses, and end-user advisory can't drift.
- `DepRequirement::to_toml_line()` — picks the most compact valid `[dependencies]` form (string version when no features, inline table when features are needed).
- `collect_dep_requirements()` snapshots `UsedFeatures` as a sorted, de-duplicated list — output is deterministic for diffs.
- `render_required_deps_toml()` returns `None` when input is empty so callers can skip writing the file (no clutter for pure-string specs).
## Verification
- 5 new unit tests (dep_requirement rendering, sorted/deduped collection, empty-vs-populated render).
- 4 new end-to-end tests (required_deps populated from real analysis, REQUIRED_DEPS.toml written/skipped correctly).
- Smoke test against anthropic spec: stderr advisory + on-disk file both produced as expected (chrono + base64).
- Full integration suite passes (28 lib + 14 typed-scalar tests).
- spec-compile gate: 54 passed, 1 skipped (gitea, baseline).
Closes openapi-generator-fbn (Q2.8).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
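
As a rough illustration of the rendering rules listed under Wiring, the sketch below shows one way the compact-form choice, the sorted and de-duplicated collection, and the `None`-on-empty render could fit together. The function names follow the commit message; the struct fields, the `Vec` input to `collect_dep_requirements` (the real one reads `UsedFeatures`), and the version strings in the usage note are assumptions for the example only.

```rust
/// One optional crate the generated code needs (field names are hypothetical).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct DepRequirement {
    pub krate: &'static str,
    pub version: &'static str,
    pub features: Vec<&'static str>,
}

impl DepRequirement {
    /// Pick the most compact valid `[dependencies]` form:
    /// a bare version string when no features are needed, an inline table otherwise.
    pub fn to_toml_line(&self) -> String {
        if self.features.is_empty() {
            format!("{} = \"{}\"", self.krate, self.version)
        } else {
            let feats: Vec<String> = self.features.iter().map(|f| format!("\"{f}\"")).collect();
            format!(
                "{} = {{ version = \"{}\", features = [{}] }}",
                self.krate,
                self.version,
                feats.join(", ")
            )
        }
    }
}

/// Sort and de-duplicate so the output is stable across runs.
pub fn collect_dep_requirements(mut deps: Vec<DepRequirement>) -> Vec<DepRequirement> {
    deps.sort();
    deps.dedup();
    deps
}

/// Return `None` for an empty list so the caller can skip writing the file.
pub fn render_required_deps_toml(deps: &[DepRequirement]) -> Option<String> {
    if deps.is_empty() {
        return None;
    }
    let mut out = String::from("[dependencies]\n");
    for dep in deps {
        out.push_str(&dep.to_toml_line());
        out.push('\n');
    }
    Some(out)
}
```

With illustrative inputs such as `chrono` at `"0.4"` with the `serde` feature and `base64` at `"0.22"` with no features, this renders a `[dependencies]` block with `base64` first and `chrono` second, regardless of the order in which the features were recorded.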
Pre-Q2.7 the `oneOf` and `anyOf` paths diverged on primitive
variants. Same input, different output:
  oneOf: [string, integer] → pub enum X { String(String), Integer(i64) }
  anyOf: [string, integer] → pub type XString = String;
                             pub type XIntegerVariant1 = i64;
                             pub enum X { XString(XString), XIntegerVariant1(XIntegerVariant1) }
Both are #[serde(untagged)] so they round-trip the same JSON, but
the anyOf shape leaked synthetic type aliases into the generated
module and gave callers worse-named variants. The original Q2.7
bead description claimed primitive unions fell back to
`serde_json::Value`; that was stale — primitives have always
become Union variants. The real gap was alias bloat on the anyOf
path.
This change makes `analyze_anyof_union`'s primitive branch mirror
`analyze_untagged_oneof_union`: route the variant schema through
TypeMapper, push the resulting Rust type directly as the variant
target. The generator's `generate_union_enum` already knew how
to render bare primitive types as variants (the `bool|i32|String`
match at line 1319) so no generator-side change was needed.
Toggle:
[generator.types.shape]
primitive_unions = false # restore pre-Q2.7 alias shape
Default `true`. The opt-out exists for users with snapshot
checks that depend on the aliased variant names.
Verification:
- 6 new tests in tests/primitive_unions_test.rs covering oneOf,
anyOf (default + opt-out), 3-variant unions, and explicit-null
filtering.
- 7 existing snapshots updated to reflect the cleaner shape:
content_union_structured, discriminator_array_standalone,
inline_variant_naming, multi_array_variants, nested_union_array,
property_underscore_types, union_array_naming.
- Full integration suite passes.
- spec-compile gate: 54 passed, 1 skipped (gitea, baseline).
Closes openapi-generator-j6n (Q2.7).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Q2.2, Q2.3)
Three small Q2 follow-ups, all default-on, opt-out per-feature.
## Q2.1 — uint32/uint64 → u32/u64
`format: uint32` / `uint64` now map to `u32` / `u64` instead of
degrading to `i64`. ~288 usages across the spec corpus. `[generator.types]
unsigned = false` reverts to pre-Q2.1 i64 fallback.
## Q2.2 — built-in format aliases
Vendor-specific format names normalize to canonical ones before
the standard format dispatch. Built-in aliases:
uuid4, uuid_v4, UUID → uuid
unix-time, unix_time, unixtime, timestamp → int64
User-supplied [generator.types.format_aliases] entries win on
collision so users can override built-ins (e.g. force `uuid4`
back to plain string).
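
The precedence rule above (user aliases first, then built-ins, then the format unchanged) can be sketched as follows. The built-in alias list is taken from this commit message; the function name `normalize_format` and the `HashMap` representation of `[generator.types.format_aliases]` are assumptions for the example.

```rust
use std::collections::HashMap;

/// Built-in vendor aliases, as listed in the commit message.
const BUILTIN_ALIASES: &[(&str, &str)] = &[
    ("uuid4", "uuid"),
    ("uuid_v4", "uuid"),
    ("UUID", "uuid"),
    ("unix-time", "int64"),
    ("unix_time", "int64"),
    ("unixtime", "int64"),
    ("timestamp", "int64"),
];

/// Resolve a format name to its canonical form.
/// User-supplied entries are consulted first so they win on collision.
pub fn normalize_format<'a>(
    format: &'a str,
    user_aliases: &'a HashMap<String, String>,
) -> &'a str {
    if let Some(target) = user_aliases.get(format) {
        return target.as_str();
    }
    BUILTIN_ALIASES
        .iter()
        .find(|(from, _)| *from == format)
        .map(|(_, to)| *to)
        // Unknown names pass through untouched for the standard dispatch.
        .unwrap_or(format)
}
```

Checking the user map before the built-in table is what lets a single user entry shadow a built-in alias without any extra configuration switch.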
## Q2.3 — typed BTreeMap from additionalProperties: <schema>
Pre-Q2.3 `additionalProperties: <schema>` collapsed to
`BTreeMap<String, serde_json::Value>`, dropping the value-type
information. Now the schema is analyzed and the emitted field is
`BTreeMap<String, T>` where T is the resolved type (including
typed scalars from Q2 — e.g. `additionalProperties: { format:
uuid }` produces `BTreeMap<String, uuid::Uuid>`). Implementation:
- `SchemaType::Object.additional_properties: bool` →
`ObjectAdditionalProperties` enum (Forbidden / Untyped / Typed).
- Generator emits the BTreeMap field with the right value type.
- `[generator.types.shape] additional_properties_typed = false`
reverts to the pre-Q2.3 untyped behavior.
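
A minimal sketch of how the three-state enum could drive the emitted field type is shown below. The variant names come from the commit message; the `String` payload on `Typed`, the helper name `additional_properties_field_type`, and the boolean parameter standing in for `additional_properties_typed` are assumptions for the example.

```rust
/// Three-state replacement for the old `additional_properties: bool`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ObjectAdditionalProperties {
    /// `additionalProperties: false`
    Forbidden,
    /// `additionalProperties: true` (or an empty schema)
    Untyped,
    /// `additionalProperties: <schema>`; payload is the resolved Rust type (hypothetical).
    Typed(String),
}

/// Rust type of the catch-all map field, or `None` when no field is emitted.
pub fn additional_properties_field_type(
    ap: &ObjectAdditionalProperties,
    typed_enabled: bool,
) -> Option<String> {
    let untyped = || "BTreeMap<String, serde_json::Value>".to_string();
    match ap {
        ObjectAdditionalProperties::Forbidden => None,
        ObjectAdditionalProperties::Untyped => Some(untyped()),
        ObjectAdditionalProperties::Typed(value_type) if typed_enabled => {
            Some(format!("BTreeMap<String, {value_type}>"))
        }
        // Opt-out: fall back to the pre-Q2.3 untyped map.
        ObjectAdditionalProperties::Typed(_) => Some(untyped()),
    }
}
```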
## Verification
- 21 lib unit tests (added 6 for Q2.1/Q2.2 alias and unsigned coverage).
- 8 new integration tests in tests/integer_formats_test.rs.
- 7 new integration tests in tests/additional_properties_typed_test.rs.
- 1 snapshot update (nested_inline_objects_test) reflecting the
typed BTreeMap shape from Q2.3.
- spec-compile gate: previously verified 54/54 pass under Q2.1+Q2.2;
Q2.3 changes have no spec-corpus regressions in local checks.
Closes openapi-generator-bw1 (Q2.1), openapi-generator-gub (Q2.2),
openapi-generator-61h (Q2.3).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two doc-comment-emitting features. Both default-on, both feed non-binding human-readable hints to callers without adding any runtime crate dependencies.
## Q2.4 — constraint annotations as doc comments
Pre-Q2.4 the generator parsed minimum/maximum/min_length/max_length/pattern/multiple_of/min_items/max_items/uniqueItems into SchemaDetails but never emitted them. Real specs use these heavily (13k+ uniqueItems and 4k+ min/max occurrences across the corpus); dropping them was a real loss for callers trying to understand the contract.
Now each property with at least one constraint gets a `/// Constraint: <key>=<value>, …` doc comment. Pattern strings are escaped so `///` and `*/` substrings can't terminate the surrounding doc comment.
Toggle: `[generator.types.constraints] mode = "doc"` (default) / `"off"` (suppress entirely).
**No client-side validation** by design. The generator never emits `#[validate(...)]` attributes or pulls in the `validator` crate. OpenAPI constraints belong on the wire contract; the server is the source of truth. Doc comments give callers visibility without the SDK duplicating server logic and going brittle when rules drift. The `no_validate_attribute_is_ever_emitted` test pins this guarantee.
Implementation:
- `PropertyConstraints` struct in analysis.rs captures the relevant SchemaDetails fields per property.
- `PropertyInfo` carries the constraints alongside the schema type.
- Generator emits the doc line via `generate_constraint_doc()` + `format_constraints_doc()` helper.
## Q2.6 — x-enum-varnames / x-enum-descriptions
Common vendor extensions for enum schemas: arrays of Rust-friendly variant identifiers and per-variant descriptions, parallel to the spec's `enum` array. Used by arcade.yaml, datadog-v2.yaml, and others in the corpus.
When `x-enum-varnames` is present and length-matches the enum array, the generator uses those identifiers for variant names instead of the default PascalCase heuristic. Wire format is preserved via `#[serde(rename = "<original-value>")]`. When `x-enum-descriptions` is present, each entry becomes the variant's doc comment.
Length-mismatched extensions are dropped at analysis time with a stderr warning; the generator falls back to the default heuristic.
Toggles: `[generator.types.enums]` `x_enum_varnames` / `x_enum_descriptions` (both default true).
Implementation:
- `EnumExtensions` struct in analysis.rs holds the validated varnames + descriptions.
- `SchemaAnalysis.enum_extensions` side-channel keyed by analyzed-schema name (avoided extending every StringEnum constructor).
- `extract_enum_extensions()` populates after analyze() by reading `original` JSON.
- `generate_string_enum` + `generate_extensible_enum` accept an `Option<&EnumExtensions>` and apply overrides when toggles allow.
## Verification
- 8 new tests in tests/constraint_doc_test.rs (Q2.4).
- 6 new tests in tests/x_enum_varnames_test.rs (Q2.6).
- 1 snapshot updated (union_array_naming) where a real spec field with a `pattern` got its constraint doc surfaced.
- Full integration suite passes; spec-compile gate verification pending in next commit.
Closes openapi-generator-d8y (Q2.4) and openapi-generator-4mu (Q2.6).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
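
The Q2.4 doc-line emission could be sketched as below. Only the `/// Constraint: <key>=<value>, …` shape, the helper name `format_constraints_doc`, and the use of U+200B in the escaping are taken from this PR; the exact escaping scheme, the `escape_for_doc` helper, and the `(key, value)` slice input are assumptions for the example.

```rust
/// Break up `//` and `*/` sequences by inserting a zero-width space (U+200B),
/// so a pattern string cannot close or restart the surrounding comment.
/// Assumption: the real escaping may differ in detail.
pub fn escape_for_doc(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    let mut prev: Option<char> = None;
    for ch in value.chars() {
        // A '/' right after another '/' or a '*' would form `//` or `*/`.
        if ch == '/' && matches!(prev, Some('/') | Some('*')) {
            out.push('\u{200B}');
        }
        out.push(ch);
        prev = Some(ch);
    }
    out
}

/// Render one doc line for a property, or `None` when it has no constraints.
pub fn format_constraints_doc(constraints: &[(&str, String)]) -> Option<String> {
    if constraints.is_empty() {
        return None;
    }
    let parts: Vec<String> = constraints
        .iter()
        .map(|(key, value)| format!("{key}={}", escape_for_doc(value)))
        .collect();
    Some(format!("/// Constraint: {}", parts.join(", ")))
}
```

Because the output is only a comment, nothing here affects serialization or runtime behavior, which matches the no-client-side-validation stance described above.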
- fmt: rustfmt across new/modified files (test helpers, generator
bin, ConstraintMode wiring).
- clippy:
- Convert `impl Default for <Strategy>` blocks to
`#[derive(Default)]` + `#[default]` per-variant for all eight
type-mapping strategy enums and ConstraintMode (clippy
`derivable_impls`).
- Replace literal U+200B chars in `format_constraints_doc`'s
pattern escaping with the `\u{200B}` Rust escape (clippy
`invisible_characters`).
- doc: wrap `Vec<u8>` in backticks in the SchemaType::Primitive
docstring (rustdoc `invalid_html_tags` treated `<u8>` as an
unclosed HTML tag).
- test: add `serde_with: ..` to a SchemaType::Primitive pattern
match in `examples/number_formats.rs`, and add the new `types`
field to the GeneratorConfig literal in `examples/complete_workflow.rs`.
No behavior change. All four gates pass locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Generates better Rust types out of OpenAPI specs by honoring `format` hints and vendor extensions instead of collapsing everything to `String`/`serde_json::Value`. Default-on for common cases, opt-out per feature via a new `[generator.types]` TOML block (or globally via `--types-conservative`).
## What changed
- (ce94ff6) `TypeMapper` chokepoint in `src/type_mapping.rs` — single point for every `(openapi_type, format)` → Rust-type decision. No behavior change on its own.
- (fa17c50) `date-time` → `chrono::DateTime<Utc>`; `uuid` → `uuid::Uuid`; `byte` → `Vec<u8>` + inlined base64 codec; `binary` → `bytes::Bytes`; `ipv4`/`ipv6` → `std::net::Ip*Addr`; `uri` → `url::Url`. Threaded codec hints through `SchemaType::Primitive.serde_with` to the field-emission site. `--types-conservative` CLI flag for bisecting.
- (ddfcf3f) `REQUIRED_DEPS.toml` written next to generated code listing exactly which optional crates the generated module references. Same summary printed to stderr at end of generation so the artifact is discoverable.
- (ed1dda5) `anyOf` of primitives now produces the same clean `#[serde(untagged)] pub enum X { String(String), Integer(i64) }` shape `oneOf` already had — no more synthetic per-variant type aliases.
- (05d555b) `uint32`/`uint64` → `u32`/`u64`; built-in format aliases (`uuid4 → uuid`, `unix-time → int64`); `additionalProperties: <schema>` → `BTreeMap<String, T>` (typed map instead of `serde_json::Value` map).
- (19493d0) Constraints (`minimum`/`pattern`/etc.) emitted as `/// Constraint: …` doc comments. `x-enum-varnames` / `x-enum-descriptions` vendor extensions honored for nicer enum variant names + per-variant docs.
## Defaults
Everything is on. Opt out per format via the relevant strategy in `[generator.types]`, or pass `--types-conservative` on the CLI to collapse the entire surface back to pre-Q2 behavior. Email is the one format that stays off by default (`email_address` crate is more opinionated than the wire usually warrants).
**No client-side validation.** OpenAPI constraints surface as doc comments only — no `validator` crate, no `#[validate(...)]`. The server is the source of truth; client SDKs stay thin.
## Generator config schema
## Test plan
- `src/type_mapping.rs` (TypeMapper config, dep requirements, format aliases, conservative mode).
- `tests/typed_scalars_test.rs` (date-time, uuid, byte+codec, REQUIRED_DEPS write/skip).
- `tests/primitive_unions_test.rs` (oneOf/anyOf parity).
- `tests/integer_formats_test.rs` (uint32/64, alias normalization).
- `tests/additional_properties_typed_test.rs` (typed BTreeMap, opt-out, false/true/schema).
- `tests/constraint_doc_test.rs` (doc emission, pattern escaping, no-validate guarantee).
- `tests/x_enum_varnames_test.rs` (varname override, descriptions, length-mismatch fallback).
- `scripts/spec-compile.sh`: 54/54 specs compile cleanly under the new defaults (1 skipped: gitea, baseline).
- `REQUIRED_DEPS.toml` confirmed end-to-end.
## Deferred follow-ups
- `format: duration` → `chrono::Duration` needs a custom ISO 8601 codec; currently stays as `String`.
- `uniqueItems: true` → `BTreeSet<T>` opt-in is its own bead (Q2.5, P3) — kept off by default because of API-shape churn.
🤖 Generated with Claude Code