Process inline RBS comments natively without the require-hook rewriter#2639
Draft
paracycle wants to merge 10 commits into
Tapioca used to discover method signatures purely from Sorbet's runtime
reflection. To support inline `#:` / `# @...` RBS comments, it shipped a
require-hook (`lib/tapioca/rbs/rewriter.rb`) that, at every load, rewrote
sources into `sig {}` blocks so `sorbet-runtime` would track them. That
detour required `require-hooks`, made boot slower, and forced a separate
bootsnap cache to be remotely usable on large apps.
Read RBS straight from source instead. The gem pipeline always builds a
`Rubydex::Graph` (with core/stdlib RBS seeded for constant resolution),
and the listeners that needed a runtime signature now also accept inline
RBS comments. A matching path on the DSL side picks up `#:` sigs for
arbitrary host-app methods so DSL compilers see the same signatures
they used to see through the rewriter.
Highlights:
- New `Tapioca::RBS::Comments`, `Tapioca::RBS::TypeQualifier`, and
`Tapioca::RBS::DslSignatures` modules handle parsing, fully-qualified
type rendering, and DSL-side lookup.
- `Gem::Pipeline` exposes `gem_graph`, `rbs_comments_for_constant`,
and `rbs_comments_for_method`. Listeners (`SorbetSignatures`,
`SorbetHelpers`, `SorbetRequiredAncestors`, `SorbetTypeVariables`)
surface `#: ...`, `# @abstract`, `# @requires_ancestor:`, and
`#: [A, B]` from source.
- `Dsl::Compiler#compile_method_*_to_rbi` and
`ActiveModelTypeHelper.type_for` fall back to `DslSignatures.build`
when no Sorbet runtime sig exists.
- All `T::` and `T.*` constants are emitted fully qualified (`::T::Array`,
`::T.proc`, ...). User-defined constants are resolved through Rubydex
so relative references like `Bar` inside `Foo::Bar` become
`::Foo::Bar`. Lexical nesting for anonymous classes is recovered from
source via a small Prism visitor.
- Removes `lib/tapioca/rbs/rewriter.rb`, the `require-hooks` dependency,
the bootsnap shim, `dsl --only-bootsnap-rbs-cache`, and the
`TAPIOCA_RBS_CACHE` README section.
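To make "read RBS straight from source" concrete, here is a toy sketch in plain Ruby. It is not Tapioca's `Tapioca::RBS::Comments` parser and does not use Rubydex; the `inline_rbs_sigs` helper and the `Greeter` fixture are hypothetical names used only for illustration. It collects the `#:` lines that sit directly above a `def`, with no `require` hook involved.

```ruby
# Toy sketch (hypothetical helper, not Tapioca's parser): collect the `#:`
# comment lines that sit directly above each `def` in a source string.
def inline_rbs_sigs(source)
  sigs = {}
  pending = []

  source.each_line do |line|
    stripped = line.strip
    if stripped.start_with?("#:")
      # Remember the signature text until we reach the method it annotates.
      pending << stripped.delete_prefix("#:").strip
    elsif (match = stripped.match(/\Adef\s+(?:self\.)?([A-Za-z_]\w*[?!=]?)/))
      sigs[match[1].to_sym] = pending unless pending.empty?
      pending = []
    else
      # Any other line breaks the comment-to-method association.
      pending = []
    end
  end

  sigs
end

# Quoted heredoc so the fixture's `#{name}` is not interpolated here.
SOURCE = <<~'RUBY'
  class Greeter
    #: (String name) -> String
    def greet(name)
      "Hello, #{name}"
    end

    def untyped_helper; end
  end
RUBY

p inline_rbs_sigs(SOURCE)
```

The point of the sketch is only that signatures can be recovered from source text alone, which is what lets the rewriter and its cache go away.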
The DSL pipeline memoizes its per-process Rubydex graph and snapshots the list of `$LOADED_FEATURES` paths it was indexed against. Test suites that don't fork between tests (e.g. `DslSpec`) end up sharing one graph across tests, but each test `require`s its own freshly-written fixture file under a different `tmp_path/lib/...`. The cached graph never picks those up, so `DslSignatures.build` returns nil for any method defined in the new file and the DSL compiler falls back to `T.untyped`.
Track which paths the graph has already indexed and, on each `graph` call, incrementally index whatever showed up in `$LOADED_FEATURES` since last time. Rubydex's `Graph#index_all` + `Graph#resolve` is incremental, so this is cheap: the no-new-files path is one set diff and an early return.
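The "set diff and early return" idea can be sketched as follows. `FakeGraph` and `IncrementalGraph` are hypothetical stand-ins: the method names `index_all` and `resolve` come from the description above, but their real Rubydex signatures are assumed here, not confirmed.

```ruby
require "set"

# Hypothetical stand-in for the real graph; it only records what it was
# asked to index and how often it was resolved.
class FakeGraph
  attr_reader :indexed, :resolve_calls

  def initialize
    @indexed = []
    @resolve_calls = 0
  end

  def index_all(paths)
    @indexed.concat(paths)
  end

  def resolve
    @resolve_calls += 1
  end
end

# Memoizes one graph and indexes only paths it has not seen before.
class IncrementalGraph
  def initialize(graph, features: -> { $LOADED_FEATURES })
    @graph = graph
    @features = features
    @indexed_paths = Set.new
  end

  def graph
    new_paths = @features.call.reject { |path| @indexed_paths.include?(path) }
    return @graph if new_paths.empty? # nothing new: skip indexing entirely

    @graph.index_all(new_paths)
    @graph.resolve
    @indexed_paths.merge(new_paths)
    @graph
  end
end
```

Injecting the feature list as a lambda is a choice made for this sketch so it can be exercised without actually requiring files.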
Two related changes that simplify and tighten DSL-side RBS resolution
now that Rubydex (on the `expose-definition-lexical-nesting` branch)
ships `Definition#lexical_owner` and `Definition#lexical_nesting`:
- `Tapioca::RBS::DslSignatures.nesting_for` reads the lexical nesting
straight off the matching `Rubydex::Definition` instead of parsing
the source again with Prism. The transformation into the shape
`Graph#resolve_constant` wants — short names, outermost first, with
`::Foo` markers for compound or absolute openings — is done in one
place and covers plain nesting, `class Foo::Bar` compound paths,
and `module ::Bar` absolute paths uniformly. The Prism-based
`NestingVisitor` is gone.
- `Static::SymbolLoader.graph_from_paths` now also accepts the gem's
`.rbi` stub files (collected from `rbi/` in the gem directory) and
feeds them in through `Rubydex::Graph#index_source`. RBI is plain
Ruby, so the indexer just needs to see the content under a `.rb`
URI. This recovers constants that only exist in a gem's
native-code shim (e.g. `Rubydex::ConstantReference`), which used
to resolve through runtime reflection under the old rewriter
path and would otherwise produce unresolvable references in the
generated RBI.
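As a rough illustration of why the lexical nesting matters for qualification, the sketch below resolves a relative constant name by walking the nesting from the innermost scope outwards. The `qualify` helper, its arguments, and the flat list of known names are all assumptions made for this example; it is not the shape `Graph#resolve_constant` takes, and it ignores ancestor lookup.

```ruby
# Toy resolver (hypothetical): `known` is a list of fully qualified constant
# names, `nesting` is the lexical nesting, outermost first.
def qualify(name, nesting, known)
  return name if name.start_with?("::") # already absolute

  # Try the innermost lexical scope first, then walk outwards.
  nesting.reverse_each do |scope|
    candidate = "#{scope}::#{name}"
    return "::#{candidate}" if known.include?(candidate)
  end

  # Fall back to the top level, or leave the name untouched if unknown.
  known.include?(name) ? "::#{name}" : name
end

KNOWN = ["Foo", "Foo::Bar", "Foo::Bar::Baz", "String"].freeze

puts qualify("Bar", ["Foo", "Foo::Bar"], KNOWN)
puts qualify("String", ["Foo"], KNOWN)
```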
Also picks up the new `Definition#lexical_owner`/`lexical_nesting`
RBI surface in `sorbet/rbi/gems/rubydex@*.rbi`, points the Gemfile at
the in-flight Rubydex branch, fixes a few Sorbet errors my earlier
commits introduced (`added_any` typing, nilable `Definition#declaration`,
`RBI::Extend#names` vs `name`, the `Module` upper bound on
`Dsl::Compiler::ConstantType`), and switches the `T.must` cast in
`DslSignatures.graph` to an `#: as !nil` inline RBS so we stop using
`T.xxx` calls in the new code paths.
Shopify/rubydex#832 has landed on main, so the dedicated branch is gone and the lexical-nesting API ships from main going forward. Also re-runs `tapioca gem rubydex` to refresh the gem RBI against the post-merge commit.
The inline RBS `#: [ConstantType < Module[top]]` annotation on
`Tapioca::Dsl::Compiler` is the source of truth for the class's
generic shape; the explicit `extend T::Generic` / `ConstantType =
type_member` lines I previously added redundantly restate the
same thing in a different idiom.
DSL compiler subclasses that need a refined `ConstantType` either
keep using the inline `#: [ConstantType = ...]` form (no runtime
`type_member` needed, since `constant` is a plain instance variable)
or, if they prefer the explicit runtime form, declare `extend
T::Generic` and `ConstantType = type_member { { fixed: ... } }`
themselves. Both styles work.
Updates the `compiler_spec.rb` fixtures to use the inline RBS form
(they relied on the parent being generic at runtime, which is
no longer the case) and adds the same Rubydex pin to
`MockProject#tapioca_gemfile` so subprocess test runs pick up the
`Definition#lexical_owner`/`lexical_nesting` API.
`Runtime::Reflection.signature_of` used to leak Sorbet's raw
`T::Private::Methods::Signature` to every caller. That meant
`SorbetSignatures#compile_signature`, `Dsl::Compiler`,
`ActiveModelTypeHelper`, and `GraphqlTypeHelper` all reached into
the same set of internals (`arg_types` / `kwarg_types` /
`rest_type` / `block_name` / `mode` / `owner` / `method_name`)
and reimplemented the same "build a positional type list",
"sanitize the return type", "is this signature final?" logic in
slightly different shapes.
Introduces `Tapioca::Runtime::Signature` as a small abstract type
with one initial concrete impl, `SorbetSignature`, that wraps the
old object. The interface is deliberately high-level and exposes
only what callers actually need:
- `method`: the canonical `UnboundMethod` the sig is attached
to.
- `parameter_type_strings`: positional type strings,
post-sanitization. Encapsulates the arg/kwarg/rest/keyrest/block
plumbing that used to live in both `compile_signature` and
`Dsl::Compiler#parameters_types_from_signature`.
- `return_type_string`: sanitized return type.
- `valid_return_type_string`: same, but `nil` when the type
string is meaningless (`void`, `T.untyped`, `T.noreturn`,
`<NOT-TYPED>`, ...).
- `valid_first_arg_type_string`: first positional argument's
sanitized type, or `nil` when meaningless. Replaces the only
surviving `arg_types.dig(0, 1)` consumer.
- `compile_to_rbi_sig(parameters, &push_symbol)`: emits an
`RBI::Sig`. The body of the old `SorbetSignatures#compile_signature`
plus the final-method lookup lifted onto the type itself.
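A stripped-down sketch of the abstract-type-plus-wrapper shape described above. The class bodies are hypothetical: `FakeSorbetSignature` takes ready-made strings instead of wrapping a Sorbet runtime object, the meaningless-type list contains only the entries named in the list above, and `compile_to_rbi_sig` is left out.

```ruby
# Hypothetical, simplified shapes; the real classes wrap Sorbet's runtime
# signature object and sanitize types before returning strings.
class Signature
  # Only the entries named in the description; the real list may be longer.
  MEANINGLESS_TYPE_STRINGS = ["void", "T.untyped", "T.noreturn", "<NOT-TYPED>"].freeze

  def parameter_type_strings
    raise NotImplementedError, "#{self.class} must implement parameter_type_strings"
  end

  def return_type_string
    raise NotImplementedError, "#{self.class} must implement return_type_string"
  end

  # Return type, or nil when the type string carries no information.
  def valid_return_type_string
    meaningful(return_type_string)
  end

  # First parameter type, or nil when absent or meaningless.
  def valid_first_arg_type_string
    meaningful(parameter_type_strings.first)
  end

  private

  def meaningful(type)
    return nil if type.nil? || MEANINGLESS_TYPE_STRINGS.include?(type)

    type
  end
end

# Stand-in for a concrete implementation; takes pre-rendered strings.
class FakeSorbetSignature < Signature
  def initialize(params:, returns:)
    @params = params
    @returns = returns
  end

  def parameter_type_strings
    @params
  end

  def return_type_string
    @returns
  end
end

sig = FakeSorbetSignature.new(params: ["::String"], returns: "void")
p sig.valid_return_type_string
p sig.valid_first_arg_type_string
```

Callers only see the string-level interface, so a second backend can be added without touching them.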
Caller migrations:
- `SorbetSignatures#on_method` collapses into a single
`signature.compile_to_rbi_sig(event.parameters) { |sym| @pipeline.push_symbol(sym) }`
call. `compile_signature` and `signature_final?` are gone.
- `Methods#compile_method`'s writer-method detection no longer
inspects `signature.arg_types.size`; it was a redundant cross-check
against `method.parameters.size`, which we already test.
- `Dsl::Compiler#parameters_types_from_signature` keeps the same
public shape and delegates to `signature.parameter_type_strings`.
`compile_method_return_type_to_rbi` delegates to
`signature.return_type_string`.
- `ActiveModelTypeHelper#lookup_return_type_of_method` becomes
`signature.valid_return_type_string`. `lookup_arg_type_of_method`
becomes `signature.valid_first_arg_type_string`. The
`MEANINGLESS_TYPES` / `MEANINGLESS_TYPE_STRINGS` filtering
moves onto `Signature` (`MEANINGLESS_TYPE_STRINGS` is now a
shared constant; the runtime-type sentinels stay private to
`SorbetSignature`).
- `GraphqlTypeHelper` swaps `signature&.return_type` +
`valid_return_type?` checks for `signature&.valid_return_type_string`.
The Scalar branch's `T::Utils.unwrap_nilable` becomes
`RBIHelper.as_non_nilable_type` on the resulting string, which is
the same transformation but expressed at string level.
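To show what "the same transformation but expressed at string level" can look like, here is a toy helper. It is not the implementation of `RBIHelper.as_non_nilable_type`; the name `non_nilable_type_string` and the regular expression are assumptions for illustration only.

```ruby
# Toy string-level unwrapping of a nilable type (hypothetical helper).
# Strips one outer `T.nilable(...)` / `::T.nilable(...)` wrapper if present.
def non_nilable_type_string(type)
  match = type.match(/\A(?:::)?T\.nilable\((.+)\)\z/m)
  match ? match[1] : type
end

puts non_nilable_type_string("T.nilable(::String)")
puts non_nilable_type_string("::Integer")
```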
Types tightened: `MethodNodeAdded#signature` and `Pipeline#push_method`'s
`signature` parameter both move from `untyped` to
`Tapioca::Runtime::Signature?`.
No behavioural change for downstream callers; this is purely a
refactor that prepares the ground for an `RbsSignature`
implementation to land in a follow-up.
The DSL pipeline used to translate inline `#: ...` comments into a
bare `RBI::Sig` via `Tapioca::RBS::DslSignatures.build` and then have
every consumer (`Dsl::Compiler#rbs_*`, `ActiveModelTypeHelper`'s
fallback branches) reach into the resulting `RBI::Sig` directly —
`params.first.type.to_s`, `return_type.to_s`, the meaningless-type
filter, etc. — duplicating the same surface that `SorbetSignature`
already encapsulates.
Wrap the parsed sig in a new `Tapioca::Runtime::RbsSignature`
subclass of `Tapioca::Runtime::Signature`. It carries the original
method, the qualified `RBI::Sig`, and the RBS method-level
annotations (`# @abstract`, `# @override`,
`# @without_runtime`, ...). The interface is the same one
`SorbetSignature` already exposes:
- `method`
- `parameter_type_strings`
- `return_type_string` / `valid_return_type_string`
- `valid_first_arg_type_string`
- `compile_to_rbi_sig(parameters) { |sym| ... }`
`compile_to_rbi_sig` is where the RBS-specific bits — annotation
application, the `method_added` / `singleton_method_added`
`without_runtime` rule — finally live in one place instead of
being inlined into each consumer.
`# @without_runtime` is back to driving `sig.without_runtime = true`
on the emitted RBI sig (rather than dropping the sig entirely, which
was a holdover from the rewriter days that didn't make sense for
static RBI generation). The spec that previously asserted the
without-runtime method had no sig is updated to expect the
`T::Sig::WithoutRuntime.sig` form Sorbet's static checker wants.
Caller migrations:
- `Tapioca::RBS::DslSignatures.build` returns `RbsSignature?` and
folds annotation harvesting into the construction.
- `Dsl::Compiler#rbs_parameter_types_for` and
`Dsl::Compiler#rbs_return_type_for` delegate to
`signature.parameter_type_strings` / `return_type_string`.
- `ActiveModelTypeHelper#lookup_return_type_of_method` and
`lookup_arg_type_of_method` collapse from two branches into one
`signature&.valid_…_string` call sourced from a single
`lookup_signature_of_method` that picks Sorbet sig first and
RBS sig as fallback.
- `Signature#method` widens to `(Method | UnboundMethod)` to
accommodate the DSL-side `obj.method(:foo)` call sites; the gem
pipeline narrows back to `UnboundMethod` at its call site via
`Method#unbind`.
The gem-pipeline-side `MethodNodeAdded#rbs_lookup` /
`Pipeline::RBSMethodLookup` / `SorbetSignatures#compile_rbs_lookup`
path stays as-is for now — that's the next commit. This commit
just lays the polymorphic groundwork so the DSL side already runs
through it.
The pre-existing `T.must` typecheck error in
`Dsl::Compiler#compile_method_parameters_to_rbi` is also gone as a
side effect: `parameters_types_from_signature` now returns a
concrete `Array[String]`, so `method_types[index]` is `String?`
(not `T.untyped`) and the `T.must` is no longer redundant.
Both paths (the gem pipeline and the DSL pipeline) now build an `RbsSignature` via the shared `Tapioca::RBS::SignatureBuilder`: parse the `#:` comments, translate to `RBI::Sig`, qualify every constant against a Rubydex graph for the declaration's lexical scope. They differ only in which graph they pass in: workspace vs. gem.
`Reflection.signature_of` now takes an optional block as the RBS lookup override: callers that need a non-default scope (the gem-RBI pipeline) pass one; everything else gets the workspace-scoped `DslSignatures.build` by default. `compile_to_rbi_sig` returns `Array[RBI::Sig]` so RBS overloads survive the polymorphic interface.
This deletes ~120 lines of duplicated translate/qualify/annotate code from `SorbetSignatures` and `DslSignatures`, drops the `RBSMethodLookup` wrapper and the `MethodNodeAdded#rbs_lookup` plumbing, and lets the gem listener collapse to a single `signature.compile_to_rbi_sig` call for both backends.
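The optional-block override can be sketched with lookup tables in place of the real backends. Everything here is hypothetical (`SigStub`, the two hashes, and this free-standing `signature_of`); it also assumes a Sorbet runtime sig wins over any RBS lookup, which matches the "Sorbet sig first, RBS sig as fallback" order described for the DSL side.

```ruby
# Hypothetical stand-ins for the two signature sources.
SigStub = Struct.new(:source, :return_type_string)

SORBET_SIGS = { greet: SigStub.new(:sorbet, "::String") }.freeze
WORKSPACE_RBS = { count: SigStub.new(:workspace_rbs, "::Integer") }.freeze

# Sorbet runtime sig first; otherwise the RBS lookup. Callers may pass a
# block to replace the default (workspace-scoped) RBS lookup.
def signature_of(method_name, &rbs_lookup)
  sorbet = SORBET_SIGS[method_name]
  return sorbet if sorbet

  rbs_lookup ||= ->(name) { WORKSPACE_RBS[name] }
  rbs_lookup.call(method_name)
end

# Default scope versus a caller-supplied scope (as a gem pipeline might pass).
p signature_of(:count).source
p signature_of(:count) { |_name| SigStub.new(:gem_rbs, "::Integer") }.source
```

Keeping the override as a block means ordinary callers never have to know which graph the lookup runs against.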
0.2.5 ships `Definition#lexical_owner` and `Definition#lexical_nesting`, which is the only Rubydex API this branch needed from the unreleased main. Now that it's out, we can drop the github pins from the dev Gemfile and from MockProject's subprocess Gemfile and bump the gemspec floor.
Summary
- Process inline `#:` / `# @...` RBS comments directly from source instead of relying on a load-time rewriter that produced `sig {}` blocks at boot.
- Removes `lib/tapioca/rbs/rewriter.rb`, the `require-hooks` dependency, the bootsnap shim, `dsl --only-bootsnap-rbs-cache`, and the `TAPIOCA_RBS_CACHE` plumbing. No more Bootsnap iseq cache to manage.
- The gem pipeline always builds a `Rubydex::Graph` (seeded with core/stdlib RBS) so constants in inline sigs can be resolved to their fully-qualified names. A matching path on the DSL side picks up `#:` sigs for arbitrary host-app methods.
Why
The require-hook rewriter rewrote every `.rb` file at load time so `sorbet-runtime` would track `#:` comments as `sig {}` blocks. It worked, but it forced `require-hooks` into the dependency tree, made boot slower, and shipped its own bootsnap iseq cache so large apps could survive the cost. With Tapioca now able to parse RBS straight from source (via Rubydex), all of that goes away.
What changed
New modules under `lib/tapioca/rbs/`
- `comments.rb`: parses a stream of `[comment_string, line]` tuples (as obtained from Rubydex's `Definition#comments`) into signatures and annotations classified into class-level / method-level groups.
- `type_qualifier.rb`: walks an `RBI::Type` tree and emits a fully-qualified string. User constants resolve through a Rubydex graph (so `Bar` inside `Foo` becomes `::Foo::Bar`); Sorbet's own `T::` constants are prefixed with `::T::`; user-defined generics (Sorbet `T::Generic`) are resolved but emitted without a leading `::` to match Sorbet's runtime serializer convention.
- `dsl_signatures.rb`: DSL-side RBS lookup. Builds a per-process Rubydex graph from the host workspace + already-loaded source files + core/stdlib RBS, looks up declarations by qualified name (or by source location for anonymous classes built with `Class.new`), and synthesizes an `RBI::Sig`. Lexical nesting for anonymous classes is recovered via a small Prism visitor.
Gem pipeline (`lib/tapioca/gem/...`)
- `Pipeline` builds the graph unconditionally (previously only when `include_doc: true`) and exposes `gem_graph`, `rbs_comments_for_constant`, and `rbs_comments_for_method`.
- `Listeners::Methods` + `MethodNodeAdded` carry an optional `Pipeline::RBSMethodLookup` when no Sorbet runtime sig was found.
- `Listeners::SorbetSignatures` synthesizes a sig from the RBS lookup when present (with attr-vs-method-vs-overload handling and method-level annotations).
- `Listeners::SorbetHelpers` picks up `# @abstract`, `# @final`, `# @sealed`, `# @interface`.
- `Listeners::SorbetRequiredAncestors` picks up `# @requires_ancestor:`.
- `Listeners::SorbetTypeVariables` picks up class-level `#: [A, B]` (with variance / `upper:` / `fixed:` blocks).
- `Listeners::Documentation` filters Rubydex definitions down to ones inside the gem so the now-always-on graph doesn't leak core-RBS docs.
DSL pipeline (`lib/tapioca/dsl/...`)
- `Dsl::Compiler#compile_method_*_to_rbi` falls back to `DslSignatures.build` when no Sorbet runtime sig is registered.
- `Compiler` itself now explicitly `extend T::Generic` and declares `ConstantType = type_member`.
- `ActiveModelTypeHelper.type_for` also falls back to RBS comments for `deserialize` / `cast` / `cast_value` / `serialize`.
Cleanup
- Deletes `lib/tapioca/rbs/rewriter.rb`, `sorbet/rbi/shims/bootsnap.rbi`, `sorbet/rbi/gems/require-hooks@0.4.0.rbi`.
- Drops `require-hooks` from `tapioca.gemspec` and `Gemfile.lock`.
- Removes `dsl --only-bootsnap-rbs-cache` and the related CLI / command code.
- `internal.rb` keeps `Module.include(T::Sig)` so bare `sig {}` calls in user/gem code (and our specs) still work without `extend T::Sig`.
Test status
`bin/test` runs 788 tests with 0 failures, 0 errors, 2 pre-existing skips.
A couple of tests had to move with the design:
- `spec/tapioca/runtime/reflection_spec.rb` now uses a real `sig { ... }` instead of relying on the rewriter to translate `#: -> String` to a runtime sig.
- `spec/tapioca/gem/pipeline_spec.rb` updates the `T.proc` / `T::Array[::String]` / `requires_ancestor { Kernel }` expectations to the new `::T` and `::Kernel` qualification convention.
- `spec/tapioca/cli/dsl_spec.rb` drops the bootsnap- and `--only-bootsnap-rbs-cache`-specific tests, renames the "preserves RBS comment rewriting" case to "exposes inline RBS method signatures to DSL compilers", and broadens a Ruby-version-sensitive stack trace regex.