feat(samples): discovery UX with variant grouping and faceted search #544
staging-devin-ai-integration[bot] wants to merge 15 commits into
Surface the growing sample-pipeline catalog as grouped scenario cards with a variant selector and faceted/fuzzy search instead of a flat list of near-duplicate templates.
Discovery metadata (group/variant/category/tags) is optional in sample YAML; when omitted, the server derives best-effort values from node kinds, the client section, and filename patterns, so existing samples need no edits. Explicit YAML values win per-field; derived tags union with curated ones.
The TemplateSelector (used by both Convert and Stream views) collapses a variant family (e.g. the colorbars codec variants) into one card with radio pills, and adds category/capability/needs-hardware facet chips alongside the existing origin filter and search.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
const base = variants.find((v) => !v.variant) ?? variants[0];
return { key, base, variants };
🟡 Variant-only groups choose the alphabetically first variant as the card representative
When every sample in a scenario group has a variant, groupSamplePipelinesByScenario sorts the variants and then falls back to variants[0] as the group base. The PR adds variant: Software to the previously canonical samples/pipelines/dynamic/video_moq_colorbars.yml:8, so the colorbars family no longer has any no-variant member; the group card title/description rendered from group.base in ui/src/components/converter/TemplateSelector.tsx:89-95 will come from whichever codec/hardware variant sorts first rather than the intended generic/software sample. This makes grouped sample cards display the wrong representative metadata even though the variants are still selectable.
Prompt for agents
Fix representative selection for grouped sample cards in ui/src/utils/samplePipelineOrdering.ts. The current code sorts variants before choosing base and falls back to the first sorted variant when all members have a variant. That is now triggered by samples/pipelines/dynamic/video_moq_colorbars.yml setting variant: Software on the previously canonical sample. Choose the group representative independently from variant sort order, for example by preserving/marking the intended representative before sorting or by adding an explicit group display label/base selection policy. Ensure the colorbars family still shows the generic/software card metadata while retaining the “Software” variant label in the selector.
Valid catch — this is the same tradeoff I flagged in the PR description. By giving the canonical video_moq_colorbars.yml an explicit variant: Software, the group lost its no-variant member, so base falls through to the alphabetically-first variant for the title/description (the variants themselves are all still selectable and correct).
Two clean ways to fix, both cheap:
- Drop variant: Software from the exemplar so the canonical sample stays the natural (no-variant) representative; this keeps the override demo via category/tags only.
- Pick the representative independently of variant order (e.g. the member whose id has the fewest variant tokens / shortest id), so an explicit base label can coexist with a stable card.
I've left this for @streamer45 to choose since it's a UX-label call rather than a correctness bug. Happy to apply either.
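The second option above could look roughly like the following. This is a minimal sketch, not the code in ui/src/utils/samplePipelineOrdering.ts: the `Sample` shape, `pickRepresentative`, `buildGroup`, and the AV1 sample id are hypothetical, and the "shortest id" fallback is just the policy floated in the comment.

```typescript
// Hypothetical minimal shape; the real SamplePipeline type carries more fields.
interface Sample {
  id: string;
  name: string;
  variant?: string;
}

interface ScenarioGroup {
  key: string;
  base: Sample;
  variants: Sample[];
}

// Choose the card representative before any sorting happens,
// so the variant sort order can never change which member is shown.
function pickRepresentative(members: Sample[]): Sample {
  const unlabeled = members.find((m) => !m.variant);
  if (unlabeled) return unlabeled;
  // Assumed fallback policy: shortest id wins, ties broken alphabetically.
  return [...members].sort(
    (a, b) => a.id.length - b.id.length || a.id.localeCompare(b.id),
  )[0];
}

function buildGroup(key: string, members: Sample[]): ScenarioGroup {
  if (members.length === 0) throw new Error(`empty group: ${key}`);
  const base = pickRepresentative(members);
  // Sorting only affects pill order, not the representative.
  const variants = [...members].sort((a, b) =>
    (a.variant ?? '').localeCompare(b.variant ?? ''),
  );
  return { key, base, variants };
}
```

With this shape, a colorbars group whose members are all labeled still resolves to the shortest-id member as its base, while "AV1" continues to sort ahead of "Software" in the selector.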
export function matchesSamplePipelineQuery(pipeline: SamplePipeline, query: string): boolean {
  const normalizedQuery = query.trim().toLowerCase();
  if (!normalizedQuery) return true;

  const haystack = searchableText(pipeline);
  const terms = normalizedQuery.split(/\s+/).filter(Boolean);

  // Every term must match; a term matches if any of its expansions appears in the haystack.
  return terms.every((term) => expandTerm(term).some((candidate) => haystack.includes(candidate)));
}
📝 Info: Search now uses AND semantics across query tokens
The new query matcher splits input on whitespace and requires every term to match some searchable field or synonym. This is an intentional behavioral change from the previous single-substring match, and the tests cover it (ui/src/utils/samplePipelineOrdering.test.ts:160-168). It means queries like video audio will only show pipelines whose combined metadata includes both terms, which is useful for faceting but may surprise users who expected broad OR-style search.
Correct, AND-across-tokens is intentional — it makes multi-word queries (e.g. nvidia av1) narrow rather than balloon, which pairs well with the facet chips. Single-token search (the common case) is unaffected. Tests at samplePipelineOrdering.test.ts:160-168 lock in the behavior.
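To make the narrowing behavior concrete, here is a simplified stand-in for the matcher. It is a sketch under assumptions: `matchesAllTerms` is a hypothetical name, synonym expansion is left out, and the haystack is passed in as a plain string rather than built from a pipeline.

```typescript
// Simplified stand-in: AND across whitespace-separated terms, no synonym expansion.
function matchesAllTerms(haystack: string, query: string): boolean {
  const text = haystack.toLowerCase();
  const terms = query.trim().toLowerCase().split(/\s+/).filter(Boolean);
  // An empty query yields no terms, and every() on an empty array is true.
  return terms.every((term) => text.includes(term));
}

// Hypothetical searchable text for one sample.
const sampleText = 'nvidia nvenc av1 colorbars encode';

const both = matchesAllTerms(sampleText, 'nvidia av1');  // both terms present
const mixed = matchesAllTerms(sampleText, 'nvidia vp9'); // second term absent
const single = matchesAllTerms(sampleText, 'colorbars'); // single-token case
```

Here `both` and `single` are true while `mixed` is false, which is the narrowing described above: adding a token can only shrink the result set.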
Discovery {
    group: explicit.group.or(Some(derived_group)),
    variant: explicit.variant.or(derived_variant),
📝 Info: Derived groups are emitted for every sample, not only known variant families
derive always returns group: Some(derived_group) even when the filename has no variant-like token. In the current UI this is mostly harmless because groups of one render as normal cards, and accidental multi-sample collisions are mitigated by the token rules and current bundled filenames. Reviewers should be aware that any future samples with filenames that only differ by tokens in SINGLE_TOKENS or LANGUAGE_TOKENS will automatically collapse into one selector card unless they set explicit discovery fields.
Right — every sample gets a derived group, but a group of one renders as a normal flat card (see ScenarioCard for variants.length === 1), so there's no visible difference unless two filenames actually collapse. The compound-token rules (e.g. vulkan_video) guard the known families. If a future pair of filenames differs only by a SINGLE_TOKENS/LANGUAGE_TOKENS token, they'd merge into one selector — which is usually the desired behavior, and any author can override with explicit group/variant. Worth keeping in mind when adding samples.
const [capabilityFilter, setCapabilityFilter] = React.useState<string | null>(null);
const [hardwareOnly, setHardwareOnly] = React.useState(false);

const facets = React.useMemo(() => collectSampleFacets(templates), [templates]);
📝 Info: Facet chips are built from all templates rather than current origin/search scope
collectSampleFacets(templates) computes the visible facet options from the full template list, while filtering applies origin, category, capability, hardware, and query afterward. This keeps facet choices stable as users type or toggle filters, but it also means a facet chip can remain visible even when selecting it with the current origin/search filters will yield an empty result set. That appears to be a UX tradeoff rather than a correctness bug.
Intentional — facet options are computed from the full template set so they don't flicker/reorder as you type or toggle other facets. The "selected hidden by filters" hint + Clear filters covers the empty-result case. Scoping facets to the current filter set is a possible future refinement if it proves confusing.
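The tradeoff can be sketched as two independent steps: facet options come from the whole list, and filtering happens afterward. The `Template` and `Filters` shapes and the `collectFacets`/`applyFilters` names below are illustrative assumptions, not the signatures used in TemplateSelector.

```typescript
// Hypothetical shapes for illustration only.
interface Template {
  id: string;
  origin: 'system' | 'user';
  category: string;
  tags: string[];
}

interface Filters {
  origin: Template['origin'] | null;
  category: string | null;
  capability: string | null;
}

// Facet options are computed from the full list, so they stay stable
// while the user types or toggles other filters.
function collectFacets(templates: Template[]): { categories: string[]; capabilities: string[] } {
  const categories = new Set<string>();
  const capabilities = new Set<string>();
  for (const t of templates) {
    categories.add(t.category);
    t.tags.forEach((tag) => capabilities.add(tag));
  }
  return {
    categories: Array.from(categories).sort(),
    capabilities: Array.from(capabilities).sort(),
  };
}

// Filtering is applied afterward; null means "no constraint".
function applyFilters(templates: Template[], f: Filters): Template[] {
  return templates.filter(
    (t) =>
      (f.origin === null || t.origin === f.origin) &&
      (f.category === null || t.category === f.category) &&
      (f.capability === null || t.tags.includes(f.capability)),
  );
}
```

Because the two steps are decoupled, a chip such as "compositing" can stay visible while an origin filter that excludes every compositing sample is active, and selecting it then yields an empty list.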
The extended SamplePipeline type makes group/variant/category/tags required; update the two inline mocks in the samples/fragments service tests to match. Signed-off-by: streamkit-devin <devin@streamkit.dev>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #544 +/- ##
==========================================
+ Coverage 79.96% 80.03% +0.06%
==========================================
Files 234 236 +2
Lines 68061 68299 +238
Branches 1846 1970 +124
==========================================
+ Hits 54428 54664 +236
- Misses 13627 13629 +2
Partials 6 6
Grouped scenario cards render each variant as a radio pill whose visible label is the short variant name, so E2E specs that clicked a sample by its full name could no longer select it (colorbars and webcam-PiP families). Set each variant pill's accessible name to the full sample name and add a selectPipelineTemplate() helper that selects via the radio role for grouped samples and falls back to the name text for ungrouped cards. Signed-off-by: streamkit-devin <devin@streamkit.dev>
- Drop variant/category override from video_moq_colorbars so the software sample stays the canonical group representative and shares the derived Video Encoding category with its codec siblings.
- expandTerm: match synonyms only on exact or >=3-char prefix, dropping the reverse-substring branch that leaked short tokens (mic/cam) into unrelated queries (dynamic, scam).
- Give flat-card radios an explicit sample-name accessible label so the E2E helper selects grouped and ungrouped cards through one role lookup.
- Pass typed Input/OutputType into discovery derivation, removing the hand-rolled snake_case glue and dead match arms.
- De-duplicate the Steps/Dag arms in parse_pipeline_metadata.
- Drop the no-op group_tokens.dedup(); reuse labelFromKey for capability chips; remove redundant slice; extract ScenarioHeader.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
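The expandTerm rule in this commit (exact match or a prefix of at least three characters, with no reverse-substring branch) can be sketched as follows. The synonym groups shown are hypothetical examples, and the function is a reconstruction from the commit message rather than the code in the PR.

```typescript
// Hypothetical synonym groups; the actual table contents are not shown in this PR page.
const SYNONYM_GROUPS: string[][] = [
  ['mic', 'microphone'],
  ['cam', 'camera', 'webcam'],
];

// Expand a query term to itself plus any synonym group it belongs to.
// A group is pulled in only when the term equals a synonym, or the term is
// at least 3 characters long and is a prefix of a synonym.
// There is deliberately no "term contains synonym" branch: that is what let
// "dynamic" pick up "mic" and "scam" pick up "cam".
function expandTerm(term: string): string[] {
  const candidates = new Set<string>([term]);
  for (const group of SYNONYM_GROUPS) {
    const hit = group.some(
      (syn) => syn === term || (term.length >= 3 && syn.startsWith(term)),
    );
    if (hit) group.forEach((syn) => candidates.add(syn));
  }
  return Array.from(candidates);
}
```

Under these assumed groups, `expandTerm('dynamic')` returns only `['dynamic']`, while `expandTerm('micro')` pulls in the microphone group through the prefix rule and `expandTerm('mi')` is too short to expand.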
Signed-off-by: streamkit-devin <devin@streamkit.dev>
- Label the no-variant base member 'Software' instead of the opaque 'Default'.
- Acronym-aware capability labels (MoQ/MP4/MSE/RTMP/WebM/VP9) via formatCapabilityLabel.
- Give GroupCard hover + a filled selected state matching flat cards.
- Square off filter chips so they read distinctly from the rounded variant pills.
- Add a persistent 'Clear all filters' affordance plus empty-state recovery.
- Drop capability chips already covered by a shown Category facet to remove the redundant facet rows.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
Add 'vad' to the capability acronym map so it renders 'VAD' not 'Vad', and remove the explicit 'vad' tag from the vad-demo exemplar since the vad node already derives 'voice-activity-detection' (avoids a duplicate facet chip). Signed-off-by: streamkit-devin <devin@streamkit.dev>
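An acronym-aware label formatter along these lines would produce 'VAD' rather than 'Vad'. This is a sketch only: the function name matches the one mentioned in the commits, but the body, the `ACRONYMS` map, and the choice of separators are assumptions.

```typescript
// Assumed acronym map, built from the labels named in the commit messages.
const ACRONYMS = new Map<string, string>([
  ['moq', 'MoQ'],
  ['mp4', 'MP4'],
  ['mse', 'MSE'],
  ['rtmp', 'RTMP'],
  ['webm', 'WebM'],
  ['vp9', 'VP9'],
  ['vad', 'VAD'],
]);

// Turn a tag key such as "voice-activity-detection" into a display label.
// Known acronyms use their fixed casing; other words are title-cased.
function formatCapabilityLabel(key: string): string {
  return key
    .split(/[-_\s]+/)
    .filter(Boolean)
    .map((word) => {
      const lower = word.toLowerCase();
      return ACRONYMS.get(lower) ?? lower.charAt(0).toUpperCase() + lower.slice(1);
    })
    .join(' ');
}
```

A `Map` is used instead of an object literal so that keys like "constructor" cannot accidentally resolve to inherited properties.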
- Revert the Category/Capability dedup: category is a single priority-picked bucket while tags are multi-valued, so dropping a capability chip whose category is shown removed a cross-cutting filter (e.g. compositor demos that also encode were unreachable from the encoding filter).
- Rename the 'Needs hardware' facet to 'Needs GPU' (the underlying tags are all GPU accel APIs: vaapi/nvidia/vulkan).
- Collapse the duplicate clear affordances to the single persistent 'Clear all filters' control; drop the empty-state and hidden-hint buttons.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
- Derive variant codecs from typed VideoCodec/AudioCodec enums (single source) instead of string-sniffing; TS uses exhaustive Record maps so a new codec is a compile error until a label is added.
- Only auto-derive group for system samples so unrelated user saves with colliding name slugs no longer collapse into one card.
- Guard two-letter language tokens to a translation context.
- Drop codec/format/transport tags from capability facets (codec is the variant-pill axis); derive the no-variant base pill from its codec tag.
- Tighten expandTerm so a shared token cannot pull both video families; precompile per-query synonym expansion once instead of per pipeline.
- Collapse duplicated Steps/Dag metadata arms; merge the identical ExplicitDiscovery/Discovery structs; extract GroupSection; drop trivial useMemo over primitives.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
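The "exhaustive Record map" idea from the first bullet can be illustrated like this. The codec unions, label strings, and `variantLabel` helper are hypothetical; the real VideoCodec/AudioCodec definitions are not shown on this page.

```typescript
// Hypothetical codec unions standing in for the typed VideoCodec/AudioCodec enums.
type VideoCodec = 'h264' | 'vp9' | 'av1';
type AudioCodec = 'opus' | 'aac';

// Record<Union, string> must list every member of the union.
// Adding a new member (say 'hevc') to VideoCodec makes this object literal
// a compile error until a label for it is added here.
const VIDEO_CODEC_LABELS: Record<VideoCodec, string> = {
  h264: 'H.264',
  vp9: 'VP9',
  av1: 'AV1',
};

const AUDIO_CODEC_LABELS: Record<AudioCodec, string> = {
  opus: 'Opus',
  aac: 'AAC',
};

// Assumed policy for this sketch: prefer the video codec label when both are present.
function variantLabel(video?: VideoCodec, audio?: AudioCodec): string | undefined {
  if (video) return VIDEO_CODEC_LABELS[video];
  if (audio) return AUDIO_CODEC_LABELS[audio];
  return undefined;
}
```

The benefit is that the label table and the codec type cannot drift apart silently, which string-sniffing on sample names could not guarantee.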
Carries the current Customize-editor YAML (with any edits) into the visual Design editor via router state, which DesignView imports once node definitions are loaded. Signed-off-by: streamkit-devin <devin@streamkit.dev>
Signed-off-by: streamkit-devin <devin@streamkit.dev>
Direction update from Claudio after architecture review: Please make the sample discovery backend simpler and more explicit, not more heuristic. Implementation direction:
Please keep the PR focused and avoid increasing backend complexity with a taxonomy engine. The desired outcome is less magic and fewer ways to misclassify samples.
…stics
Make sample YAML the source of truth for Convert/Stream discovery UX: authored group/variant/canonical/category/tags/keywords replace the runtime filename/node-kind derivation. The server emits these fields as-authored plus a resolved, lowercased search_terms document; the UI does plain substring matching and groups directly off canonical.
- Remove all heuristic derivation from sample_discovery.rs (filename tokenization, substring category/tag inference, codec sniffing).
- Add canonical/keywords to the YAML schema and SamplePipeline; emit search_terms (name + description + category + tags + keywords + flattened node kinds).
- Backfill all bundled dynamic/ + oneshot/ samples with explicit metadata; group near-duplicate families with one canonical member.
- Enforce the contract in CI: bundled samples must carry category+tags, grouped samples must have exactly one canonical and per-member variants, ungrouped samples must not set canonical/variant.
- UI consumes resolved fields and search_terms; SYNONYM_GROUPS and the variants[0] card-identity guess are gone.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
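The search_terms document described in this commit is built on the server (in Rust); the TypeScript below only illustrates the idea in one language with the UI side. Field names, `buildSearchTerms`, `matchesQuery`, and the sample values are assumptions.

```typescript
// Hypothetical authored metadata for one sample.
interface SampleMeta {
  name: string;
  description: string;
  category: string;
  tags: string[];
  keywords: string[];
  nodeKinds: string[];
}

// Flatten the authored fields into one lowercased document.
function buildSearchTerms(s: SampleMeta): string {
  return [s.name, s.description, s.category, ...s.tags, ...s.keywords, ...s.nodeKinds]
    .map((part) => part.trim().toLowerCase())
    .filter(Boolean)
    .join(' ');
}

// UI side: plain substring match against the resolved document.
function matchesQuery(searchTerms: string, query: string): boolean {
  const q = query.trim().toLowerCase();
  return q === '' || searchTerms.includes(q);
}

// Illustrative sample: the authored keyword "stt" makes an "STT" query
// find a transcription sample without any synonym table in the client.
const transcribe: SampleMeta = {
  name: 'Transcribe audio',
  description: 'Speech to text from an uploaded file',
  category: 'Speech',
  tags: ['transcription'],
  keywords: ['stt'],
  nodeKinds: ['audio::decoder'],
};
const doc = buildSearchTerms(transcribe);
```

With authored keywords, synonym handling moves out of client code and into the sample's own metadata, which is what lets the UI drop its synonym table.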
- Route Convert->Design handoff through guarded handleLoadSample (unsaved-work modal + measured auto-layout), deriving name/description from the edited YAML rather than the pristine sample.
- Include the sample id in backend search_terms so slug-fragment queries match again; precompute per-template search haystacks once.
- Variant pill accessible name now matches its visible label (WCAG 2.5.3); compute facet chips over the origin-filtered set so chips never match zero items.
- Tighten the metadata contract test: reject any blank tag and duplicate variant labels within a group.
- Collapse the duplicated Steps/Dag discovery fields behind a flattened PipelineMeta struct.
- Clear the Customize/editor view when the selection is hidden by active filters; smaller facet chips in a single grouped bar.
Signed-off-by: streamkit-devin <devin@streamkit.dev>
The variant pills' accessible name now equals their visible variant label (WCAG 2.5.3), so selecting a grouped pipeline by its sample name no longer matches. Scope grouped selections to the card's variant group and click by variant label. Signed-off-by: streamkit-devin <devin@streamkit.dev>
Summary
- Sample YAML declares group/variant/canonical/category/tags/keywords directly; there is no runtime derivation from filenames or node-kind substrings. The card title/description come from the group's canonical member, never a guessed variants[0].
- The server emits a resolved search_terms document (name + description + category + tags + authored keywords + flattened node kinds). The UI does plain substring matching; the old TS SYNONYM_GROUPS table is gone.
- CI enforces the contract (apps/skit/tests/sample_discovery_metadata_test.rs): bundled samples must carry category+tags; grouped samples must have exactly one canonical member and a variant label on every member; ungrouped samples must not set canonical/variant. Missing or inconsistent metadata fails the build instead of silently degrading the UI.
Review & Validation
- … canonical member, not an arbitrary variant.
- … transcribe→STT, colorbars→colorbars) via search_terms; capability facets exclude codec/format/hardware tags.
- just lint, just test-ui, cargo test -p streamkit-server green, including the new sample_discovery_metadata_test.
Notes
This replaces the earlier heuristic derivation (filename tokenization, substring category/tag inference, codec sniffing) that #551 had been tracking — the explicit-contract rewrite is now in this PR rather than deferred. Adding a new bundled sample now requires authoring its discovery metadata, which the validation test will demand.
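The rules the validation test enforces can be summarized in a small checker. The actual test lives in Rust (apps/skit/tests/sample_discovery_metadata_test.rs); the TypeScript below is only a sketch of the rules as stated in the commit messages, with a hypothetical `SampleDiscovery` shape and error wording.

```typescript
// Hypothetical view of one sample's authored discovery metadata.
interface SampleDiscovery {
  id: string;
  category?: string;
  tags?: string[];
  group?: string;
  variant?: string;
  canonical?: boolean;
}

// Returns a list of contract violations; an empty list means the set is valid.
function validateDiscovery(samples: SampleDiscovery[]): string[] {
  const errors: string[] = [];
  const groups = new Map<string, SampleDiscovery[]>();

  for (const s of samples) {
    if (!s.category || !s.category.trim()) errors.push(`${s.id}: missing category`);
    if (!s.tags || s.tags.length === 0) errors.push(`${s.id}: missing tags`);
    else if (s.tags.some((t) => !t.trim())) errors.push(`${s.id}: blank tag`);

    if (s.group) {
      const members = groups.get(s.group) ?? [];
      members.push(s);
      groups.set(s.group, members);
    } else if (s.canonical || s.variant) {
      errors.push(`${s.id}: ungrouped sample sets canonical/variant`);
    }
  }

  groups.forEach((members, group) => {
    const canonicalCount = members.filter((m) => m.canonical).length;
    if (canonicalCount !== 1) {
      errors.push(`${group}: expected exactly one canonical member, found ${canonicalCount}`);
    }
    const seen = new Set<string>();
    for (const m of members) {
      if (!m.variant || !m.variant.trim()) errors.push(`${m.id}: grouped sample missing variant`);
      else if (seen.has(m.variant)) errors.push(`${m.id}: duplicate variant "${m.variant}" in ${group}`);
      else seen.add(m.variant);
    }
  });

  return errors;
}
```

Collecting all violations rather than stopping at the first one means a new sample's author sees every missing field in a single run.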
Verified live against the local MoQ setup:
Convert — scenario grouping with codec variant pills and faceted chips:
Open in Design view — Customize YAML handed off to the node graph:
Stream — grouped MoQ colorbars card:
Link to Devin session: https://staging.itsdev.in/sessions/da773a0e70084000b42a86c2ed6664d9
Requested by: @streamer45
Devin Review
1e37e36 (HEAD is 9740783)