Purpose
Capture the architectural direction for the explorer's filter semantics, sequenced as a roadmap, with explicit decision rationale. Filed for review (Codex, Gemini, and human collaborators). The goal is alignment before implementation begins on the substantive steps.
Context
The explorer has six surfaces that show numbers about samples: map dots, the "Samples in View" stat, the "Samples Rendered" stat, the samples table count, the facet-legend counts, and the search-results line. Today each surface applies a different combination of constraints:
| Constraint | Map / Stats / Table | Facet counts | Search results line |
| --- | --- | --- | --- |
| Source checkboxes | ✅ | ✅ | ✅ |
| Material / Sampled Feature / Specimen Type | ✅ | ✅ | ✅ |
| Bbox (viewport) | ✅ | ❌ | optional (area-scope) |
| Search text | ❌ | ❌ | ✅ |
Result: three different "filter" semantics on one page. The 2026-05-22 investigation session (see #229 closure note, and the design briefing in ~/dev-journal/projects/isamples-facets.md) hit three concrete confusions stemming from this:
"I have pottery Cyprus in the search box but the facet counts and 'Samples in View' don't reflect it."
"I filtered to material=soil but the cluster dots include non-SESAR colors even though most soil is SESAR."
"What does '5,451 samples match the current filters' actually mean?"
The decision space
Three orthogonal axes (named for cross-reference):
A1: search is a global filter; it restricts map, table, stats, and facet counts.
A2: search is a side-panel lookup; it restricts only the search-results list. (current)
B1: facet counts reflect the viewport bbox; pan, and the counts change.
B2: facet counts stay global regardless of viewport. (current)
C1: cluster mode honestly reflects the facet filter, with H3 dots per filtered subset (expensive: pre-bake per facet, or aggregate live).
C2: cluster mode ignores the filter, and the page surfaces this loudly with `#facetNote`. (current, but the note is bugged on URL-load)
C3: auto-switch to point mode when any facet is active. No cluster dishonesty, but point-density problems (see #231, "explorer: dense point overlap saturates to yellow, looks like Smithsonian dots").
Direction picked
A1 + B1 + (C3-when-feasible, C2-with-prominent-warning-when-too-dense), with progressive refinement (sampled-fast then full-when-idle) underlying every dynamic surface, and issue #233's progressive heatmap as the eventual unifier that retires the cluster-vs-point dichotomy.
Mental model the user gets
"The explorer is a single coherent answer to: what samples match my current intent? Every number on the page tells me the size of that intent. Every dot tells me where one of those samples is. If there are too many dots to draw individually, the page tells me so and falls back to cluster mode with a visible warning that it's an approximation."
Why this combination
A1 (search global): the search box stops being decorative. Users naturally assume what they type restricts what they see; the page should honor that.
B1 (counts viewport-aware): legend becomes "what's in front of me" — agreeing with the table and the stats. The legend stops being a global pivot tool (which is conceptually clean but in practice confused users in the 2026-05-22 session).
C3-then-C2 fallback: cluster mode is treated as a perf optimization, not a feature. When it's feasible to draw individual dots, draw them. When the count exceeds a threshold (still TBD), keep cluster but warn loudly that what you see isn't the filtered set.
Progressive refinement: addresses the "want both snappy and honest" tension. Counts/dots show a coarse approximation during active panning, refine to honest values when the user sits still for ~500ms. Cancellation on any new pan. The facetCountsReqId and requestId patterns already in the codebase generalize directly.
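The C3-then-C2 fallback described above can be read as a small decision function. The following is a minimal TypeScript sketch under stated assumptions, not the explorer's actual code: `chooseMapMode`, `ModeInputs`, `ModeDecision`, and every field name are hypothetical, the warning text is a paraphrase, and `densityCap` stands in for the threshold that is still TBD.

```typescript
// Hypothetical sketch of facet-aware mode selection (C3 with C2 fallback).
type MapMode = "point" | "cluster";

interface ModeInputs {
  facetActive: boolean;       // true when any facet narrows the sample set
  matchCount: number;         // samples matching the filter in the viewport
  zoomPrefersPoints: boolean; // what a zoom-only rule would pick
  densityCap: number;         // placeholder for the undecided threshold
}

interface ModeDecision {
  mode: MapMode;
  // Non-null when the cluster layer does not reflect the active filter.
  warning: string | null;
}

function chooseMapMode(input: ModeInputs): ModeDecision {
  if (!input.facetActive) {
    // No filter active: fall through to the zoom-only choice.
    return { mode: input.zoomPrefersPoints ? "point" : "cluster", warning: null };
  }
  if (input.matchCount <= input.densityCap) {
    // C3: few enough matches to draw individual, filter-honest dots.
    return { mode: "point", warning: null };
  }
  // C2 fallback: keep clusters, but say loudly that they are unfiltered.
  return {
    mode: "cluster",
    warning: "showing cluster (too dense for individual dots)",
  };
}
```

Keeping the decision pure (inputs in, mode and warning out) would let the density cap be tuned empirically without touching rendering code.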
Why NOT the "cleanest" earlier framing (A1 + B2)
An earlier version of this briefing recommended A1 + B2: keep the legend global as a pivot tool ("what could I navigate to"). We chose B1 instead because:
The explorer's primary user is studying data, not navigating around. "What's here" matters more than "where could I go."
All other numbers on the page reflect the viewport; making the legend the lone exception causes silent disagreement.
Progressive refinement makes B1's perf cost (100-300 ms recompute per pan) feel acceptable — the user sees stale counts go italic instantly, then update.
Sequenced roadmap
| Step | Change | Effort | Notes | Outcome |
| --- | --- | --- | --- | --- |
| 3 | B1: facet counts viewport-aware, with `.recomputing` italic state during query | 1-2 days | Add bbox predicate to `updateCrossFilteredCounts` live-query path; cube fast-path falls back when bbox is non-global | Legend agrees with table and stats |
| 4 | A1: search as global filter. Add `ILIKE` search predicate to facet counts, table query, and `loadViewportSamples` | 2-3 days | Touches every count surface; biggest behavior change | Search box becomes a real filter |
| 5 | C3: auto-promote to point mode when any facet active, with density-cap fallback to cluster + prominent "showing cluster, too dense for individual dots" warning | 2-3 days | Mode-selection logic now considers facet state, not just zoom | Map dots honestly reflect filter |
| 6 | #233: progressive heatmap spike. Third visualization that replaces the cluster apology with an actually-filter-honest density layer | ~1 week | New visualization mode; reuses DuckDB-WASM + wide-parquet stack | Retires the cluster-vs-point tradeoff for high-density filtered views |
Steps 1-2 are quick-win, independent of the architectural direction. Steps 3-4-5 are the substantive coherence work. Step 6 is the long-term answer that makes the cluster-mode apology obsolete.
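Steps 3 and 4 both amount to adding predicates to queries that already exist. One way to keep every surface in agreement is to derive all of them from a single predicate builder. The sketch below is illustrative only: `FilterState`, `buildWhere`, the column names (`source`, `material`, `longitude`, `latitude`), and the three search columns are assumptions, since the real schema and search semantics are not spelled out here. It also treats the search text as one substring pattern, which may not match how the current search box tokenizes input.

```typescript
// Hypothetical filter state shared by map, stats, table, and legend queries.
interface Bbox { west: number; south: number; east: number; north: number; }

interface FilterState {
  sources: string[];   // checked sources; empty means no restriction
  materials: string[]; // one facet shown; other facets follow the same shape
  bbox: Bbox | null;   // null means global (no viewport restriction)
  searchText: string;  // raw search-box text
}

type SqlParam = string | number;

// Assumed column names; the real schema may differ.
const SEARCH_COLUMNS = ["label", "description", "place_name"];

function inList(column: string, values: string[]): string {
  return `${column} IN (${values.map(() => "?").join(", ")})`;
}

function buildWhere(f: FilterState): { sql: string; params: SqlParam[] } {
  const clauses: string[] = [];
  const params: SqlParam[] = [];

  if (f.sources.length > 0) {
    clauses.push(inList("source", f.sources));
    params.push(...f.sources);
  }
  if (f.materials.length > 0) {
    clauses.push(inList("material", f.materials));
    params.push(...f.materials);
  }
  if (f.bbox !== null) {
    // Simplification: ignores viewports that cross the antimeridian.
    clauses.push("longitude BETWEEN ? AND ? AND latitude BETWEEN ? AND ?");
    params.push(f.bbox.west, f.bbox.east, f.bbox.south, f.bbox.north);
  }
  const text = f.searchText.trim();
  if (text !== "") {
    // Simplification: % and _ typed by the user are not escaped.
    const ors = SEARCH_COLUMNS.map((c) => `${c} ILIKE ?`).join(" OR ");
    clauses.push(`(${ors})`);
    SEARCH_COLUMNS.forEach(() => params.push(`%${text}%`));
  }
  return { sql: clauses.length > 0 ? clauses.join(" AND ") : "TRUE", params };
}
```

A count surface would then run something like `SELECT count(*) FROM samples WHERE ${sql}` with `params` bound as prepared-statement values (`samples` is a placeholder table name), and the table and map queries would reuse the same clause.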
Progressive refinement pattern (applies to steps 3, 5, 6)
A single debounce + cancel + progressive scaffold reused across surfaces:

    moveStart:
      - snapshot current values
      - apply `.recomputing` italic state
    moveEnd + 250 ms (debounce; cancels if another move comes):
      - kick off coarse-pass query (10% TABLESAMPLE for counts; cube for legend single-axis case)
      - apply result; keep `.recomputing` if there's a refine pass pending
    moveEnd + 1-2 s still idle:
      - run full-scan query
      - apply result; drop `.recomputing`
    any new move / filter change:
      - bump request token; in-flight queries discard their result via stale guard
The codebase already has the cancellation primitives (facetCountsReqId, requestId, freshSelectionToken) and the .recomputing CSS class.
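To make the scaffold concrete, here is a minimal TypeScript sketch of the debounce, cancel, and refine sequence. It is an illustration under assumptions rather than the codebase's implementation: `StaleGuard`, `Surface`, and `ProgressiveRefiner` are hypothetical names, the 1500 ms idle delay is one pick from the 1-2 s range, and the existing `facetCountsReqId` and `requestId` patterns may be structured differently. The sketch leaves the previous values on screen instead of snapshotting them.

```typescript
// Monotonic request token: only the most recent token is "current".
class StaleGuard {
  private current = 0;
  next(): number { return ++this.current; }
  isCurrent(token: number): boolean { return token === this.current; }
}

// Hypothetical description of one dynamic surface (a count, the legend, ...).
interface Surface<T> {
  coarse: () => Promise<T>;              // fast approximate query
  full: () => Promise<T>;                // full-scan query
  apply: (value: T) => void;             // render a result
  setRecomputing: (on: boolean) => void; // toggle the .recomputing state
}

class ProgressiveRefiner<T> {
  private readonly guard = new StaleGuard();
  private coarseTimer: ReturnType<typeof setTimeout> | undefined;
  private fullTimer: ReturnType<typeof setTimeout> | undefined;
  private refinedToken = -1; // token whose full result was already applied

  constructor(
    private readonly surface: Surface<T>,
    private readonly debounceMs = 250,
    private readonly idleMs = 1500,
  ) {}

  // Call on move start or on any filter change.
  onMoveStart(): void {
    this.invalidate();
    this.surface.setRecomputing(true);
  }

  // Call on move end; schedules the coarse pass, then the full pass.
  onMoveEnd(): void {
    const token = this.invalidate();
    this.coarseTimer = setTimeout(() => {
      this.run(token, () => this.surface.coarse(), false).catch(console.error);
    }, this.debounceMs);
    this.fullTimer = setTimeout(() => {
      this.run(token, () => this.surface.full(), true).catch(console.error);
    }, this.idleMs);
  }

  // Cancel pending timers and make every in-flight query stale.
  private invalidate(): number {
    clearTimeout(this.coarseTimer);
    clearTimeout(this.fullTimer);
    return this.guard.next();
  }

  private async run(token: number, query: () => Promise<T>, isFull: boolean): Promise<void> {
    const value = await query();
    if (!this.guard.isCurrent(token)) return;           // superseded by a newer move
    if (!isFull && this.refinedToken === token) return; // never overwrite a full result
    this.surface.apply(value);
    if (isFull) {
      this.refinedToken = token;
      this.surface.setRecomputing(false);
    }
  }
}
```

Wiring this to a map would mean calling `onMoveStart` and `onMoveEnd` from the map's move events and calling `onMoveStart` followed by `onMoveEnd` on filter changes. A query that rejects leaves the surface in the recomputing state here; real code would likely want an explicit error path.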
Open questions for review
Is A1 + B1 actually the right call? The earlier framing argued B2 keeps the legend stable and avoids per-pan jitter. We chose B1 because it makes the page coherent and the user is studying data. But B2 + A1 is also defensible — does the review prefer it?
Density-cap threshold for C3 → C2 fallback? When should auto-point-mode give up and revert to cluster? 5,000 dots? 50,000? Empirically test, or pick a number?
"Snappy vs accurate" explicit toggle? Or is progressive refinement enough that the user doesn't need to choose? The current lean is no toggle — the page just behaves fast-while-moving and honest-when-still.
`ILIKE` against three text columns. Performance-acceptable for the search-results list at LIMIT 50, but folding it into every count query means scanning the same columns much more often. Might need the BM25 index to be ready before A1 is shippable at scale.
What's NOT in this issue
`explorer.qmd` (Interactive Explorer rethink: architecture review + UX/feature backlog #163 territory)
Cross-refs
`~/dev-journal/projects/isamples-facets.md` (private to rdhyee, contains this same content with rougher framing)
Acceptance for this issue (not the implementation)
Once those are settled, individual PRs follow against the existing tracking issues (#230, #231, #232, #233 plus the `#facetNote` bug to be filed).