Skip to content

explorer: B1 viewport-aware facet counts (#234 step 3)#237

Open
rdhyee wants to merge 2 commits into
isamplesorg:mainfrom
rdhyee:feat/b1-viewport-aware-facets
Open

explorer: B1 viewport-aware facet counts (#234 step 3)#237
rdhyee wants to merge 2 commits into
isamplesorg:mainfrom
rdhyee:feat/b1-viewport-aware-facets

Conversation

@rdhyee
Copy link
Copy Markdown
Contributor

@rdhyee rdhyee commented May 25, 2026

Summary

The facet legend now scopes per-source / per-material / per-context / per-object_type counts to the current map viewport bbox instead of the whole dataset. Pan or zoom → counts recompute. This resolves the silent disagreement between the legend ("global pivot tool") and the rest of the page ("Samples in View", table count, map dots) that the 2026-05-22 design session called out and that #234 picks as direction B1.

This is step 3 of the #234 roadmap, after #235 (facetNote URL-load, merged) and #236 (search real-count, merged). Subsequent steps (A1 search-as-global-filter, C3 auto-promote to point mode) are independent and build on this.

Decisions locked with @rdhyee before implementation

Q Decision
Q1 — where do coordinates live for the bbox JOIN? No existing parquet carries both the four facet columns and lat/lon. JOIN facets_url to lite_url on pid for this PR; follow-up to bake a pre-joined parquet if real-world latency proves bad.
Q2 — progressive refinement now? Single-pass full query, debounced 250 ms via the existing facetCountsReqId infrastructure. Defer coarse-then-fine scaffold until latency shows it's needed.
Q3 — error-path semantics? Fall back to global baseline via existing applyFacetCounts(key, null) — matches current no-filter behavior; user can re-pan to retry.

Architectural changes

  • New isGlobalView() helper (altitude shortcut + rect-saturation check) decides when the cube fast-path stays valid. At alt > 10,000 km the camera is treated as global regardless of computeViewRectangle per-angle quirks, so legend totals don't jiggle on tiny camera rotations near the default zoom-out.
  • updateCrossFilteredCounts snapshots bboxSQL once at function entry. Cube fast-path gated additionally on bboxSQL === null. The no-filter + non-global case now falls through to the slow path (B1's whole point: counts reflect what's in view even with zero facet filters).
  • Slow path (when bboxSQL set): JOIN facets_url f ↔ lite_url l ON l.pid = f.pid and apply the bbox predicate against (l.latitude, l.longitude). buildCrossFilterWhere takes an optional column-prefix arg so the JOIN-shape WHERE qualifies columns with f. to avoid ambiguity with lite_url.source.
  • viewer.camera.moveStart → synchronous markFacetCountsRecomputing(). Italic-stale legend appears the instant the user starts panning, cleared when applyFacetCounts writes the post-moveEnd result.
  • viewer.camera.moveEndrefreshFacetCounts(). Reuses the existing 250 ms debounce + facetCountsReqId stale guard — bursts coalesce to one query, superseded in-flight queries discard their result on await resume.

Test plan

  • quarto render explorer.qmd — passes
  • New tests/playwright/facet-viewport.spec.js — 3 specs pass locally
    • zoom-in → counts shrink relative to global, per-source counts ≤ global; zoom-out → restore to within 1 %
    • moveStart marks .recomputing synchronously (verified via in-page listener that runs after the explorer's)
    • negative control: cold global-view boot leaves no .recomputing stuck
  • Regression suite — facetnote-url-load.spec.js (2), search-real-count.spec.js (3), url-roundtrip.spec.js (5) — all pass (10/10)
  • git diff --check upstream/main...HEAD — clean
  • Pre-deploy smoke gate on CI

Known follow-ups

  • Latency not formally benchmarked. The JOIN turns 4 parallel GROUP BY queries into 4 parallel JOIN+GROUP BY queries against the ≈300 MB lite_url parquet. If UX feels slow in real use, follow-up to bake a pre-joined facets-with-coords parquet and route the bbox path through it (Q1 fallback).
  • Cancellation-race spec deferred. The facetCountsReqId stale guard is in place but not exercised by a Playwright test (two pans back-to-back inside the 250 ms window). Add if Codex flags it as a coverage gap.
  • Progressive refinement (coarse-then-fine) explicitly out of scope — Q2 locked single-pass. File as a separate issue if latency benchmarks demand it.

Cross-refs


Roadmap context: #234. Direction is A1 + B1 + (C3-when-feasible, C2-with-warning) with progressive refinement underlying the dynamic surfaces.

rdhyee added 2 commits May 24, 2026 22:50
The facet legend now scopes per-source / per-material / per-context /
per-object_type counts to the current viewport bbox instead of the whole
dataset. Pan or zoom → counts recompute. Resolves the silent disagreement
between the legend and the "Samples in View" / table-count surfaces.

What changed
- New `isGlobalView()` helper: altitude shortcut (h > 10,000 km) plus a
  rect-saturation check on `paddedViewportBounds`. Returns true on the
  initial zoomed-out view so the legend doesn't jiggle on tiny rotations.
- `updateCrossFilteredCounts` snapshots `bboxSQL` once at function entry.
  Cube fast-path is gated additionally on `bboxSQL === null` — the
  pre-aggregated cube has no spatial granularity, so it's invalid for
  any non-global view. The no-filter + non-global case now falls through
  to the slow path (B1's whole point: "what's in view" even with zero
  facet filters).
- Slow path: when `bboxSQL` is set, JOIN `facets_url f` to `lite_url l`
  on `l.pid = f.pid` and apply the bbox predicate against
  `(l.latitude, l.longitude)`. `buildCrossFilterWhere` accepts an
  optional column prefix so the JOIN-shape WHERE qualifies columns with
  `f.` to avoid ambiguity with `lite_url.source`.
- `viewer.camera.moveStart` → synchronous `markFacetCountsRecomputing()`
  so the legend shows italic-stale state the instant the user starts
  panning, instead of waiting for the 250 ms debounce. Cleared when
  `applyFacetCounts` writes the post-moveEnd query result.
- `viewer.camera.moveEnd` → `refreshFacetCounts()`. Reuses the existing
  `facetCountsReqId` + 250 ms debounce, so bursts of moveEnd coalesce
  to one query and any superseded in-flight query discards its result
  on `await` resume via the stale guard.
- On query error: existing `applyFacetCounts(key, null)` baseline
  fallback covers the Q3 decision (show globals, not error indicator).

Q1 / Q2 / Q3 plan decisions (locked with rdhyee before implementation)
- Q1: JOIN to `lite_url` for this PR; no existing parquet carries both
  the four facet columns and lat/lon. If real-world latency is bad,
  follow-up to bake a pre-joined facet+coords parquet.
- Q2: Single-pass full query (no coarse-then-fine progressive scaffold
  yet — defer until latency is shown to need it).
- Q3: On bbox-query throw, fall back to global baseline counts.

Tests
- New `tests/playwright/facet-viewport.spec.js`: 3 specs covering
  zoom-in-then-out (counts shrink, then restore to within 1 %),
  moveStart-sync `.recomputing` affordance, and the negative
  control that a cold global-view boot leaves no `.recomputing` stuck.
- Regression: facetnote-url-load (2), search-real-count (3),
  url-roundtrip (5) all pass — 13 tests green locally.

Known follow-ups
- Latency was not formally benchmarked; the JOIN turns 4 parallel
  GROUP-BY queries into 4 parallel JOIN+GROUP-BY queries against the
  lite parquet (≈300 MB). If real-world UX feels slow, bake a
  pre-joined facets-with-coords parquet and route the bbox path
  through it (Q1 fallback).
- A cancellation-race Playwright spec (two pans back-to-back inside
  the 250 ms window) is conceptually covered by the existing
  `facetCountsReqId` stale guard but not yet exercised by a test —
  add if Codex review flags it.

Closes the B1 row in isamplesorg#234.
…#237

Codex caught three real issues + one documentation polish.

1. (BLOCKING) moveStart did not supersede in-flight or debounced facet
   count work. An older `updateCrossFilteredCounts` could land mid-pan,
   pass the stale guard at its `await` resume, call `applyFacetCounts`,
   and CLEAR `.recomputing` while writing pre-move counts — leaving the
   legend looking authoritative but stale until moveEnd's 250 ms
   debounce fired the new query. Fix: the moveStart listener now
   `clearTimeout(facetCountsDebounce); ++facetCountsReqId;` in addition
   to marking the italic-stale class. moveEnd's `refreshFacetCounts()`
   bumps the id AGAIN, harmlessly — supersession of already-superseded
   work, debounce coalesced.

2. (TEST GAP) The PR exercised the no-filter bbox path but NOT the
   filter-active JOIN path — exactly the new SQL shape this change
   introduces. Added a 4th Playwright spec
   `bbox-aware count under an active source filter (JOIN + filter)`:
     - activate a single source filter (cube fast-path at global view)
     - capture filtered material counts at global view (cube-derived)
     - fly to Cyprus → forces slow path with `f.source IN (…)` + JOIN
     - assert material counts shrink (a silent fallback to baseline on
       a column-ambiguity SQL error would write global counts, which
       would GROW back to the un-filtered globals — caught by the
       `cyprusTotal < filteredTotal` assertion)

3. (TOLERANCE TOO LOOSE) `zoom in → zoom out → restore` accepted a 1 %
   delta on the restored totals, but at `alt=15e6` the altitude
   shortcut in `isGlobalView()` forces the no-bbox baseline path → the
   counts should be value-by-value EXACTLY equal to the first global
   capture. Switched to `expect(restoredCounts).toEqual(globalCounts)`.
   A loose tolerance could hide a stale-read race writing a bbox-scoped
   count that just happens to look "close enough."

4. (POLISH) Sharpened the `moveStart` listener comment to explain why
   bumping the id is necessary (was missing the rationale).

Test results: 4/4 new + 10/10 regression (facetnote-url-load,
search-real-count, url-roundtrip) — 14 green locally.

Codex's point about the altitude-shortcut UX (high-altitude pivot
returns "global" even if camera angle doesn't show the full data
extent) is an intentional product decision per the design rationale
in `isGlobalView()`'s comment; nothing to change there.
@rdhyee
Copy link
Copy Markdown
Contributor Author

rdhyee commented May 25, 2026

Codex review iteration log (2 rounds → LGTM)

Round Codex finding Resolution Commit
1 (BLOCKING) moveStart only marked .recomputing; did not bump facetCountsReqId or clear the pending debounce. An older updateCrossFilteredCounts could resume mid-pan, pass the stale guard, write pre-move counts, and clear .recomputing — leaving the legend looking authoritative but stale until moveEnd's debounce fired the new query. moveStart now clearTimeout(facetCountsDebounce); ++facetCountsReqId; in addition to marking the italic state. moveEnd's refreshFacetCounts() bumps the id AGAIN — supersession of already-superseded work, debounce coalesced. 63b0457
1 Test coverage gap: the spec exercised the no-filter bbox path but NOT the filter-active JOIN path — exactly the new SQL shape this PR introduces. Added bbox-aware count under an active source filter (JOIN + filter): activate a single source filter (cube fast-path at global view), capture filtered material counts, fly to Cyprus → forces slow path with f.source IN (…) JOIN. Assertion cyprusTotal < filteredTotal catches the silent fallback-to-baseline case (column-ambiguity SQL errors fall back to unfiltered globals, which would GROW back instead of shrinking — caught). 63b0457
1 1 % restore tolerance was too loose. At alt=15e6 the altitude shortcut forces the no-bbox baseline path, so restored counts should be value-by-value EXACTLY equal to the first global capture. expect(restoredCounts).toEqual(globalCounts) — strict equality. 63b0457
1 Polish: moveStart comment missed the rationale for the id bump. Comment expanded with the race-condition explanation. 63b0457
1 Documentation polish — high-altitude shortcut can return "global" for a non-fully-data-covering camera angle. Intentional per the design rationale; no change. (Existing comment in isGlobalView() already explains the UX call.) n/a

Round 2: LGTM (Codex)

Findings: none. I don't see a new correctness issue in 63b0457. […] moveStart double-bump: safe. […] JOIN + filter test exercises the intended SQL shape because the checkbox state is changed synchronously before the Cyprus fly. […] Handler mechanics: clearTimeout(null) is safe, facetCountsReqId is initialized, and the mark/clear/bump ordering is fine under JS's single-threaded event execution.

Final test count: 14 passed locally (4 new B1 specs + 10 regression: 2 facetnote-url-load + 3 search-real-count + 5 url-roundtrip).

Known follow-ups deferred per Q1/Q2 plan decisions, not raised by Codex:

  • Latency was not formally benchmarked; baking a pre-joined facets+coords parquet (Q1 fallback) waits on real-world UX feedback.
  • Cancellation-race Playwright spec (two pans back-to-back inside the 250 ms window) — not added; the facetCountsReqId stale guard is now exercised structurally by moveStart's invalidation, which Codex round 2 walked through and accepted.

Thanks codex for the round-1 catch on moveStart — the stale-and-not-italic interval was a real bug, not a theoretical one.

rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant