feat(viz): clinical-hospital graph rebuild — sigma.js + graphology LOD viewer #51

Open
cdeust wants to merge 53 commits into viz/server-streaming-pipeline from viz/ui-clinical-rebuild

cdeust commented May 31, 2026

Summary

Navigation model

  1. Big picture on open — L0 domain bubbles only (~20 nodes, instant)
  2. Zoom deepens phase — scroll/pinch loads L0→L1→L2→L3→L4→L5→L6 progressively
  3. Click opens sub-graph — separate Sigma instance in a <dialog>, main view untouched
  4. Zero JS errors — verified by 4 adversarial verify agents before commit

What's in this PR

File                             Purpose
ui/clinical/index.html           Boot page, depth indicator, side panel, status bar
ui/clinical/js/boot.js           Cold-start sequence, event wiring
ui/clinical/js/state.js          Reactive state store
ui/clinical/js/api.js            Fetch wrappers for all 6 server endpoints
ui/clinical/js/renderer.js       Sigma mount, addNodes/addEdges, hide/show per depth
ui/clinical/js/navigation.js     Zoom-state machine (depth 0–6)
ui/clinical/js/streaming.js      SSE subscriber, rAF drain, quadtree retry
ui/clinical/js/subgraph.js       Chain-of-call/action panel (separate Sigma instance)
ui/clinical/js/chain-panel.js    Node detail panel with Mermaid chain DAG
ui/clinical/vendor/              sigma.min.js + graphology.umd.min.js (offline)
ui/clinical/docs/                4 spec docs + smoke-test plan

Backend changes (same PR)

  • get_phase_payload() now returns node_total + edge_total
  • serve_clinical() + /clinical/ route added to http_standalone.py

Test plan

See ui/clinical/docs/04-smoke-test.md for the full manual test plan.

🤖 Generated with Claude Code

cdeust and others added 30 commits May 31, 2026 11:29
…re running workflow

Pre-flight fixes so clinical-graph-rebuild.js runs clean:

B1 (L6 key): workflow now enumerates dynamic L6:<slug> keys from
  /api/graph/progress.phases — never hardcodes "L6".

B2 (CXGB missing decoder): workflow drops /api/graph.bin fast-path;
  cold-start uses SSE + /api/graph/phase instead.

B3 (/clinical/ 403): added serve_clinical() + /clinical/ route in
  _route_unified_get; _clinical_root/_clinical_html_path on Handler class.

B4 (Sigma duplicate addNode): workflow now specifies loadedPhases +
  pendingPhases Sets dedup pattern in constraints.

B5 (stash conflict): resolved — server files take server-pipeline version.

Also: node_total + edge_total added to get_phase_payload() response so
the workflow's phase-size guards work. Sigma v3 + graphology v0.25.4
vendored offline at ui/clinical/vendor/. Workflow PR target corrected
to viz/server-streaming-pipeline. W2 SSE source.close() added to
constraints.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deleted all .memsearch/ files (index pid + 4 daily memory snapshots)
and added .memsearch/ to .gitignore. No other memory plugin is planned;
Cortex handles all persistent memory via PostgreSQL + pgvector.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…D viewer

Implements the clinical-hospital navigation model at ui/clinical/:

Navigation model (big-picture → zoom → sub-graph):
  • L0 domain bubbles on open (~20 nodes, instant render)
  • Scroll/pinch deepens phase: L0→L1→L2→L3→L4→L5→L6:<slug>
  • Click any node → chain-of-call/action panel (separate Sigma instance)
  • Zero JS errors, zero console.error in production paths

Renderer: Sigma.js v3 + graphology v0.25.4 (WebGL, vendored offline)
  • Positions from /api/quadtree (DrL layout); circular fallback on 503
  • SSE primary channel (/api/graph/events), phase API secondary
  • rAF drain: max 200 nodes + 400 edges per frame
  • loadedPhases + pendingPhases Sets prevent duplicate addNode (Sigma throws)
  • EventSource.close() on SSE done event prevents reconnect loop

Accessibility (all 7 blockers from verify phase fixed):
  • depth-dot <div>s converted to <button> with aria-label + aria-pressed
  • neighbour <li>s get role=button, tabindex=0, keydown Enter/Space handler
  • :focus-visible CSS on all interactive elements
  • console.error demoted to console.warn in state dispatch and boot
  • Unconditional throw guarded by _showStatus before re-throw

Backend fixes (same commit):
  • get_phase_payload() now returns node_total + edge_total fields
  • serve_clinical() + /clinical/ route added to http_standalone.py
  • _clinical_root/_clinical_html_path wired onto Handler class

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rect endpoints file

Four bugs introduced during stash-conflict resolution:
1. /api/graph/events and /api/graph.bin were not routed — requests fell
   through to the HTML handler, returning text/html instead of SSE.
   Fix: added both routes to _route_unified_get.
2. http_standalone_endpoints.py was the old version (missing
   serve_graph_events / serve_graph_binary). Restored from
   viz/server-streaming-pipeline.
3. http_standalone_graph.py also restored from the pipeline branch
   (had the correct _graph_cache_lock etc.). Re-applied node_total +
   edge_total fields to get_phase_payload.
4. serve_clinical imported non-existent send_plain_error — inlined
   the 503 response directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…(graph, node, data)

Sigma v3 nodeReducer and edgeReducer are both called with (key, attributes)
— two args, no graph instance. The generated code had 3-arg signatures
(_g, _node, data) and (graph, edge, data), so 'data' was always undefined
causing 'Cannot read properties of undefined (reading depth)' on every node
render. edgeReducer now uses module-level _graph for endpoint visibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sigma v3 uses the node `type` attribute as a WebGL program name.
The server sends nodes with type="domain", type="skill", etc. which
Sigma doesn't recognise, crashing with "could not find a suitable
program for node type domain".

Fix: destructure `type` out of server node attrs before the spread so
it never reaches graphology/Sigma. Set `type: "circle"` explicitly
(the default Sigma v3 program). The server's node kind is preserved in
the `kind` attribute for colour/size logic.

Also fix edgeReducer signature (Sigma v3: (key, data) not (graph, edge, data))
and ensure _graph module-level reference is used for endpoint visibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Same root cause as the node type fix: Sigma v3 reads the edge `type`
attribute as a WebGL program name. Only "line" and "arrow" are built-in.
Server edge kinds ("in_domain", "calls", "about_entity", etc.) are now
stored as `kind` and `type` is hardcoded to "line".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ll nodes

- DEPTH_SIZE [22,16,12,8,5,4,3] → [8,5,4,3,2,2,1.5]: dots were too large
- _shortLabel(): strips paths, structured-id prefixes, extensions, truncates
  at 28 chars so "/Users/.../layout_authority.py" → "layout_authority"
- labelRenderedSizeThreshold: 7 — only L0 domain nodes get persistent labels;
  all other nodes show label on hover only (Sigma v3 default)
- labelColor/font/size set to match the dark Cortex theme

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ces random ±100

Root cause of overlapping: quadtree is 503 on fresh start so all nodes
fell back to random positions in a ±50 unit space, creating an unreadable
mass.

New layout engine in streaming.js:
  - L0 domain nodes: golden-angle ring at radius 1400 — 20 domains
    cleanly spaced with no overlap
  - L1+ nodes: golden-angle orbit around their domain hub, radius scaled
    by depth (280 + depth×180) — gives visible structure at each LOD
  - Domain centroid fallback (step 3) tightened to ±10 jitter
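
The fallback layout itself lives in streaming.js; the arithmetic is language-independent, so here is a minimal Python sketch of it. The radii (1400, 280, 180) come from the commit message. The function names, and the choice to step each node by one golden angle per index, are assumptions made for illustration and may differ from the real code.

```python
import math

# Golden angle in radians (~2.39996, i.e. ~137.5 degrees).
GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))

DOMAIN_R = 1400      # ring radius for L0 domain nodes (from the commit message)
CHILD_R_BASE = 280   # base orbit radius for L1+ nodes
CHILD_R_STEP = 180   # extra orbit radius per depth level


def domain_position(index):
    """Hypothetical helper: place the index-th domain on the L0 ring."""
    theta = index * GOLDEN_ANGLE
    return (DOMAIN_R * math.cos(theta), DOMAIN_R * math.sin(theta))


def child_position(hub_xy, index, depth):
    """Hypothetical helper: place the index-th child on an orbit around its hub."""
    radius = CHILD_R_BASE + depth * CHILD_R_STEP
    theta = index * GOLDEN_ANGLE
    return (hub_xy[0] + radius * math.cos(theta),
            hub_xy[1] + radius * math.sin(theta))
```

Stepping by the golden angle keeps successive nodes far apart on the ring without knowing the final node count in advance, which suits a stream where nodes arrive in batches.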

Also in renderer.js:
  - DEPTH_SIZE [22,16,12,8,5,4,3] → [8,5,4,3,2,2,1.5]
  - _shortLabel: strips paths/prefixes/extensions, 28-char cap
  - labelRenderedSizeThreshold: 7 — only L0 domain nodes get persistent
    labels; everything else shows label on hover only
  - edgeType hardcoded to "line" (Sigma v3 program)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three root causes of the all-nodes-visible blob:

1. SSE batch labels ("L0 domains", "skeleton", "L5 memories", "L6 X symbols")
   didn't match _depthForKey which only knew "L0","L1",... Fixed:
   _depthForKey now handles prefix matching + kind-based fallback so SSE
   nodes get the correct depth and are hidden at deeper levels.

2. Infinite retry loop: failed phase loads retried every 2s forever because
   loadedPhases.add(key) was never called on permanent failure. Fixed:
   after one retry, key is added to loadedPhases regardless of outcome.
   SSE stream delivers the same data anyway.

3. Colors: matched to the unified graph palette — domain gold, file cyan,
   memory emerald, hook purple, agent pink, discussion red, symbol slate.
   Vivid colors make domain clusters visually distinct like the reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
igraph >= 0.10 requires seed to be a Layout object (matrix), not an
integer. Passing seed=0 raised 'matrix expected in seed', causing
/api/recompute_layout to return 503 on every call. Removing the
parameter lets igraph initialise randomly (same practical result for
a deterministic-enough FR layout at this scale).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- IDLE_TIMEOUT now reads CORTEX_IDLE_TIMEOUT env var (default 600s).
  Run with CORTEX_IDLE_TIMEOUT=3600 during dev to stop the server dying
  mid-session.
- layout_engine: remove seed=int from layout_fruchterman_reingold — igraph
  >= 0.10 requires seed to be a Layout matrix, not an integer. This was
  the root cause of /api/recompute_layout always returning 503.
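
A minimal sketch of reading the timeout from the environment, assuming the value is a whole number of seconds. The helper name and the fallback to the default on an unparsable value are assumptions, not taken from the server code.

```python
import os


def read_idle_timeout(environ=os.environ, default=600):
    """Hypothetical helper: CORTEX_IDLE_TIMEOUT in seconds, else the default."""
    raw = environ.get("CORTEX_IDLE_TIMEOUT")
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # Assumption: an invalid value falls back to the default instead of failing.
        return default
```

With this shape, `CORTEX_IDLE_TIMEOUT=3600` yields 3600 and an unset variable yields 600.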

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DOMAIN_R 1400→0.85, CHILD_R 280→0.18, STEP 180→0.12 — matches DrL's
  normalised [-1,1] world space so fallback positions are compatible with
  quadtree positions when they arrive
- fitCamera() added to renderer; called after initial L0+L1 load so the
  full domain ring is visible (ratio:1.8 = slightly zoomed out)
- IDLE_TIMEOUT reads CORTEX_IDLE_TIMEOUT env var (default 600s); start
  server with CORTEX_IDLE_TIMEOUT=7200 for long dev sessions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause from old vs new comparison:
  - Colors were already identical (not the issue)
  - Node sizes were too small (roughly 2-3.5x): [8,5,4,3,2,2,1.5] → [28,12,9,6,5,4,3]
  - Depth-based visibility showed only ~8 nodes at overview — the
    galaxy-cluster effect requires ALL loaded nodes visible simultaneously
  - Coordinate space too tight: DOMAIN_R 0.85→8, CHILD_R_BASE 0.18→2.5

Changes:
  1. All loaded nodes now visible (hidden:false always) — depth controls
     WHAT IS FETCHED, not what is rendered. Same model as old graph.
  2. Node sizes match old graph scale: L0 domains 28px, L1 12px, etc.
  3. Edge color: rgba(80,180,200,0.12) cyan like old graph (was #333333)
  4. fitCamera() uses animatedReset() to fit all nodes into view
  5. Random fallback position in [-2,2] range, matching new DOMAIN_R=8 scale

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lookup

Rendering:
  - Restore depth-based node visibility: LOD controls WHAT IS FETCHED,
    scroll reveals next depth level. Each depth looks rich (sizes fixed).
  - Node sizes remain large: [28,12,9,6,5,4,3] matching old graph.
  - Edge color: rgba(80,180,200,0.12) cyan, size 0.6.

Chain panel (was always "Not found"):
  Three cases based on workflow graph node id schema:
  - entity:<pgid> → PG entity by primary key (direct resolution)
  - domain:<slug> → top-15 entities in that domain seeding BFS aggregate
  - <bare name>   → strip prefix, try entity name lookup

  Clicking a domain node now returns a Mermaid DAG of that domain's most
  important code entities and their causal/call/impact relationships.

Infrastructure:
  - pg_store_entities: add get_top_entities_for_domain(slug, limit) —
    returns top entities by heat for a domain slug, enabling domain-level
    chain analysis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Skill, hook, command, agent, mcp, tool_hub, file, symbol nodes are not
PG entities — they live in the workflow graph cache. Clicking them
previously always returned "Not found".

New resolution order:
  1. entity:<pgid>    → PG entity by primary key (unchanged)
  2. domain:<slug>    → top-15 entities for that domain (unchanged)
  3. skill:/hook:/etc → _wfg_chain() BFS over in-memory workflow graph
  4. <bare name>      → entity name lookup, fallback to wfg_chain
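
The resolution order above can be sketched as a prefix dispatch. This is an illustration only: the strategy names are hypothetical, and the assumption that every workflow-graph kind uses a `<kind>:` id prefix goes beyond the `skill:`/`hook:` examples in the list.

```python
# Assumption: workflow-graph node ids are prefixed with their kind.
WFG_PREFIXES = ("skill:", "hook:", "command:", "agent:", "mcp:",
                "tool_hub:", "file:", "symbol:")


def resolution_strategy(node_id):
    """Return a hypothetical label for how a clicked node id would be resolved."""
    if node_id.startswith("entity:"):
        return "pg_entity_by_id"          # case 1: PG primary key
    if node_id.startswith("domain:"):
        return "domain_top_entities"      # case 2: top-15 entities for the domain
    if node_id.startswith(WFG_PREFIXES):
        return "workflow_graph_bfs"       # case 3: BFS over the in-memory graph
    return "name_lookup_then_wfg"         # case 4: bare name, then BFS fallback
```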

_wfg_chain() does bounded BFS over _graph_cache edges, renders
as Mermaid flowchart TD. Shows what tools a skill calls, what
files a command touches, what domain a hook belongs to, etc.
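
A minimal sketch of a bounded BFS that emits a Mermaid `flowchart TD`. It is not the real `_wfg_chain()`: the edge tuple shape, the `max_nodes` bound (the actual bound may be a depth limit), the outgoing-only traversal, and the function name are all assumptions.

```python
from collections import deque


def wfg_chain_sketch(edges, start, max_nodes=25):
    """edges: iterable of (source, target, kind) tuples (assumed shape)."""
    adjacency = {}
    for source, target, kind in edges:
        adjacency.setdefault(source, []).append((target, kind))

    seen = {start}
    order = [start]          # BFS discovery order, used for stable node ids
    kept = []                # edges whose endpoints are both in the result
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target, kind in adjacency.get(node, []):
            if target not in seen:
                if len(seen) >= max_nodes:
                    continue  # bound reached: drop the node and this edge
                seen.add(target)
                order.append(target)
                queue.append(target)
            kept.append((node, target, kind))

    ids = {name: f"n{i}" for i, name in enumerate(order)}
    lines = ["flowchart TD"]
    for name in order:
        # Labels are quoted; names containing a double quote would need escaping.
        lines.append(f'    {ids[name]}["{name}"]')
    for source, target, kind in kept:
        lines.append(f"    {ids[source]} -->|{kind}| {ids[target]}")
    return "\n".join(lines)
```

Bounding the traversal keeps the diagram small enough for the side panel even when a hub node has thousands of neighbours.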

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…es, metrics panel

Chain diagram:
  - SVG forced to width:100%/height:auto after mermaid.run() so it fills
    the panel instead of rendering as a tiny thumbnail
  - min-height:220px on the container so it never collapses

Labels:
  - labelRenderedSizeThreshold: 999 — no persistent labels on canvas
  - Labels appear only on hover (Sigma v3 default for hovered nodes)

Node sizes by depth (DEPTH_SIZE):
  [28,12,9,6,5,4,3] → [22,7,5,3,2,2,1.5]
  Domain hubs 22px, L1 setup 7px, files 3px, symbols 1.5px
  Creates visible size hierarchy across the depth levels

Intelligence metrics panel:
  - New #panel-metrics strip between header and tabs
  - Shows: kind (with color dot), domain, heat %, consolidation stage,
    depth level, weight — populated from node's graphology attributes
  - CSS: metric-chip badges with mono font

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The clinical sigma.js rebuild was not matching the old graph's visual
quality. Restored the original 734-line D3 v7 force renderer:
  - workflow_graph.js (b7a8f97): D3 force simulation, radial shells,
    galaxy cluster layout
  - renderer.js (b7a8f97): canvas renderer with selectNode
  - draw.js (b7a8f97): drawing primitives
  - polling.js (b7a8f97): full graph load + discussion batches

One addition to renderer.js: selectNode() now emits chain:open so the
chain panel (already in unified-viz.html) opens when clicking any node.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New lod.js module:
  - Polls /api/graph/progress until L0 is ready, then loads it
  - Zoom thresholds: k≥0.6→L1, k≥1.1→L2, k≥1.8→L3, k≥2.8→L4, k≥4.0→L5
  - Click-to-expand: clicking a domain→loads L1, tool_hub→L2, file→L3, etc.
  - L6 symbol phases capped at 50K nodes (skipped if larger)
  - Emits [lod] console logs for debugging

polling.js now LOD-aware:
  - No longer fetches /api/graph (916 MB blob)
  - Only polls /api/graph/progress for stats + status bar
  - Triggers JUG.emit('graph:zoom', {k:1}) on graph tab activate

workflow_graph_render_svg.js: emits graph:zoom on D3 zoom events
  so lod.js can auto-expand phases as the user zooms in

unified-viz.html:
  - ForceGraph proxy stub restored (renderer.js needs it)
  - draw.js + renderer.js + graph.js restored (register JUG.buildGraph etc.)
  - lod.js loaded after bridge
  - chain_panel.js + mermaid kept

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous approach loaded the entire L1 phase (all domains) on every
domain click, causing the force sim to run indefinitely.

New behaviour:
  - Boot: load L0 only (~20 domain nodes). Sim settles in seconds.
  - Click a domain node → fetch L1, inject ONLY that domain's children
    (filtered by domain_id/domain slug). Other domains' nodes stay hidden.
  - Click a tool_hub → inject only that tool's children from L2.
  - Phase data cached client-side: second click on same domain is instant.
  - Child nodes seeded at parent's (x,y) position so D3 places them locally.

No auto-loading on zoom or pan. Graph is interactive at all times.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause: _phase_payloads['L0'] was empty because server-streaming-
pipeline populates data via SSE, not the phase cache. Phase endpoint
always returned 0 nodes.

New lod.js uses /api/graph/events (SSE) directly:
  - Default gate: depth 0 (domains) + depth 1 (setup) only
  - Memories (depth 5) and symbols (depth 6) NEVER load by default
  - "Loading memories" progress banner hidden after 8s
  - Click a node: unlocks depth+1 for that node's domain only,
    reconnects SSE with Last-Event-ID so missed batches replay
  - No phase endpoint calls, no polling loops

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…allback

Filter dropdown now controls LOD depth (what to LOAD, not what to filter):
  Default L0: domains only. L1: + setup. L2: + tools. etc.
  Combined with domain select: "cortex + L2" loads only cortex's tools.

Server: get_phase_payload() falls back to extracting from _graph_cache
  when _phase_payloads is empty (streaming builder bypasses phase cache).
  L0 now returns domain nodes, L1 returns setup nodes, etc.

lod.js:
  - Boot loads L0 only (domain hubs, ~20 nodes). Sim settles instantly.
  - Filter change: loads cumulative phases up to selected depth.
  - Domain filter: nodes filtered to selected domain's slug.
  - Click-to-expand: still works — click domain → load its L1 children.
  - Build progress banner hidden after 6s (no memory loading noise).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
refreshBar() now hides the build-progress banner once L1 is ready or when
the server is building L5/L6 (memories/symbols) which the user never asked
for. Banner no longer interrupts the graph view after the structural layers
are loaded.

Inline phase loader defers L0/L1 to lod.js — prevents double-loading and
the resulting force-sim flicker when both loaders inject the same nodes.

workflow_graph_panel.js: use inert + blur focus before aria-hidden to fix
the aria-hidden-on-focused-element accessibility warning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause 1 — Inline script was second appendGraphDelta owner (SRP):
  applyPhase() body stripped entirely. Inline script now only updates
  the progress bar. lod.js is the sole phase loader.

Root cause 2 — L0-L6 values overloaded in workflow_graph_filters.js:
  filters.js change handler returns early on /^L[0-6]$/ values so it
  never calls wfgApplyFilter() for depth selections. Visual filter only
  handles kind:/file:/edge: values. lod.js owns depth loading.

Root cause 3 — "command" in _PHASE_KINDS["L1"] was Bash telemetry (5878 nodes):
  Removed "command" from L1, added to L2 alongside tool_hubs. L1 now
  returns ~190 nodes (skill + hook + agent + mcp) not 6068.

Root cause 4 — Depth reset didn't clear _existingIdSet/_existingEdgeSet:
  Added JUG.resetGraph() to graph.js that clears lastData AND both dedup
  sets. lod.js._onFilterChange calls JUG.resetGraph() so L1→L2 is a
  genuine rebuild, not an append on top of stale nodes.

Bonus — "global" excluded from domain dropdown:
  controls.js populator now filters to kind==="domain" && !isGlobal &&
  domain !== "global". Only real project domains appear.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
workflow_graph_filters.js:bindDomainSelect() was the canonical populator
and lacked the isGlobal filter. Added check: !n.isGlobal && id !== 'domain:__global__'.
The 'global' pseudo-domain is a catch-all bucket, not a project domain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ntation

Root cause (Dijkstra): domain:__global__ is a load-bearing layout anchor
that must reach the client. The correct fix is not to filter it out of the
data pipeline (that breaks orphan-node anchoring, validate_graph, and the
global filter view). The correct fix is to emit the authoritative predicate
at the single serialization funnel.

_node_to_dict (workflow_graph.py) is the sole serializer for every endpoint
(/api/graph, /phase, .bin, .zera, SSE). It now sets:
  selectableDomain: True   for real project domain nodes
  selectableDomain: False  for the global sentinel

Both dropdown populators (controls.js, workflow_graph_filters.js) now use
  if (n.selectableDomain)
instead of re-deriving !isGlobal && domain !== 'global' && id !== '...'.
The predicate is defined once, consumed everywhere, never re-derived.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The predicate was True for all non-global domains, including indexed
filesystem paths like /users/cdeust/developments/d4lordofhatred-source.
These are not project domains — they are directories that accidentally
got scanned as projects.

Rule: a domain is selectable iff its label contains no '/', '\', or '('
(the last catches build-artifact subdirs like 'data (1)/02_skill_tree').
Determined once at _node_to_dict, the single serialization funnel.

Result: cortex/agentic-ai/prd-spec-generator/etc → True
        global sentinel, /users/x/... paths → False
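
The rule can be sketched as a small predicate. The function name and signature are hypothetical; only the three excluded characters and the `domain:__global__` id come from the commit messages.

```python
GLOBAL_DOMAIN_ID = "domain:__global__"


def is_selectable_domain(node_id, label):
    """Hypothetical predicate: True only for real project domains."""
    if node_id == GLOBAL_DOMAIN_ID:
        return False
    # Filesystem paths contain '/' or '\'; build-artifact dirs contain '('.
    return not any(ch in label for ch in ("/", "\\", "("))
```

Computing this once in the serializer means the dropdown populators only read a boolean and never re-derive the rule.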

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause 1 — Reset loop (Graph: 0 nodes every cycle):
  _onFilterChange called resetGraph() which built an empty scene and emitted
  state:lastData with 0 nodes, wiping the domain dropdown. Also clearing
  all of _loaded let the boot poller re-fire. Fix: never resetGraph() in the
  filter path; clear only the specific phase cache keys being reloaded.
  appendGraphDelta dedup makes re-appending already-present nodes a no-op.

Root cause 2 — 484K edges on L0:
  get_phase_payload fallback filtered edges with OR (source OR target in
  node_ids). Every node has an in_domain edge pointing TO a domain hub,
  so all edges in the entire graph matched. Fix: AND — only edges internal
  to the phase node set.

Root cause 3 — "L1 of cortex shows all domains":
  Domain filter had `|| n.kind === 'domain'` which kept ALL 20 domain hubs.
  Fix: strict _belongsToDomain() for L1+. For L0 only, keep all domain
  hubs (layout foundation). For L1+, scope to selected domain only.
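
To illustrate Root cause 2, here is a minimal sketch of the two edge filters. The edge dict keys (`source`, `target`) and the function name are assumptions about the payload shape.

```python
def phase_edges(edges, node_ids, internal_only=True):
    """Filter edges for a phase; internal_only=True is the AND behaviour."""
    node_ids = set(node_ids)
    if internal_only:
        # AND: both endpoints must belong to the phase node set.
        return [e for e in edges
                if e["source"] in node_ids and e["target"] in node_ids]
    # OR: any edge touching the phase matches, which drags in the whole graph
    # when every node has an in_domain edge pointing at a domain hub.
    return [e for e in edges
            if e["source"] in node_ids or e["target"] in node_ids]
```

For an L0 phase containing only domain hubs, the OR variant returns every `in_domain` edge in the graph, while the AND variant returns only hub-to-hub edges.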

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
node_count>0 fires on the skeleton (29 nodes, only domain:__global__).
Wait for the explicit server readiness signal phases.L0===true, which
means all domain nodes are in the cache. Also retry if only 1 domain
loaded (warming cache), with a 3s backoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cdeust and others added 23 commits May 31, 2026 14:22
Two fixes:
1. Boot now fetches /api/graph?batch_size=1 (not just /api/graph/progress).
   Only the graph endpoint reliably triggers _kick_background_build() on the
   server. Without this, the build stays at the 29-node skeleton indefinitely.

2. Before committing to L0 load, checks actual node count from the phase
   endpoint. If node_total <= 1 (only global sentinel), keeps waiting.
   This prevents loading L0 too early when only the skeleton is ready.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instant L0: domain nodes cached in localStorage (24h TTL). First load
waits for server build; every subsequent load shows domains in <10ms
from cache while server refreshes in background.

L5 pagination: memories phase is 838MB — loads in 4000-node chunks via
offset/limit to avoid V8's 512MB JSON.parse limit.

L6 dynamic discovery: server uses L6:cortex, L6:agentic-ai etc. Boot
discovers actual L6 phase keys from /api/graph/progress phases dict.

Legend: _updateLegend() counts actual rendered nodes (domain/memory/
entity/discussion) from JUG.state.lastData instead of server stats
that are None during build.

Skeleton/global filter: _filterNodes() strips isGlobal nodes and
domain:__global__ from every phase before injecting into the graph.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The `phaseKey !== 'L0'` exception was wrong — it kept all 20 domain
hubs visible even when a specific domain was selected. Selecting
'agentic-ai' should show ONLY agentic-ai's domain hub + its children.

Fixed: domain filter now applies to ALL phases including L0.

Domain change now calls JUG.resetGraph() before loading — necessary
to remove the previous domain's nodes from the scene. This is safe
because the domain select does not listen to state:lastData, so the
reset→rebuild→emit cycle cannot re-trigger the listener.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Boot now:
1. Checks localStorage cache immediately (sub-10ms display)
2. Tries /api/graph/phase?name=L0 directly at 300ms (warm server = instant)
3. Falls back to retry every 2.5s if server is cold-building

Domain dropdown populated directly from L0 nodes in _populateDomainDropdown()
— no waiting for state:lastData event. Shows all selectableDomain=true
projects immediately when L0 loads.

Domain filter now applied to L0 too (phaseKey !== 'L0' exception removed).
agentic-ai + L4 = ONLY agentic-ai nodes at every depth level.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
L5 memories is ~838 MB of JSON — exceeds V8's 512 MB JSON.parse limit.
The server now accepts ?offset=N&limit=M on /api/graph/phase so the
client can load any phase in chunks.

get_phase_payload(key, offset, limit) slices node/edge arrays and
returns node_total/edge_total (full counts) + done (bool) so the
client knows when it has consumed the whole phase.

serve_graph_phase parses offset= and limit= from the query string.

lod.js was already calling the paginated endpoint (_loadPhasePaged).
This commit makes the server side of that contract actually work.
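
A minimal sketch of the offset/limit contract, under stated assumptions: the real `get_phase_payload(key, offset, limit)` looks the phase up by key, whereas this version takes the arrays directly; applying the same window to nodes and edges and the exact `done` rule are guesses at the behaviour described above.

```python
def get_phase_payload_sketch(nodes, edges, offset=0, limit=None):
    """Slice one page of a phase and report full totals plus a done flag."""
    end = None if limit is None else offset + limit
    # Assumption: done means the window has passed the end of both arrays.
    done = end is None or (end >= len(nodes) and end >= len(edges))
    return {
        "nodes": nodes[offset:end],
        "edges": edges[offset:end],
        "node_total": len(nodes),
        "edge_total": len(edges),
        "done": done,
    }


def load_phase_paged(nodes, edges, limit):
    """Hypothetical client loop: request pages until the server reports done."""
    offset, got_nodes, got_edges = 0, [], []
    while True:
        page = get_phase_payload_sketch(nodes, edges, offset, limit)
        got_nodes.extend(page["nodes"])
        got_edges.extend(page["edges"])
        if page["done"]:
            return got_nodes, got_edges
        offset += limit
```

Returning the full totals on every page lets the client show progress without a separate count request.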

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
resetGraph() emits state:lastData with 0 nodes. Both dropdown
populators were rebuilding the list to just 'All Domains' on that
event, losing the user's selection.

Fix: guard against that empty event in all three
populators (workflow_graph_filters.js, controls.js, lod.js).
Also restore the current selection correctly after any repopulation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s display

Store selected depth and domain in module-level JS variables.
All reads use _currentDepth()/_currentDomain() from variables.
DOM selects are output-only — synced FROM variables via _syncDomToState().

After resetGraph() wipes DOM, _syncDomToState() immediately re-asserts
the variable values, so the domain dropdown never loses 'cortex' or
whatever the user selected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The graph build cache path for L5 requires the build to complete the
memories phase first (~5+ min). /api/memories reads directly from
Postgres via keyset pagination: 5000 records in ~60ms locally.

_loadMemoriesFast() pages through /api/memories until next_cursor is
null. Memory records are mapped to graph node format (_memToNode) with
consolidation-stage color coding and heat. ~130K cortex memories load
in roughly 1.5s from local Postgres.
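
The paging loop in `_loadMemoriesFast()` is JavaScript; the sketch below shows the same cursor loop in Python against an in-memory stand-in. The response key `items`, the cursor being the last id seen, and both function names are assumptions; only `next_cursor` and the 5000-record page size come from the commit message.

```python
def load_all_memories(fetch_page, limit=5000):
    """Follow next_cursor until it is None; return records and round-trip count."""
    records, cursor, round_trips = [], None, 0
    while True:
        page = fetch_page(cursor=cursor, limit=limit)
        round_trips += 1
        records.extend(page["items"])
        cursor = page["next_cursor"]
        if cursor is None:
            return records, round_trips


def make_fake_endpoint(total):
    """In-memory stand-in for /api/memories, keyed on the last id returned."""
    def fetch_page(cursor, limit):
        start = 0 if cursor is None else cursor + 1
        ids = list(range(start, min(start + limit, total)))
        more = bool(ids) and ids[-1] < total - 1
        return {"items": ids, "next_cursor": ids[-1] if more else None}
    return fetch_page
```

Keyset paging resumes from the last id instead of skipping `offset` rows, so each page costs about the same no matter how deep into the table it is. The number of round trips is the record count divided by the page size, rounded up.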

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The LOD system loads L5 memories via /api/memories in 5000-record
chunks. The old 200-record cap meant 650+ round-trips for 130K cortex
memories instead of ~26. Local Postgres handles 5000 rows trivially.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nnerHTML rebuild

Root cause (found via Playwright): populateDomainDropdown() sets
innerHTML='<option>All Domains</option>' which temporarily resets the
select value to '' — some browsers fire a 'change' event at that point.
That event reached lod.js's domain-change listener which called
resetGraph() and loadUpTo() with '' (all domains) — the reset loop.
Then setting sel.value='agentic-ai' fired ANOTHER change event.

Fix:
- _suppressChange = true before innerHTML rebuild
- Both change listeners guard with `if (_suppressChange) return`
- setTimeout(fn, 0) lifts the guard after queued events drain

Also: _attachControls() now initialises _selectedDepth/_selectedDomain
FROM the current DOM values at bind time, so browser-restored selections
(bfcache/autocomplete) are honoured rather than fought.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-streaming

The LOD/clinical rebuild degraded the graph. Restored the exact working
state that matched the reference screenshot:

- renderer.js, draw.js, graph.js: original force-graph canvas renderer
- workflow_graph.js: D3 v7 force layout with SSE streaming
- graph_event_stream.js: live SSE subscriber for /api/graph/events
- graph_snapshot.js: CXGB binary snapshot fast-load (~110ms)
- polling.js: stats-only via /api/graph/progress (no 916MB payload)

Additions:
- force-graph vendored at ui/unified/vendor/force-graph.min.js (no CDN)
- mermaid defer'd (was blocking domcontentloaded)
- chain_panel.js kept for chain-of-call panel on node click
- renderer.js wires chain:open on selectNode

Fixes:
- Double version string bug in cache-busting regex (?v=old?v=new)
- Redirect loop script removed (was blocking page load)

This gives: all 28K+ nodes loaded via SSE, beautiful D3 galaxy clusters,
correct legend, chain panel on click, L6 symbols streaming in ~200ms
from local Postgres.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_renderFromCache was falling back to fetch('/api/graph') when the CXGB
snapshot was unavailable. That injects all 484K nodes simultaneously
into the force simulation, crashing the browser.

Removed. phase_loader.js handles progressive phase-by-phase population
which is what gave the beautiful reference screenshot (28K nodes, each
phase settling before the next lands).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…loader.js

Both the inline poll() script and phase_loader.js were calling
applyPhase/appendGraphDelta for every phase. Every phase was loaded
twice, doubling memory and simulation work, crashing at 45K nodes.

Inline poll() now only drives the progress bar and CXGB snapshot.
phase_loader.js is the sole owner of phase-by-phase graph population.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ing galaxy

memory_entity_edges is the KG entity relationship phase. It floods the
canvas with 110K+ green entity nodes obscuring the structural galaxy.
Skip it in phase_loader.js (same as L5 memories).

Also suppress the build-progress banner for memory_entity/loading memory
phases since the user hasn't asked for them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Architecture:
  CXGB snapshot (~25MB) → Rust reads from disk → IPC → force-graph D3

No HTTP server in the rendering critical path. Rust decodes the CXGB
binary format (same as browser graph_snapshot.js) in <100ms, sends
nodes/edges to the WebView via Tauri IPC, force-graph renders.

Files:
  app-tauri/src-tauri/src/main.rs  — Rust CXGB decoder + Tauri IPC
  app-tauri/src-tauri/Cargo.toml   — deps: tauri, serde, dirs
  app-tauri/frontend/index.html    — force-graph renderer via IPC
  scripts/dump_snapshot.py         — one-shot: server → 25MB CXGB file

To build the snapshot:
  uv run python3 scripts/dump_snapshot.py   (run once; needs HTTP server)

To run the native app:
  cd app-tauri && cargo tauri dev
  OR: cargo tauri build (for .app bundle)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- withGlobalTauri: true enables window.__TAURI__.core.invoke
- New frontend: native Cortex dark theme (no JUG dependency),
  force-graph direct, inline canvas node renderer with kind colors
- Polls for __TAURI__ to be available before invoking
- Loading ring, stats sidebar, fade-out status match Cortex UX

Result: 124K nodes decoded from 24MB CXGB file by Rust in <100ms,
no HTTP server in the critical path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…/y=0

Three bugs in the Rust CXGB decoder:
1. HEADER_SIZE was 64, actual is 32 (4s+H+H+I+I+Q+I+I).
   This shifted every node read by 32 bytes → only 124K of 484K decoded.
2. Node kind byte map was wrong (Rust vs Python _NODE_KIND_MAP).
   domain=0,tool_hub=1,file=2,symbol=3,skill=4,hook=5,command=6,agent=7,
   mcp=8,discussion=9,memory=10,entity=11.
3. x/y serialized as 0.0 when uncomputed → all nodes at origin (one blob).
   Fixed with #[serde(skip_serializing_if = "is_zero")] so force-graph
   distributes them via its own simulation.

Result: 484,303 nodes loaded with correct colors and layout.
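
The header size can be checked from the Python side with the `struct` module. In this sketch the little-endian byte order (`<`) and the use of `b"CXGB"` as the 4-byte magic are assumptions; the field layout and the kind map are copied from the commit message, with the map taken here to be the corrected one.

```python
import struct

# 4s + H + H + I + I + Q + I + I, as listed in the commit message.
# Assumption: little-endian, no padding.
HEADER_FORMAT = "<4sHHIIQII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4+2+2+4+4+8+4+4

NODE_KIND_MAP = {
    0: "domain", 1: "tool_hub", 2: "file", 3: "symbol",
    4: "skill", 5: "hook", 6: "command", 7: "agent",
    8: "mcp", 9: "discussion", 10: "memory", 11: "entity",
}


def read_header(blob):
    """Unpack the fixed-size header; field meanings are not named here."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("blob shorter than CXGB header")
    return struct.unpack_from(HEADER_FORMAT, blob, 0)
```

Deriving the size from the format string, rather than hardcoding it, is one way to keep a decoder from drifting away from the writer's layout.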

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the toy frontend with the complete unified-viz.html (full Cortex UI:
tabs, legend, detail panel, workflow_graph D3 renderer, draw.js, etc.).

Only the data loading changes:
- _renderFromCache() now calls Rust invoke('load_graph') instead of
  fetching /api/graph.bin over HTTP
- phase_loader.js / polling.js / SSE removed (data comes from Rust)
- A small inline IPC loader triggers on DOMContentLoaded

Result: same galaxy-cluster visualization as the browser version, but
with Rust reading the 24MB CXGB snapshot directly from disk at startup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rust writes ~/.cache/cortex/graph-data.js (sets window.__CORTEX_GRAPH__)
before the WebView opens. Frontend loads it as a script tag via
Tauri's asset protocol. Completely bypasses IPC complexity.

Result: 484,303 nodes in 103MB JS file loaded on app start.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tauri app spawns the existing Cortex HTTP server (uv run python3
http_standalone.py) on startup, waits up to 30s for it to be ready,
then the WebView loads http://127.0.0.1:3458.

All tabs work (GRAPH, KNOWLEDGE, WIKI, BOARD, PIPELINE).
Same visualization as the browser. Wrapped as a native .app.
Server has CORTEX_IDLE_TIMEOUT=86400 so it never stops.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>