diff --git a/docs/planning/ralph-roadmap.html b/docs/planning/ralph-roadmap.html
index bbb22fde..853006b5 100644
--- a/docs/planning/ralph-roadmap.html
+++ b/docs/planning/ralph-roadmap.html
@@ -32,58 +32,63 @@

Breenix Roadmap - Backlog

-

Canonical orchestrator backlog (maintained by Claude). Last updated 2026-05-30 06:35. Shipped + in flight + queued. Per-turn execution detail lives in the Ralph inbox.md; version-tracked copy in-repo at docs/planning/ralph-roadmap.html.

+

Canonical orchestrator backlog (maintained by Claude). Last updated 2026-06-01. ARM64 / Parallels focus. Refreshed at merge milestones (not per-turn - per-turn churn caused the prior revert). Per-turn execution detail lives in the Ralph inbox.md; version-tracked copy in-repo at docs/planning/ralph-roadmap.html.

-
7
Shipped
+
12
Shipped (session)
1
In progress
-
6
Queued
-
1
Parked
+
3
Queued
+
8
Signoff-gated

🚧 In progress

-

E5-1 - execv child returns to kernel RIP under Ring 3 attempt reverted · unproven x86_64

-

Blocked-in-syscall thread mid-execv is timer-preempted; its saved kernel frame is later restored under Ring-3 selectors → err 0x15 instruction-fetch fault at the saved kernel RIP.

-

Turn 28 (2026-05-30): selector-only privilege-conditional restore built clean but could not be proven (GDB timed out pre-userspace; boot-stages stalled at ARP stage-22). A broader candidate (kernel-frame restore + CR3 switch) reached userspace and cleared the original 0x15 signature, but panicked on a new fault (0x100003006a8, PML4[2] process-manager data). All code reverted - no fix retained, nothing merged. New understanding: not a selector swap - a process/thread context-sync bug (how process.main_thread vs scheduler Thread diverge/reuse across blocked-in-syscall wake/restore). Proof also gated on NB-2 (ARP stage-22).

-

operator-approved Tier-2 edit · beads breenix-ql2

+

#404 residual post-spawn crash - root-cause re-verify failed ARM64 · Parallels

+

The user-stack frame-aliasing lockup fix (PR #404) reduces but does not eliminate the fault: a clean re-verify on current main faulted 1 of 2 stress boots with a post-spawn UNHANDLED_EC → PANIC on the freshly spawned child (PID 5) after exec.

+

Key finding: the prior "assertion-fired 3/3, 0 crashes" proof was contaminated by an SMP-serial byte-interleaving blind spot - naive FATAL/PANIC line-grep reads 0; you must de-interleave the two CPU streams to see [FATAL] bug=UNHANDLED_EC. The fault sits near the stack/spawn path → may overlap the #406 kstack-reuse area. PR #404 held, not merged. Next: de-interleave + root-cause; a gold-master file may be required → operator signoff first. beads breenix-oia family adjacent.
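The line-grep blind spot described above can be illustrated with a gap-tolerant search. This is a minimal sketch, not the procedure used for the re-verify (the roadmap does not describe it): `contains_interleaved` and `max_gap` are hypothetical names, and any log bytes fed to it here are invented for illustration. It reports a marker as present when its bytes appear in order with at most `max_gap` foreign bytes between two consecutive marker bytes, which is the shape a second CPU's serial output leaves when it lands mid-line.

```rust
/// Returns true if `needle` occurs in `haystack` as an ordered subsequence in
/// which at most `max_gap` foreign bytes separate two consecutive needle bytes.
/// Hypothetical helper: a heuristic search, not a true per-CPU de-interleaver.
fn contains_interleaved(haystack: &[u8], needle: &[u8], max_gap: usize) -> bool {
    if needle.is_empty() {
        return true;
    }
    // reach[i] is true when the needle prefix matched so far can end at haystack[i].
    let mut reach: Vec<bool> = haystack.iter().map(|&b| b == needle[0]).collect();
    for &want in &needle[1..] {
        let mut next = vec![false; haystack.len()];
        // Most recent index before `i` where the previous needle byte could end.
        let mut last: Option<usize> = None;
        for i in 0..haystack.len() {
            if let Some(k) = last {
                // `k < i` always holds here, so the subtraction cannot underflow.
                if haystack[i] == want && i - k - 1 <= max_gap {
                    next[i] = true;
                }
            }
            if reach[i] {
                last = Some(i);
            }
        }
        reach = next;
    }
    reach.iter().any(|&r| r)
}
```

With `max_gap = 0` this degenerates to a plain substring search, so the same function can show both the naive miss and the tolerant hit on one capture. A large `max_gap` can produce false positives; real de-interleaving would need per-CPU framing in the serial stream.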

-

📋 Queued backlog

+

📋 Queued - non-gold-master (ARM64)

-
-

NB-1 - CPU%-accounting bug + real 2-core burn P0 ARM64 · Parallels

-

Investigated 2026-05-30 (code hypothesis): heartbeat's sleep path is correct (nanosleep genuinely blocks - not a busy-loop). The 189.9% is an accounting artifact.

-

NB-1a (confirmed by code): btop's % = per-process ticks (summed across cores/threads per-PID) ÷ global_ticks, but global_ticks increments only on CPU0 → blows past 100%. Sites: btop.rs:511-518, timer_interrupt.rs:734-737, procfs/mod.rs:998-1003. Fix: core-aware denominator + single-snapshot sampling. NB-1b (runtime-confirm): which process truly pins CPU0+CPU2 - GDB PC-sample; SMP double-scheduling ruled out (guards + gold-master CPU0 alarm intact).
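The arithmetic behind NB-1a can be sketched as follows. This is an illustration under stated assumptions, not the code at the listed sites: both function names are hypothetical, and "core-aware denominator" is read here as CPU0's tick count multiplied by the number of online cores, which is only one plausible interpretation. The core count passed in is likewise an assumption of the caller.

```rust
/// Naive percentage: per-process ticks (summed over every core the process ran
/// on) divided by a tick counter that only CPU0 advances. Hypothetical name.
fn cpu_percent_naive(process_ticks: u64, cpu0_ticks: u64) -> f64 {
    if cpu0_ticks == 0 {
        return 0.0;
    }
    100.0 * process_ticks as f64 / cpu0_ticks as f64
}

/// Core-aware percentage: the denominator is scaled by the number of online
/// cores so it represents total machine capacity over the sample window.
/// Assumes every core ticks at the same rate as CPU0. Hypothetical name.
fn cpu_percent_core_aware(process_ticks: u64, cpu0_ticks: u64, online_cores: u64) -> f64 {
    let capacity = cpu0_ticks.saturating_mul(online_cores);
    if capacity == 0 {
        return 0.0;
    }
    100.0 * process_ticks as f64 / capacity as f64
}
```

For example, 1899 process ticks against 1000 CPU0 ticks gives 189.9% with the naive formula; with an assumed four online cores the core-aware formula gives 47.475%, which cannot exceed 100% as long as the per-process total is bounded by machine capacity. Both inputs should come from one snapshot so numerator and denominator cover the same window.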

-
-
-

NB-2 - ARP stage-22 boot-stages stall P0 x86_64

-

Boot-stages hangs at [22/252] ARP reply received - net RX reply not delivered on QEMU x86_64. Blocks the E5-1 boot-stages proof; likely a net RX interrupt/descriptor issue.

-
-

NB-3 - ls hangs in bsh P1 ARM64

ls still hangs in the bsh shell.

-

NB-4 - Window clipping on overlap P1 ARM64 · BWM

Overlapping windows clip in the BWM / VirGL compositor occlusion path.

-

NB-5 - Cornflower-blue ~30s boot hang P1 ARM64 · Parallels

Screen stays cornflower blue ~30s before first composited frame on Parallels boot. First-paint / boot-latency investigation.

-

NB-6 - virtio-blk self-test fails P2 ARM64 · Parallels

VirtIO block test failed: Block device not initialized; falls back to AHCI for ext2 root.

+

SOFT_LOCKUP_VIRGL Parallels failure class P1 ARM64 · Parallels

Investigate + classify the SOFT_LOCKUP_VIRGL failure class on Parallels. beads breenix-ha9.

+

F15 - ARM64 AHCI timeout corridor after GICR discovery P1 ARM64

Remaining AHCI timeout corridor after GICR discovery; verify whether the fix is storage-driver-only (non-gold-master) or touches gic.rs → signoff. beads breenix-xk8.

+

bsshd SSH exit-status / close for exec P2 ARM64

bsshd should send SSH exit-status + channel close for exec requests. beads breenix-72x.

+
+ +

🔒 Signoff-gated - gold-master / CPU0 cluster (awaiting operator)

+
+

CPU0 timer-death + scheduler cluster signoff ARM64 · Parallels

CPU0 vtimer death on Parallels + remote-wake / resched scheduling. Fixes live in frozen gold-master timer_interrupt.rs / gic.rs / context_switch.rs → need operator signoff (this cluster burned ~a week before; investigate-only without go). beads oia, 9f1, cb7, 6f4, e43, k16, eh4.

+

BusyBox applet DATA_ABORT - ARM64 musl TLS signoff ARM64

BusyBox applet faults on the ARM64 musl TLS path (errno read from a zero TPIDR_EL0). User-facing symptom already fixed (native bls as /bin/ls); the principled TLS fix touches gold-master context-switch → signoff. beads breenix-b7u.

+
+ +

⏸ Deprioritized - x86_64 (future)

+
+

E5-1 - execv child returns to kernel RIP under Ring 3 x86 future x86_64

Process/thread context-sync bug on blocked-in-syscall wake/restore mid-execv. ARM64-focus → deprioritized. beads breenix-ql2.

+

NB-2 - ARP stage-22 boot-stages stall x86 future x86_64

x86 boot-stages hang at [22/252] ARP reply received; ARM64-focus → deprioritized (x86 boot-stages is not used as a gate).

-

⏸ Parked - awaiting operator

+

✅ Shipped - merged to main (this session)

-

CPU0 timer death parked ARM64 · Parallels

CPU0 vtimer stops ~6 ticks into boot on Parallels; registers verified correct; diagnostics built. Separate Ralph; awaiting operator decision since 2026-05-26.

+

oi6 - ARM64 multi-window virtio-gpu marshalling merged

The real "window clipping". PR #382 · re-verified clean on a fresh multi-window boot (0 VIRTGPU_FAIL).

+

244 - ARM64 kstack reuse / bsshd inbound recv EIO merged

PR #406 · bitmap kstack reuse + Drop-free (leaked fork-child stacks exhausted the pool at ~90 conns); 50/50 inbound SSH with real host-key verify.
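The idea of reusing kernel stacks through a bitmap can be sketched as below. This is not the code from PR #406: `KstackPool`, its 64-slot size, and its methods are hypothetical, and the sketch only tracks slot indices rather than real stack memory. It shows why returning a slot on teardown keeps the pool from draining, where leaking one slot per connection would exhaust it.

```rust
/// Hypothetical fixed-size pool of kernel-stack slots tracked by one bitmap.
/// Bit `i` set means slot `i` is in use.
struct KstackPool {
    bitmap: u64,
}

impl KstackPool {
    /// Number of slots; 64 is an arbitrary choice that fits one `u64`.
    const SLOTS: u32 = 64;

    fn new() -> Self {
        Self { bitmap: 0 }
    }

    /// Claims the lowest free slot, or returns `None` when the pool is full.
    fn alloc(&mut self) -> Option<u32> {
        // Index of the lowest clear bit; equals 64 when every bit is set.
        let free = (!self.bitmap).trailing_zeros();
        if free >= Self::SLOTS {
            return None;
        }
        self.bitmap |= 1u64 << free;
        Some(free)
    }

    /// Returns a slot to the pool so a later `alloc` can reuse it.
    fn free(&mut self, slot: u32) {
        assert!(slot < Self::SLOTS, "slot out of range");
        assert!(self.bitmap & (1u64 << slot) != 0, "slot was not allocated");
        self.bitmap &= !(1u64 << slot);
    }

    /// Number of slots currently in use.
    fn in_use(&self) -> u32 {
        self.bitmap.count_ones()
    }
}
```

With `free` called on every teardown, any number of sequential alloc/free cycles leaves `in_use()` at zero; without it, the 65th `alloc` in this sketch would return `None`.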

+

4yu - ARM64 AHCI uninterruptible exec-reads merged

PR #405 · AHCI slot-0 waits made uninterruptible (SIGCHLD→EINTR was abandoning in-flight DMA); F19 serialization workaround relaxed.

+

45i - ARM64 CLONE_VM → exec use-after-free merged

PR #407 · exec returns EAGAIN while a live CLONE_VM sibling still holds the old CR3 (scoped stopgap; full sibling-teardown later).

+

NB-1 - CPU%-accounting fix merged

PR #396 · core-aware denominator + single-snapshot sampling (the 189% was an accounting artifact, not a real burn).

+

bssh client + host-auth suite merged

client channel #397, publickey #398, known-hosts verify #399, host-auth #403.

+

net RX stall + ARP pending queue merged

#400 net-rx-stall fix, #401 ARP pending queue.

+

crash-trace instrumentation merged

PR #402 · trace-ring crash diagnostics.

-

✅ Shipped (merged to main)

+

✅ Resolved without a fix (closed with evidence)

-

E5-2 - net RX MSI-X completion merged

PR #365

-

B-3 - interrupt/IO fix merged

PR #366

-

B-4 - interrupt/IO fix merged

PR #368

-

#367 - defensive ELF PT_LOAD perm merge merged

PR #367

-

E3 - runtime-polling gate closed

Polling-elimination gate satisfied.

-

Roadmap docs consolidation merged

#370, #371, #372

-

B-1 / B-2 - ARM64 DATA_ABORT non-repro

Not reproducible across 3 healthy boots; opportunistic capture only.

+

0wf - BWM spawn-wedge dismissed

6/6 fresh boots reached bwm create_process_with_argv ENTRY + full ELF load - harness/classification noise, not a real wedge. beads closed.

+

c5d - ARM64 BWM GPU compositor for animated windows already fixed

Live counters: SUBMIT_3D climbs ~154/s under Bounce while CPU full-frame composite stays flat - already GPU-composited by merged #381. beads closed.

+

NB-3 - ls hang in bsh fixed

Native bls installed as /bin/ls (the BusyBox applet musl-TLS fault is tracked separately as b7u).

+

NB-4 - window clipping = oi6

Resolved as the oi6 multi-window virtio-gpu fix (#382).

-

Legend: P0 urgent · P1 · P2 · arch. NB-* = surfaced by operator 2026-05-30.

+

Legend: P0 · P1 · P2 · signoff = gold-master / operator-gated · arch. Gold-master frozen files (edits require operator signoff): context_switch.rs, timer_interrupt.rs, gic.rs.