RTL8814AU: drop REG_CR=0 post-fwdl write that wedges bulk-OUT #49
Merged
FirmwareDownload_8814A's post-fwdl CPU kick zeroes REG_CR (0x0100) just after MCUFWDL=0x79. This clears all 8 enable bits in byte 0 (HCI TX/RX DMA, TXDMA, RXDMA, PROTOCOL, SCHEDULE, MACTXEN, MACRXEN). The later `REG_CR |= MACTXEN|MACRXEN` at HalModule.cpp:241 only re-sets bits 6+7, leaving the DMA-enable bits 0..5 at zero, so the chip's TX/RX DMA engines never come up. bulk-OUT URBs queue at EP 0x02 but the FIFO never drains; URBs sit until libusb's 500 ms async timeout cancels them (-ENOENT), producing the catastrophic submit-failure pattern reported in #36.

Kernel rtw88_8814au never writes REG_CR=0 during post-fwdl. The "byte-for-byte rtw88-mirror" comment block above this code was wrong about this specific address.

Bisected today by gating the 7 divergent post-fwdl writes individually behind env vars; only 0x0100 reproduces the wedge.

Verification:
- Local devourer-TX 12 s on 8814AU: 2203/2203 OK (was 0.4% completion)
- 8812AU + 8821AU sanity: unchanged (different code path)
- tests/regress.py --full-matrix: 8814 devourer-TX cells [2,4,6,8] now show 0 fail annotation (was 4700+ failures each)

The fix is sufficient for #36 but does not restore 8814AU on-air emission: chips ACK URBs cleanly but no frames hit air. That is a separate gate (TX descriptor or rate config) and out of scope here.

Closes #36.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
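The bit arithmetic behind the wedge can be checked in isolation. The following is a minimal sketch, not code from HalModule.cpp: the constant and function names are illustrative, and `kAssumedInitValue` is a hypothetical stand-in for whatever earlier init leaves in bits 0..5. It only models byte 0 of REG_CR as described above.

```cpp
#include <cassert>
#include <cstdint>

// Byte 0 of REG_CR (0x0100) as described in this PR: bits 0..5 are the
// DMA/protocol/schedule enables, bits 6 and 7 are MACTXEN and MACRXEN.
// Names are illustrative, not taken from the driver headers.
constexpr std::uint8_t kLowEnableMask = 0x3F;
constexpr std::uint8_t kMacTxEn = 0x40;
constexpr std::uint8_t kMacRxEn = 0x80;

// Models the later `REG_CR |= MACTXEN|MACRXEN` step. An OR can set bits,
// but it cannot bring back bits that an earlier write cleared.
constexpr std::uint8_t enable_mac(std::uint8_t cr) {
    return static_cast<std::uint8_t>(cr | kMacTxEn | kMacRxEn);
}

// True only when every one of bits 0..5 is set.
constexpr bool low_enables_up(std::uint8_t cr) {
    return (cr & kLowEnableMask) == kLowEnableMask;
}

// Assumption: earlier init had already set bits 0..5 before the post-fwdl step.
constexpr std::uint8_t kAssumedInitValue = 0x3F;

// With the REG_CR=0 write: only bits 6+7 come back, the DMA enables stay zero.
static_assert(enable_mac(0x00) == 0xC0, "only MACTXEN|MACRXEN set");
static_assert(!low_enables_up(enable_mac(0x00)), "bits 0..5 remain cleared");

// Without the REG_CR=0 write: the previously set enables survive the OR.
static_assert(low_enables_up(enable_mac(kAssumedInitValue)), "bits 0..5 kept");
```

The `static_assert`s make the point at compile time: once byte 0 has been zeroed, the 2-bit OR yields 0xC0 and nothing in that step can restore bits 0..5.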
This was referenced May 26, 2026
josephnef added a commit that referenced this pull request May 26, 2026
In RtlJaguarDevice::send_packet the SET_TX_DESC_*_8812 macros are
bit-identical to the SET_TX_DESC_*_8814A macros (verified against
hal/rtl8814a_xmit.h), so devourer can keep using the 8812 macro set
on 8814A. But a usbmon byte-diff against a working VM-passthrough
88XXau monitor-injection session (qemu USB-host-passthrough → VM
kernel 88XXau → bulk-OUT URBs back through host xhci) shows three
field-value mismatches on 8814A:
  - Dword 0 bit 31: 8812 calls it OWN, 8814A calls it DISQSELSEQ.
    88XXau leaves bit 31 = 0 for monitor-injected frames; devourer's
    SET_TX_DESC_OWN_8812(..., 1) sets it to 1, which on 8814A means
    DISQSELSEQ=1 (disable queue-select-based sequence numbering).
  - Dword 2 bits 24-29 (GID): 88XXau leaves at 0 for injection;
    devourer writes 0x3F.
  - Dword 4 bits 18-23 (DATA_RETRY_LIMIT): 88XXau leaves at 0 for
    injection; devourer writes 12 (RETRY_LIMIT_ENABLE stays 1 in both).
Skip those writes on 8814A so the emitted descriptor byte-matches
aircrack-ng's reference monitor-injection format. Add a
DEVOURER_TX_LEGACY_8812_DESC=1 env-gate to restore the old behaviour
without rebuilding, in case anything downstream depends on it.
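As a rough illustration of the three skipped writes, here is a self-contained sketch. It is not the code in RtlJaguarDevice::send_packet: `set_field`, `make_desc`, and `legacy_desc_requested` are hypothetical helpers standing in for the SET_TX_DESC_*_8812 macros, the 40-byte descriptor size is an assumption, and only the bit positions, values, and the env var name come from this commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Assumed descriptor size; only the first five dwords matter for this sketch.
using TxDesc = std::array<std::uint8_t, 40>;

// Write `value` into bits [shift, shift+width) of little-endian dword `dword`.
inline void set_field(TxDesc& d, std::size_t dword, unsigned shift,
                      unsigned width, std::uint32_t value) {
    std::uint8_t* p = d.data() + dword * 4;
    std::uint32_t w = std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
                      (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
    const std::uint32_t mask =
        ((width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u)) << shift;
    w = (w & ~mask) | ((value << shift) & mask);
    for (int i = 0; i < 4; ++i) p[i] = static_cast<std::uint8_t>(w >> (8 * i));
}

// Reads the env gate named in this commit message.
inline bool legacy_desc_requested() {
    const char* v = std::getenv("DEVOURER_TX_LEGACY_8812_DESC");
    return v != nullptr && std::strcmp(v, "1") == 0;
}

// Builds a zeroed descriptor and applies only the three divergent fields.
inline TxDesc make_desc(bool is_8814a, bool legacy) {
    TxDesc d{};
    if (is_8814a && !legacy) return d;  // 8814A: leave all three fields at 0
    set_field(d, 0, 31, 1, 1);          // OWN on 8812, DISQSELSEQ on 8814A
    set_field(d, 2, 24, 6, 0x3F);       // GID
    set_field(d, 4, 18, 6, 12);         // DATA_RETRY_LIMIT
    return d;
}
```

A caller would use `make_desc(is_8814a, legacy_desc_requested())`. On 8812AU/8821AU (`is_8814a == false`) the three writes always fire, matching the "bit-for-bit identical" claim for those paths.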
This does NOT resolve #50 (8814AU on-air silence has a separate root
cause that vendor-control-write replay cannot reach — both sessions on
2026-05-26 ruled out 9 distinct hypotheses including a binary
URB-flag diff, see comment-4546974748). The change is purely about
descriptor correctness — aligning devourer's TX descriptor format
with the byte-level reference that the working kernel driver produces.
8812AU and 8821AU paths are bit-for-bit identical to current master
(is_8814a is false there and all writes fire as before). Smoke-tested
on the live bench:
  8812AU: 760 submits / 760 complete / 0 fail
  8814AU (new): 3572 submits / 3572 complete / 0 fail. Current master
      behaves identically at the libusb level, because devourer's
      descriptor differences from 88XXau are no-ops on the bulk-OUT
      path post-PR-#49.
  8814AU (DEVOURER_TX_LEGACY_8812_DESC=1): same as without env
Refs #50 (partial — descriptor alignment only, not the on-air gate).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `FirmwareDownload_8814A` was writing `REG_CR (0x0100) = 0` immediately after `MCUFWDL=0x79`. This clears all 8 enable bits in byte 0, including the DMA-enable bits (0..5).
- `REG_CR |= MACTXEN | MACRXEN` at `HalModule.cpp:241` is a 2-bit OR; it sets bits 6+7 but leaves bits 0..5 at zero. So the chip's TX/RX DMA engines never come up: bulk-OUT URBs queue at EP `0x02` but the FIFO has no drain path. URBs sit at the chip until libusb's 500 ms async timeout cancels them (`-ENOENT`), giving the catastrophic submit-failure pattern reported in RTL8814AU: devourer TX degrades to LIBUSB_ERROR_IO after USB passthrough cycles #36.
- `rtw88_8814au` never writes `REG_CR=0` during post-fwdl. The "byte-for-byte rtw88-mirror" comment block above this code is wrong on this specific address.
- Bisected by gating the 7 divergent post-fwdl writes (`0x010d`, `0x0100`, `0x1330`, `0x0230`, `0x022c`, `REG_BCN_CTRL`, `0x0210`) behind env vars; only `0x0100` reproduces the wedge.

Scope
- Closes #36 (`LIBUSB_TRANSFER_TIMED_OUT` submit failures on devourer-TX 8814AU after USB cycling).
- RTL8814AU devourer-RX in matrix is also still broken (cells 11/12/19/20/23/24 = 0 hits): pre-existing, unrelated.

Test plan
- `WiFiDriverTxDemo` 12 s on `0bda:8813`: 2203/2203 OK, 0 fail (was 815 submits / 575 fail = 0.4% completion on master).
- RTL8812AU `WiFiDriverTxDemo` sanity: 796/796/0 unchanged (different code path).
- RTL8821AU `WiFiDriverTxDemo` sanity: 991/991/0 unchanged (different code path).
- `sudo python3 tests/regress.py --full-matrix --channel 100 --vm-name devourer-testrig --vm-ssh josephnef@...` (the original RTL8814AU: devourer TX degrades to LIBUSB_ERROR_IO after USB passthrough cycles #36 repro): 8814 devourer-TX cells `[2,4,6,8]` now show `0 hits / 4500 TX` (no `(N fail)` annotation, indicating `tx_failures == 0` per `regress.py:494-495`). Before fix: each cell showed `(4700+ fail)`. 8812/8821 devourer-TX cells unchanged (5927–6884 hits, identical to pre-fix).

🤖 Generated with Claude Code
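The env-var bisection mentioned in the Summary could look roughly like the sketch below. The actual variable names used during the bisect are not recorded in this PR, so the `DEVOURER_SKIP_POSTFWDL_<addr>` scheme and the helper names here are hypothetical; the idea is only that each candidate write is skipped when its own gate is set to `1`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical naming scheme for the per-write gate,
// e.g. 0x0100 -> "DEVOURER_SKIP_POSTFWDL_0100".
inline std::string gate_name(std::uint16_t addr) {
    char buf[48];
    std::snprintf(buf, sizeof buf, "DEVOURER_SKIP_POSTFWDL_%04X",
                  static_cast<unsigned>(addr));
    return buf;
}

// A write is gated off when its env var exists and starts with '1'.
inline bool gate_set(std::uint16_t addr) {
    const char* v = std::getenv(gate_name(addr).c_str());
    return v != nullptr && v[0] == '1';
}

// Returns the subset of candidate register addresses that would still be
// written under the current environment, preserving their order.
inline std::vector<std::uint16_t> writes_to_issue(
        const std::vector<std::uint16_t>& candidates) {
    std::vector<std::uint16_t> out;
    for (std::uint16_t addr : candidates) {
        if (!gate_set(addr)) out.push_back(addr);
    }
    return out;
}
```

Running the TX demo once per gate, with exactly one variable set each time, isolates which single write is responsible; in this PR only skipping `0x0100` removed the wedge.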