DroneCAN: Fix H7 FDCAN and F7 bxCAN driver configuration#11607
Draft
daijoubu wants to merge 39 commits into
Draft
DroneCAN: Fix H7 FDCAN and F7 bxCAN driver configuration#11607daijoubu wants to merge 39 commits into
daijoubu wants to merge 39 commits into
Conversation
SJW=8 was overly conservative (80% of bit time at 1Mbps with 10 quanta). SJW=3 is the standard value also used by the F7 driver. Tested with 6037 arm/disarm cycles at 500kbps: TEC=0, REC=0, zero errors.
…n, and PLL2 clock source - Move FDCAN kernel clock config (RCC_FDCANCLKSOURCE_PLL) before canardSTM32ComputeTimings() so HAL_RCCEx_GetPeriphCLKFreq returns the correct frequency for timing calculation. - Reduce SJW from 8 to 1 for tighter synchronization. - Set TxBuffersNbr=0 (unused with TX FIFO, TxFifoQueueElmtsNbr=32). - Enable FDCAN auto-retransmission to match F7 behavior and avoid escalating transient TX errors into bus-off. - Reduce bus-off recovery delay from 100ms to 1ms (auto-retransmission makes bus-off rare; near-instant recovery is safe). - Make PLL2M dynamic from HSE_VALUE and set PLL2Q=10 for 80 MHz FDCAN kernel clock. Add USE_DRONECAN guard to configure FDCAN clock from PLL2, keeping VCO=800MHz for all HSE frequencies.
…pport Remove redundant PeriphClkInitStruct clock config from canardSTM32CAN1_Init. system_stm32h7xx.c already configures FDCAN to use PLL2Q (80 MHz) when USE_DRONECAN is defined; duplicating it in the driver overwrites with PLL1. Also add CAN1 pin definitions and USE_DRONECAN to KAKUTEH7WING target (PD0/PD1, CAN1_STANDBY PD3 disabled by default).
Use HAL_RCCEx_GetPeriphCLKFreq(RCC_PERIPHCLK_FDCAN) instead of HAL_RCC_GetPCLK1Freq() for bit timing calculation. FDCAN is clocked from PLL2Q (80 MHz) configured in system_stm32h7xx.c; using PCLK1 (100 MHz) produced a ~25% baud rate error causing immediate bus-off. Restore SJW to 3 for better synchronisation tolerance.
Remove high-frequency LOG_DEBUG messages from GNSS Fix/Fix2/Auxiliary handlers, onTransferReceived, dronecanInit, and gps_dronecan HDOP path that fired at 25 Hz and flooded the log. Fix PLL2 VCO input to target 1.6 MHz (PLL2M = HSE/1600000, PLL2N = 500) rather than 2.0 MHz, keeping the operating point clearly within VCIRANGE_0 (1-2 MHz) as the original SDCARD-only code did with PLL2M=5. VCO output remains 800 MHz; FDCAN (80 MHz via PLL2Q=10) and SDMMC (200 MHz via PLL2R=4) outputs are unchanged.
Drop high-frequency and verbose-but-low-value LOG_DEBUG(CAN messages: - dronecan.c: Battery Info (x2), GetNodeInfo, NodeStatus, TX success, RX loop, commented-out debug blocks - canard_stm32h7xx_driver.c: timing computation intermediates (Baudrate, Max Quanta, Prescaler BS, Prescaler, Timings summary) - canard_stm32f7xx_driver.c: same timing intermediates, TX success, In CAN Init, commented-out clock and RX blocks Retain error-path messages (decode failed, TX/RX error, init failures) and the single-line Prescaler/SJW/BS summary logged at init.
… operation Tested at 1 Mbps on KAKUTEH7WING hardware and confirmed bus operational.
…ivers Remove CubeMX boilerplate markers, commented-out dead code, and development-time question comments from both drivers.
Fix: DroneCAN GNSS messages were being applied to gpsSolDRV regardless of the configured GPS provider. Guard added in gps_dronecan.c where it belongs, keeping CAN transport layer unaware of GPS config.
…g difference F7 bxCAN HAL writes SJW directly to BTR register where hardware adds 1, so stored value 3 gives 4 tq. This wider SJW is needed for reliable bus operation on F7 targets and is different from H7 where SJW=1 is actual tq.
Prevents state machine from continuing in INIT state when the CAN peripheral fails to initialize.
Prevents out-of-bounds access when STATE_DRONECAN_FAILED is active.
…ilure Consistent with existing PLL1 error handling in the same file.
Prevents stale pre-bus-off frames from storming the bus on recovery.
With AutoRetransmission=ENABLE, frames that fail on a degraded bus occupy FIFO slots indefinitely. All 32 slots fill, HAL_FDCAN_AddMessage returns HAL_ERROR, and all outgoing traffic stalls permanently with no indication until full bus-off. DroneCAN reliability is handled at the application layer via periodic republishing.
Matches the H7 driver pattern. Previously the return value was silently discarded; if timing computation failed, uninitialized stack bytes were passed to HAL_CAN_Init.
The H7 FDCAN 128x11 recessive-bit recovery sequence takes up to 11.264ms at 125kbps. The 1ms delay was restarting the counter before it could complete, preventing the node from ever exiting bus-off. 20ms gives safe margin above worst-case and allows time to detect immediate re-entry.
Guard against non-DroneCAN GPS provider at the transport boundary (handle_GNSS* functions) rather than in each leaf function in gps_dronecan.c. Also adds the guard to handle_GNSSRCTMStream which had none. Removes stale UNUSED(pgnssAux) and placeholder comment from dronecanGPSReceiveGNSSAuxiliary.
…lculation HSE_VALUE / 1600000 silently truncates if HSE is not an exact multiple of 1.6MHz. All current H7 targets use 8MHz or 25MHz HSE (both exact multiples), but a future board with a non-standard crystal would compute a wrong PLL2M and get the wrong FDCAN/SDMMC clocks. The assert catches this at compile time.
canardSTM32GetProtocolStatus() was called on every dronecanUpdate() invocation (~500Hz) to detect bus-off. Moved into the existing 1Hz task block — bus-off detection latency of up to 1s is acceptable. Adds LOG_DEBUG to report BusOff and ErrorPassive flags each second for bench diagnostics.
AutoBusOff=ENABLE handles the 128x11 recovery sequence automatically, but ESR.BOFF is a sticky read-only flag that is NOT cleared when hardware recovery completes. GetProtocolStatus() reads this flag, so the state machine was permanently stuck in STATE_DRONECAN_BUS_OFF after any bus-off event on F7 targets. Stop/Start re-enters init mode which clears ESR.BOFF, allowing recovery detection to work correctly.
Comment incorrectly stated '25MHz' as a supported HSE value — 25MHz fails the assert. CMake always provides HSE_VALUE per-target via -DHSE_VALUE=<n> so the stm32h7xx_hal_conf.h fallback of 25MHz is never used. Current targets use 8MHz (default) or 16MHz (KAKUTEH7WING).
…adence GetProtocolStatus() was called every dronecanUpdate() cycle (~500Hz) in BUS_OFF state. Moved inside the 20ms recovery timer block so it runs at the same cadence as RecoverFromBusOff() — still detects recovery within 20ms but reduces MMIO reads from ~500/sec to ~50/sec.
HAL_CAN_Stop/Start called from the scheduler context with CAN interrupts active caused a full FC lockup. Reverted to empty stub pending investigation of a safe mechanism to clear the sticky ESR.BOFF flag on F7.
Unconditional 1Hz LOG_DEBUG was flooding the bootlog with healthy status messages. Now only logs when an error condition is actually present.
HAL_CAN_AddTxMessage returns non-OK when all mailboxes are busy — a normal transient condition at startup. The log was noise. Matches the H7 driver which already handles this path silently.
DroneCAN float16 optional fields encode NaN when unpopulated. Without a guard, NaN * 100 converts to 0 on Cortex-M (ARM VCVT saturation), permanently blocking the HDOP fallback path. Also passes values through gpsConstrainHDOP() to prevent uint16_t overflow for extreme DOP values.
…ompatible gpsSolDRV has hdop but no vdop field. VDOP and EPV are not interchangeable (different units, conversion requires receiver UERE). lastVDOP was a dead store with no valid consumer.
Project builds with -std=gnu99 and uses STATIC_ASSERT from common/utils.h throughout. The C11 static_assert keyword is a GCC extension under gnu99 and inconsistent with the project convention.
…ames size Guards the CLI state name array against future enum additions — if a new state is added without updating the array, the build fails immediately.
Guards against out-of-bounds read if dronecanState is ever corrupted to STATE_DRONECAN_COUNT or beyond.
ABOM (CAN_MCR bit 6) is enabled in canardSTM32CAN1_Init. Per RM0410 ss40.7.6, with ABOM=1 the hardware manages the full bus-off recovery sequence automatically: after 128x11 recessive bits it cycles INRQ and clears ESR.BOFF without any software intervention required. Removes an incorrect TODO asserting ESR.BOFF is sticky — it is a hardware-managed status bit that clears when bus-off state is left.
Cast both MIN() arguments to int to avoid -Werror=sign-compare between dronecanState_e (unsigned enum) and STATE_DRONECAN_COUNT - 1 (int).
clang treats uint32_t as unsigned int, not unsigned long. Use %u to match the actual type of BusOff and ErrorPassive in canardProtocolStatus_t.
|
Test firmware build ready — commit Download firmware for PR #11607 237 targets built. Find your board's
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes several configuration errors in the H7 FDCAN and F7 bxCAN DroneCAN drivers that cause degraded CAN bus behaviour under real-world conditions. Hardware tested on MATEKF765SE (F7) and KAKUTEH7WING (H7).
Changes
H7 FDCAN driver (
canard_stm32h7xx_driver.c)HAL_RCC_GetPCLK1Freq()(APB1), now correctly usesHAL_RCCEx_GetPeriphCLKFreq(RCC_PERIPHCLK_FDCAN)TxBuffersNbr = 0(driver uses FIFO queue only; dedicated buffers were unused)TXBCR) before clearingCCCR.INITon bus-off recoveryHAL_RCCEx_PeriphCLKConfigreturn value; callError_Handleron failurePLL2 clock configuration (
system_stm32h7xx.c)USE_DRONECAN(wasUSE_SDCARD_SDIOonly)STATIC_ASSERTfor HSE_VALUE divisibility by PLL2MHAL_RCCEx_PeriphCLKConfigreturn valueF7 bxCAN driver (
canard_stm32f7xx_driver.c)canardSTM32ComputeTimingsreturn valuecanardSTM32RecoverFromBusOffno-op: ABOM=1 handles full recovery automatically including clearing ESR.BOFF (RM0410 §40.7.6)DroneCAN state machine (
dronecan.c)STATE_DRONECAN_FAILED— set on CAN peripheral init failureSTATE_DRONECAN_COUNTsentinel withSTATIC_ASSERTprotecting CLI name arrayhandle_GNSSRCTMStreamGPS DroneCAN driver (
gps_dronecan.c)gpsSolDRVregardless of configured GPS providergpsConstrainHDOP()to prevent uint16_t overflowlastVDOP(dead store — no vdop in gpsSol, not EPV-compatible)KAKUTEH7WING target
USE_DRONECANwith CAN1 RX/TX pin definitions (PD0/PD1)CLI (
cli.c)FAILEDtodronecanStateNames; addSTATIC_ASSERTto keep array in sync with enumNotes on AutoRetransmission
This PR disables AutoRetransmission on both H7 and F7 to prevent TX FIFO/mailbox stall when the bus is degraded. PR #11560 re-enables it on H7 with a comment that DISABLE was "incorrect." This will need deliberate reconciliation when rebasing — the right answer depends on whether the ISR-driven TX queue in #11560 handles the stall scenario differently.
Testing