Skip to content

Batch MPP claims into single ChannelMonitorUpdate#4552

Open
Abeeujah wants to merge 1 commit into
lightningdevkit:mainfrom
Abeeujah:parallelize-mppclaims
Open

Batch MPP claims into single ChannelMonitorUpdate#4552
Abeeujah wants to merge 1 commit into
lightningdevkit:mainfrom
Abeeujah:parallelize-mppclaims

Conversation

@Abeeujah
Copy link
Copy Markdown
Contributor

@Abeeujah Abeeujah commented Apr 11, 2026

When multiple parts of an MPP payment arrive on the same channel, claim them via the holding cell so they are released as one combined ChannelMonitorUpdate (every PaymentPreimage step plus a single commitment_signed) instead of one round-trip per part.

claim_payment_internal now forces the holding-cell path whenever there is more than one source, records each touched channel, and flushes them after queueing. The single-source case keeps the existing inline fast path.

FundedChannel::get_update_fulfill_htlc gains a force_holding_cell flag that suppresses the per-claim ChannelMonitorUpdate so the flush can emit one combined update. When the holding cell cannot be freed immediately (awaiting RAA, monitor update in progress, disconnected, quiescent, ...), build_preimage_only_monitor_update writes a one-step preimage update so the preimage stays durable across restarts; the commitment_signed follows when the holding cell is naturally flushed.

Update test_single_channel_multiple_mpp, test_simple_mpp, and auto_retry_partial_failure to reflect the new single-round-trip behavior, and drop the threaded reproducer that targeted the old per-claim path.

Tests has been updated to reflect this new batching of MPP claims

  • test_single_channel_multiple_mpp
  • auto_retry_partial_failure
  • test_keysend_dup_hash_partial_mpp

closes #3986

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Apr 11, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz April 11, 2026 17:52
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/chanmon_update_fail_tests.rs
@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented Apr 11, 2026

I've now thoroughly reviewed the entire PR diff and the surrounding codebase. Let me write my final review.

Review Summary

Prior Issues Status

Fixed since last review:

  • build_preimage_only_monitor_update now correctly calls monitor_updating_paused (channel.rs:7893-7902), preventing the panic I previously flagged
  • Completion actions and RAA blockers are now registered once per channel in the flush loop (channelmanager.rs:10313-10332), not once per HTLC
  • New test test_single_channel_multiple_mpp_awaiting_raa exercises the deferred flush path (the build_preimage_only_monitor_update code path when the channel is AWAITING_REMOTE_REVOKE)

Still open from prior review (not re-posted as they remain relevant):

  • Fund-safety regression: preimage loss window between claim loop and flush loop if channel closes in between (channelmanager.rs:10190-10203). With force_holding_cell = true, preimages exist only in the channel's holding cell until the flush persists them. force_shutdown discards ClaimHTLC entries silently.
  • DuplicateClaim startup replay: the batch path doesn't re-register RAA blockers or completion actions when all claims are duplicates during startup replay
  • Various nits about comments, code style, and documentation accuracy

New Observations

No major new bugs or security issues found beyond what was previously flagged. The code improvements since the last review have addressed the most critical concerns.

Minor observations:

  • The PR description claims test_simple_mpp, auto_retry_partial_failure, and test_keysend_dup_hash_partial_mpp were updated, but only test_single_channel_multiple_mpp (in chanmon_update_fail_tests.rs) appears modified in the diff. The other tests likely pass unchanged because they use higher-level claim helpers.
  • The startup replay path (channelmanager.rs:20709) correctly uses force_holding_cell: false for per-part replay, avoiding the batch path entirely during deserialization.
  • The handle_monitor_update_completion_actions code at line 10655 correctly handles the one-action-per-channel batch semantics: the retain closure processes exactly one ClaimedMPPPayment blocker per channel, and the freed_channels logic correctly unblocks all channels once all preimages are durable.

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 88d62c4 to 5880633 Compare April 12, 2026 11:08
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz requested review from TheBlueMatt and removed request for jkczyz April 14, 2026 22:09
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 5880633 to d801ec9 Compare April 16, 2026 15:08
@Abeeujah
Copy link
Copy Markdown
Contributor Author

let _ = htlc_value_msat

It's intentionally left unused right now, pending any design decision that would be made for the msat values that field holds, in the case of a single claim, there's a completion_action function computing with the value, In this case, it is a batch claim, so it's open ended to how batch claims should work

  • Do we sum all since they were claimed at a single swoop and pass to completion_action
  • Should a Batch Claim Event be implemented as there is for a single claim

This is why that field exist, the absense of the direction it should take is the reason it is intentionally currently left unused.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 70.22901% with 78 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.36%. Comparing base (44828f7) to head (fc83c75).
⚠️ Report is 17 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channelmanager.rs 68.34% 59 Missing and 4 partials ⚠️
lightning/src/ln/channel.rs 76.19% 15 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4552      +/-   ##
==========================================
+ Coverage   86.12%   86.36%   +0.23%     
==========================================
  Files         157      158       +1     
  Lines      108922   109493     +571     
  Branches   108922   109493     +571     
==========================================
+ Hits        93812    94561     +749     
+ Misses      12495    12381     -114     
+ Partials     2615     2551      -64     
Flag Coverage Δ
fuzzing-fake-hashes 5.06% <0.00%> (?)
fuzzing-real-hashes 22.89% <64.17%> (?)
tests 86.09% <70.22%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Copy link
Copy Markdown
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, would this not be simpler by using the holding cell instead? When we're failing, we call queue_fail_htlc which just shoves the failure into the holding cell then call check_free_peer_holding_cells to release them all at once. Rather than rebuilding that logic for claims, we should be able to do something similar. We'd have to handle the monitor updates with the preimages a bit differently, but not drastically so.

@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from d801ec9 to 5992824 Compare April 27, 2026 14:00
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 5992824 to f466457 Compare April 27, 2026 15:44
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from f466457 to dddd924 Compare May 13, 2026 18:26
Comment thread lightning/src/ln/chanmon_update_fail_tests.rs
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch 2 times, most recently from 086a669 to 2a28f1d Compare May 14, 2026 12:39
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs
Comment thread lightning/src/ln/channelmanager.rs Outdated
When multiple parts of an MPP payment arrive on the same channel, claim
them via the holding cell so they are released as one combined
ChannelMonitorUpdate (every PaymentPreimage step plus a single
commitment_signed) instead of one round-trip per part.

claim_payment_internal now forces the holding-cell path whenever there
is more than one source, records each touched channel, and flushes them
after queueing. The single-source case keeps the existing inline fast
path.

FundedChannel::get_update_fulfill_htlc gains a force_holding_cell flag
that suppresses the per-claim ChannelMonitorUpdate so the flush can
emit one combined update. When the holding cell cannot be freed
immediately (awaiting RAA, monitor update in progress, disconnected,
quiescent, ...), build_preimage_only_monitor_update writes a one-step
preimage update so the preimage stays durable across restarts; the
commitment_signed follows when the holding cell is naturally flushed.

Update test_single_channel_multiple_mpp, test_simple_mpp, and
auto_retry_partial_failure to reflect the new single-round-trip
behavior, and drop the threaded reproducer that targeted the old
per-claim path.
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 2a28f1d to fc83c75 Compare May 14, 2026 16:24
@Abeeujah Abeeujah requested a review from TheBlueMatt May 15, 2026 11:33
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

// When the caller asked us to push into the holding cell for batching
// (`force_holding_cell`) we skip producing a `ChannelMonitorUpdate` here; the caller is
// responsible for either flushing the holding cell (producing one combined update with
// every queued preimage and the commitment) or building a preimage-only update for
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a massive layer violation - the ChannelManager shouldn't be making decisions around how to build ChannelMonitorUpdates, we shouldn't need a manual build_preimage_only_monitor_update, etc. Instead, we should be able to queue all the updates, then use the existing check_free_holding_cells logic (or check_free_peer_holding_cells to filter by the channels we actually updated) and have the channel automatically include the preimage updatesteps in the ChannelMonitorUpdate it builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better parallelize MPP claims

4 participants