TheBlueMatt
Collaborator

This finally adds support for full native Rust `async` persistence
to `ChainMonitor`.

Way back when, before we had any other persistence, we added the
`Persist` trait to persist `ChannelMonitor`s. It eventually grew
homegrown async persistence support via a simple immediate return
and callback upon completion. We later added a persistence trait
in `lightning-background-processor` to persist the few fields that
it needed to drive writes for. Over time, we found more places
where persistence was useful, and we eventually added a generic
`KVStore` trait.

In dc75436c673fad8b5b8ed8d5a9db1ac95650685a we removed the
`lightning-background-processor` `Persister` in favor of simply
using the native `KVStore` directly.

Here we continue that trend, building native `async`
`ChannelMonitor` persistence on top of our native `KVStore` rather
than hacking support for it into the `chain::Persist` trait.
Because `MonitorUpdatingPersister` already exists as a common way
to wrap a `KVStore` into a `ChannelMonitor` persister, we build
exclusively on that (though note that the "monitor updating" part
is now optional), utilizing its new async option as our native
async driver.

Thus, we end up with a `ChainMonitor::new_async_beta` which takes
a `MonitorUpdatingPersisterAsync` rather than a classic
`chain::Persist` and then operates the same as a normal
`ChainMonitor`.

While the requirement that users now use a
`MonitorUpdatingPersister` to wrap their `KVStore` before providing
it to `ChainMonitor` is somewhat awkward, as we move towards a
`KVStore`-only world it seems like `MonitorUpdatingPersister`
should eventually merge into `ChainMonitor`.

Curious to hear what folks think. No tests yet, which I still need to add.

Though users maybe shouldn't use `MonitorUpdatingPersister` if they
don't actually want to persist `ChannelMonitorUpdate`s, we also
shouldn't panic if `maximum_pending_updates` is set to zero.

In the coming commits `MonitorUpdatingPersister`'s internal state
will be reworked. To avoid spurious test diff, we instead use the
public API of `MonitorUpdatingPersister` rather than internal bits
in tests.

In the coming commits, we'll use the `MonitorUpdatingPersister` as
*the* way to do async monitor updating in the `ChainMonitor`.
However, to support folks who don't actually want a
`MonitorUpdatingPersister` in that case, we explicitly support them
setting `maximum_pending_updates` to 0, disabling all of the
update-writing behavior.
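
A minimal sketch of the zero check described above, assuming the choice between writing an incremental update and rewriting the full monitor reduces to a modulo on the update ID. The helper name `writes_full_monitor` is hypothetical and is not part of LDK's API; it only illustrates why a zero value has to be handled before the modulo.

```rust
/// Hypothetical helper (not an LDK function): returns `true` when a given update
/// should be persisted as a full `ChannelMonitor` instead of an incremental update.
fn writes_full_monitor(update_id: u64, maximum_pending_updates: u64) -> bool {
    // Checking for zero first short-circuits the modulo below, which would
    // otherwise panic with a division by zero.
    maximum_pending_updates == 0 || update_id % maximum_pending_updates == 0
}
```

With `maximum_pending_updates` set to 0, every call returns `true`, which matches the "update-writing disabled" behavior described in the commit message.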
As we've done with several other structs, this adds an async
variant of `MonitorUpdatingPersister` and adds an async-sync
wrapper for those using `KVStoreSync`. Unlike with other structs,
we leave `MonitorUpdatingPersister` as the sync variant and make
the new async logic a `MonitorUpdatingPersisterAsync` as the async
monitor updating flow is still considered beta.

This does not yet expose the async monitor updating logic anywhere,
as doing a standard `Persist` async variant would not work for
ensuring the `ChannelManager` and `ChainMonitor` don't block on
async writes or suddenly require a runtime.
@ldk-reviews-bot

ldk-reviews-bot commented Sep 9, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

codecov bot commented Sep 9, 2025

Codecov Report

❌ Patch coverage is 60.96866% with 137 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.30%. Comparing base (ecce859) to head (dfaa102).
⚠️ Report is 31 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lightning/src/util/persist.rs | 70.09% | 76 Missing and 14 partials ⚠️ |
| lightning/src/chain/chainmonitor.rs | 6.00% | 46 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4063      +/-   ##
==========================================
- Coverage   88.76%   88.30%   -0.47%     
==========================================
  Files         176      177       +1     
  Lines      129518   131652    +2134     
  Branches   129518   131652    +2134     
==========================================
+ Hits       114968   116254    +1286     
- Misses      11945    12745     +800     
- Partials     2605     2653      +48     
| Flag | Coverage Δ |
|---|---|
| fuzzing | 21.57% <1.03%> (-0.45%) ⬇️ |
| tests | 88.14% <60.96%> (-0.46%) ⬇️ |

@tnull tnull self-requested a review September 9, 2025 07:44
Contributor

@joostjager left a comment

Great to see how you wrestle the wrappers, polling and generics, especially after having fought with that myself for quite some time.

@@ -542,15 +551,15 @@ where
kv_store: K, logger: L, maximum_pending_updates: u64, entropy_source: ES,
signer_provider: SP, broadcaster: BI, fee_estimator: FE,
) -> Self {
MonitorUpdatingPersister {
kv_store,
MonitorUpdatingPersister(MonitorUpdatingPersisterAsync::new(
Contributor

This is a good commit. A lot happening, but also nothing happening, but the changes are out of the way.

);
(start, end)
})
let latest_update_id = monitor.get_latest_update_id();
Contributor

Is there test coverage to make this change safely? Migration code often isn't covered as well. It seems that persister_with_real_monitors provides some, but not sure what exactly.

Also wondering if this change is necessary for this PR. Otherwise it might be better to split it off.

Collaborator Author

It's sadly required. We have to make the persist_new_channel call as a non-async call and then block its future later. If we kept the old code which read the old monitor, we'd need to await it before calling persist_new_channel and suddenly we can have write-order inversions. I'll add a comment describing why it's important.
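
The ordering concern can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `OrderedStore`, `poll_ready` and `write_order_demo` are hypothetical names, and the store only records the order in which writes were initiated. The point is that a non-async function can queue the write before it returns the future, so the order is fixed at call time regardless of when each future is awaited.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; sufficient for futures that never park.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once, assuming it completes without ever returning `Pending`.
fn poll_ready<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut ctx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    match future.as_mut().poll(&mut ctx) {
        Poll::Ready(out) => out,
        Poll::Pending => panic!("future was expected to be immediately ready"),
    }
}

/// Hypothetical store that records a write as soon as `write` is called.
struct OrderedStore {
    queued: Arc<Mutex<Vec<&'static str>>>,
}

impl OrderedStore {
    // Deliberately not an `async fn`: the entry is pushed before the future is
    // returned, so ordering is decided at call time, not at await time.
    fn write(&self, key: &'static str) -> impl Future<Output = ()> {
        self.queued.lock().unwrap().push(key);
        async move { /* completion of the write would be awaited here */ }
    }
}

fn write_order_demo() -> Vec<&'static str> {
    let store = OrderedStore { queued: Arc::new(Mutex::new(Vec::new())) };
    let first = store.write("monitor");
    let second = store.write("cleanup");
    // Drive the futures in the opposite order; the recorded order is unchanged.
    poll_ready(second);
    poll_ready(first);
    let order = store.queued.lock().unwrap().clone();
    order
}
```

Had `write` been an `async fn`, nothing would be queued until the first poll, and awaiting something else in between could reorder the writes.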

Contributor

But this would break upgrades for any monitors written pre-0.1, right (i.e., even if users just have them still lying around)? Should we document that in the pending_changelog?

Collaborator Author

It shouldn't? The change is still functionally-equivalent, we just do the read and cleanup after the first write.

/// [`MonitorUpdatingPersisterAsync`] and thus allows persistence to be completed async.
///
/// Note that async monitor updating is considered beta, and bugs may be triggered by its use.
pub fn new_async_beta(
Contributor

I think there are now three ways to do persistence: sync, the previous async way via implementing a different Persist and this new_async_beta?

Is there any form of consolidation possible between the two async setups?

Collaborator Author

Yea, I mention it in the last commit, but I think the eventual consolidation should be that we merge MonitorUpdatingPersister into ChainMonitor and then the Persist interface is just the interface between ChannelManager and ChainMonitor, a user will always just instantiate a ChainMonitor with either a KVStore or a KVStoreSync and we'll deal with the rest.

Contributor

That makes sense to me. I just wondered if we should already now steer towards MonitorUpdatingPersister with an async kv store as the only way to do async. I don't think it is more "beta" than the current callback-based async?

&self, monitor_name: MonitorName, monitor: &ChannelMonitor<<SP::Target as SignerProvider>::EcdsaSigner>,
) {
let inner = Arc::clone(&self.0);
let future = inner.persist_new_channel(monitor_name, monitor);
Contributor

Perhaps add a comment also here somewhere that it is important to get the future before spawning to make sure we recorded the ordering.

// completion of the write. This ensures monitor persistence ordering is preserved.
res_c = Some(self.persist_new_channel(monitor_name, monitor));
}
async move {
Contributor

Since only one of the three results is set, can't there be three separate return blocks right after the future is created? Or do you then get into future type issues?

Either way it may be helpful to explain this construct.

Collaborator Author

Will leave a comment.
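
For readers following this thread, here is a rough sketch of the construct being discussed. It assumes the motivation is the usual one: each write path returns a distinct opaque future type, so the branches cannot return different futures under a single `impl Future`. The names `write_full`, `write_update`, `start_write` and `poll_ready` are hypothetical and the error type is simplified to a string.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Two write paths; each returns its own distinct opaque future type.
fn write_full(fail: bool) -> impl Future<Output = Result<(), &'static str>> {
    async move { if fail { Err("full write failed") } else { Ok(()) } }
}
fn write_update(fail: bool) -> impl Future<Output = Result<(), &'static str>> {
    async move { if fail { Err("update write failed") } else { Ok(()) } }
}

/// Creates exactly one of the two futures synchronously, then returns a single
/// async block that awaits whichever one was set.
fn start_write(full: bool, fail: bool) -> impl Future<Output = Result<(), &'static str>> {
    let mut res_a = None;
    let mut res_b = None;
    if full {
        res_a = Some(write_full(fail));
    } else {
        res_b = Some(write_update(fail));
    }
    async move {
        // Plain `if let`s with `?`: an error from whichever future ran is propagated.
        if let Some(a) = res_a {
            a.await?;
        }
        if let Some(b) = res_b {
            b.await?;
        }
        Ok::<(), &'static str>(())
    }
}

/// Waker that does nothing; enough to poll futures that never park.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Test helper: polls a future once, assuming it is immediately ready.
fn poll_ready<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut ctx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    match future.as_mut().poll(&mut ctx) {
        Poll::Ready(out) => out,
        Poll::Pending => panic!("future was expected to be immediately ready"),
    }
}
```

Boxing each future into a `Pin<Box<dyn Future>>` would also unify the types, at the cost of an allocation per call; the `Option` approach avoids that.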

@@ -534,6 +534,10 @@ where
/// less frequent "waves."
/// - [`MonitorUpdatingPersister`] will potentially have more listing to do if you need to run
/// [`MonitorUpdatingPersister::cleanup_stale_updates`].
///
/// Note that you can disable the update-writing entirely by setting `maximum_pending_updates`
Contributor

Can the Persist impl for KvStore (non-updating) be removed now?

Collaborator Author

Yea, I think the next step in cleaning things up would be to consolidate ChainMonitor and MonitorUpdatingPersister and then we'd remove that blanket impl.

Contributor

I think we could already remove it now and let users use MonitorUpdatingPersister with zero updates? Or you want to keep it to avoid a two-step api change?

self.0.future_spawner.spawn(async move {
match future.await {
Ok(()) => {}, // TODO: expose completions
Ok(()) => inner.async_completed_updates.lock().unwrap().push(completion),
Contributor

I wouldn't mind this commit and the previous being combined.

Collaborator Author

Yea, I don't have a strong opinion 🤷

@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch from c5c70bc to 53cb9f6 Compare September 9, 2025 21:06
@joostjager joostjager mentioned this pull request Sep 10, 2025
Contributor

@tnull left a comment

Did a first high-level pass, looks good so far!

I think this would be a great candidate to follow the recently-discussed policy of testing on LDK Node. Mind also opening a draft PR against its develop branch we now have to test this out? (doesn't need to be production ready, happy to take it over eventually)

Contributor

No blocker, but I'd still prefer to do the MonitorUpdatingPersisterSync renaming for consistency, instead of adding MonitorUpdatingPersisterAsync. Or, if we want to go with the latter, rename all the other structs we did so far.

Collaborator Author

I addressed this in the commit message somewhat, but basically because we are still calling the async persist logic "Beta" it seems to make more sense to leave "the one we expect folks to use" not having a suffix.

fn poll_sync_future<F: Future>(future: F) -> F::Output {
let mut waker = dummy_waker();
let mut ctx = task::Context::from_waker(&mut waker);
// TODO A future MSRV bump to 1.68 should allow for the pin macro
Contributor

I updated #4002, so if that lands you might be able to rebase.

@@ -409,6 +411,21 @@ where
Ok(res)
}

/// A generic trait which is able to spawn futures in the background.
pub trait FutureSpawner: Send + Sync + 'static {
Contributor

Should this live in lightning-types or a new shared lightning-util crate, so that we can DRY up the one in lightning-block-sync?

Collaborator Author

Maybe? Not entirely sure what to do with it honestly. It doesn't really belong in lightning-types and adding a crate just for this seems like overkill... We could probably move it to lightning::util in a separate module and just use that one in lightning-block-sync?

Collaborator Author

Actually, I went ahead and did this.
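
As an illustration of what an implementation of this trait could look like, here is a sketch that queues futures instead of handing them to a runtime. The trait is redeclared locally so the example stands alone, and `Send` is used where the real signature has `MaybeSend`; `QueueSpawner`, `run_queued` and `spawner_demo` are hypothetical names. A real application would more likely forward `spawn` to its async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Wake, Waker};

// Local stand-in for the trait quoted above; `Send` replaces LDK's `MaybeSend`.
pub trait FutureSpawner: Send + Sync + 'static {
    fn spawn<T: Future<Output = ()> + Send + 'static>(&self, future: T);
}

type BoxedFuture = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Hypothetical spawner that stores futures until `run_queued` is called.
#[derive(Default)]
pub struct QueueSpawner {
    queue: Mutex<Vec<BoxedFuture>>,
}

impl FutureSpawner for QueueSpawner {
    fn spawn<T: Future<Output = ()> + Send + 'static>(&self, future: T) {
        self.queue.lock().unwrap().push(Box::pin(future));
    }
}

/// Waker that does nothing; this sketch simply re-polls on the next call.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

impl QueueSpawner {
    /// Polls every queued future once and returns how many are still pending.
    pub fn run_queued(&self) -> usize {
        // Take the queue out first so a future may call `spawn` without deadlocking.
        let pending = std::mem::take(&mut *self.queue.lock().unwrap());
        let waker = Waker::from(Arc::new(NoopWake));
        let mut ctx = Context::from_waker(&waker);
        let mut still_pending = Vec::new();
        for mut fut in pending {
            if fut.as_mut().poll(&mut ctx).is_pending() {
                still_pending.push(fut);
            }
        }
        let mut queue = self.queue.lock().unwrap();
        queue.extend(still_pending);
        queue.len()
    }
}

/// Spawns two trivial tasks and reports (completed tasks, tasks still pending).
fn spawner_demo() -> (usize, usize) {
    let spawner = QueueSpawner::default();
    let completed = Arc::new(AtomicUsize::new(0));
    for _ in 0..2 {
        let completed = Arc::clone(&completed);
        spawner.spawn(async move {
            completed.fetch_add(1, Ordering::SeqCst);
        });
    }
    let remaining = spawner.run_queued();
    (completed.load(Ordering::SeqCst), remaining)
}
```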

struct PanicingSpawner;
impl FutureSpawner for PanicingSpawner {
fn spawn<T: Future<Output = ()> + MaybeSend + 'static>(&self, _: T) {
unreachable!();
Contributor

Can we add a message here explaining what happened if it was ever hit?

Collaborator Author

I thought that was clear from the PanicingSpawner name :) I could write out why it's used below but it seems weird to put that here rather than where PanicingSpawner is used?

if let Some(a) = res_a {
a.await?;
}
if let Some(b) = res_b {
Contributor

Can we make these else if branches to express that we'd only ever deal with one res here?

Collaborator Author

Seems more correct not to, given we're not returning the Ok from them just ?ing them?

///
/// Note that async monitor updating is considered beta, and bugs may be triggered by its use.
pub fn new_async_beta(
chain_source: Option<C>, broadcaster: T, logger: L, feeest: F,
Contributor

nit: Why not simply fee_estimator?

Collaborator Author

Cause I copied the one from new 🤷‍♂️

@TheBlueMatt
Collaborator Author

> I think this would be a great candidate to follow the recently-discussed policy of testing on LDK Node. Mind also opening a draft PR against its develop branch we now have to test this out? (doesn't need to be production ready, happy to take it over eventually)

Indeed, though I don't actually think we want to use this right away - because async persist is still considered "beta" I don't think there's anything actionable in LDK Node until probably at least 0.3? I'm happy to do a test integration but does LDK Node already have support for async KVStore/use MonitorUpdatingPersister? Without those it doesn't seem worth much?

I'll at least go update ldk-sample, as my goal here is in part to make it easier to test async persist in my own node in 0.2.

@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch 2 times, most recently from 5ea91b9 to 22f0271 Compare September 11, 2025 15:05
Pre-0.1, after a channel was closed we generated
`ChannelMonitorUpdate`s with a static `update_id` of `u64::MAX`. In
this case, when using `MonitorUpdatingPersister`, we had to read
the persisted `ChannelMonitor` to figure out what range of monitor
updates to remove from disk. However, now that we have a `list`
method there's no reason to do this anymore, we can just use that.

Simplifying code that we anticipate never hitting anymore is always
a win.
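
A small sketch of the `list`-based cleanup idea, assuming update keys are the decimal update IDs and that any update at or below the monitor's latest update ID is stale once the full monitor has been written. The helper `stale_update_ids` is hypothetical and does not touch a real `KVStore`; it only shows the selection step.

```rust
/// Hypothetical helper: given the keys returned by a `list` call for one monitor's
/// update namespace, selects the update IDs made stale by a full monitor write.
fn stale_update_ids(listed_keys: &[&str], latest_update_id: u64) -> Vec<u64> {
    let mut stale: Vec<u64> = listed_keys
        .iter()
        // Keys that do not parse as an update ID are skipped rather than removed.
        .filter_map(|key| key.parse::<u64>().ok())
        .filter(|update_id| *update_id <= latest_update_id)
        .collect();
    stale.sort_unstable();
    stale
}
```

For a pre-0.1 closed channel whose latest update ID is `u64::MAX`, every listed update qualifies, so no read of the old monitor is needed to work out the range.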
In the next commit we'll use this to spawn async persistence
operations in the background, but for now we just move the
`lightning-block-sync` `FutureSpawner` into `lightning`.

In the next commit we'll add the ability to use an async `KVStore`
as the backing for a `ChainMonitor`. Here we tee this up by adding
an async API to `MonitorUpdatingPersisterAsync`. It's not intended
for public use and is thus only `pub(crate)` but allows spawning
all operations via a generic `FutureSpawner` trait, initiating the
write via the `KVStore` before any `await`s (or async functions).

Because we aren't going to make the `ChannelManager` (or
`ChainMonitor`) fully async, we need a way to alert the
`ChainMonitor` when a persistence completes, but we leave that for
the next commit.
@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch from ac1c9d5 to dfaa102 Compare September 11, 2025 16:55
@TheBlueMatt TheBlueMatt self-assigned this Sep 11, 2025
@tnull
Contributor

tnull commented Sep 12, 2025

> I'm happy to do a test integration but does LDK Node already have support for async KVStore/use MonitorUpdatingPersister? Without those it doesn't seem worth much?

For the former, it will after lightningdevkit/ldk-node#633 lands which I intend to finish ~early next week (will also need some changes over here in LDK). For the latter we already had a PR open / close that however was blocked on the LDK upgrade (lightningdevkit/ldk-node#456). I think I'll see to pick that up in the coming week(s), too.

@tnull
Contributor

tnull commented Sep 12, 2025

> will also need some changes over here in LDK

Actually: #4069
