Async background persistence #3905
lightning/src/util/sweep.rs
Outdated
fn persist_state<'a>(
    &self, sweeper_state: &SweeperState,
) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'a + Send>> {
    let encoded = &sweeper_state.encode();

    self.kv_store.write(
        OUTPUT_SWEEPER_PERSISTENCE_PRIMARY_NAMESPACE,
        OUTPUT_SWEEPER_PERSISTENCE_SECONDARY_NAMESPACE,
        OUTPUT_SWEEPER_PERSISTENCE_KEY,
        encoded,
    )
The `encoded` variable is captured by reference in the returned future, but it's a local variable that will be dropped when the function returns. This creates a potential use-after-free issue. Consider moving ownership of `encoded` into the future instead:
fn persist_state<'a>(
    &self, sweeper_state: &SweeperState,
) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'a + Send>> {
    let encoded = sweeper_state.encode();
    self.kv_store.write(
        OUTPUT_SWEEPER_PERSISTENCE_PRIMARY_NAMESPACE,
        OUTPUT_SWEEPER_PERSISTENCE_SECONDARY_NAMESPACE,
        OUTPUT_SWEEPER_PERSISTENCE_KEY,
        &encoded,
    )
}
This ensures the data remains valid for the lifetime of the future.
Spotted by Diamond
Is this real?
I don't think so as the compiler would likely optimize that away, given that `encoded` will be an owned value (`Vec<u8>` returned by `encode()`). Still, the change that it suggests looks cleaner.
In general it will be super confusing that we `encode` at the time of creating the future, but would only actually persist once we dropped the lock. Starting from now we'll need to be super cautious about the side-effects of interleaving persist calls.
The idea is that an async kv store encodes the data and stores the write action in a queue at the moment the future is created. Things should still happen in the original order.
Can you show a specific scenario where we have to be super cautious even if we have that queue?
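A minimal sketch of the queueing idea described above, under stated assumptions: `QueuedStore`, its `write`/`read` methods and the toy `block_on` are hypothetical and only illustrate one way a store could fix the order at call time. The write is enqueued synchronously inside `write`; polling any returned future flushes all older queued writes first.

```rust
use std::collections::{HashMap, VecDeque};
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type WriteFuture = Pin<Box<dyn Future<Output = Result<(), io::Error>> + Send + 'static>>;

#[derive(Default)]
struct Inner {
    next_seq: u64,
    pending: VecDeque<(u64, String, Vec<u8>)>,
    disk: HashMap<String, Vec<u8>>,
}

/// Hypothetical store that queues writes when the future is created.
#[derive(Clone, Default)]
struct QueuedStore {
    inner: Arc<Mutex<Inner>>,
}

impl QueuedStore {
    fn write(&self, key: &str, buf: &[u8]) -> WriteFuture {
        // Enqueue synchronously: call order, not poll order, fixes the sequence.
        let seq = {
            let mut guard = self.inner.lock().unwrap();
            let seq = guard.next_seq;
            guard.next_seq += 1;
            guard.pending.push_back((seq, key.to_owned(), buf.to_vec()));
            seq
        };
        let inner = Arc::clone(&self.inner);
        Box::pin(async move {
            let mut guard = inner.lock().unwrap();
            let state = &mut *guard;
            // Flush every queued write up to and including this one, oldest first.
            while state.pending.front().map_or(false, |entry| entry.0 <= seq) {
                let (_, k, v) = state.pending.pop_front().unwrap();
                state.disk.insert(k, v);
            }
            Ok::<(), io::Error>(())
        })
    }

    fn read(&self, key: &str) -> Option<Vec<u8>> {
        let guard = self.inner.lock().unwrap();
        guard.disk.get(key).cloned()
    }
}

/// Toy executor: busy-polls, only suitable for futures that complete immediately.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo_out_of_order_polling() -> Vec<u8> {
    let store = QueuedStore::default();
    let first = store.write("sweeper", b"v1");
    let second = store.write("sweeper", b"v2");
    // Drive the futures in the "wrong" order.
    block_on(second).unwrap();
    block_on(first).unwrap();
    store.read("sweeper").unwrap()
}
```

Even though the second future is driven first, the store ends up holding `v2`, because the order was fixed when `write` was called.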
Moved &
> The idea is that an async kv store encodes the data and stores the write action in a queue at the moment the future is created. Things should still happen in the original order.
If that is the idea that we start assuming in this PR, we should probably also start documenting these assumptions in this PR on `KVStore` already.
Added this requirement to the async `KVStore` trait doc.
Almost LGTM, just one real comment and a doc nit.
lightning/src/util/sweep.rs
Outdated
}

output_info.status.broadcast(cur_hash, cur_height, spending_tx.clone());
self.broadcaster.broadcast_transactions(&[&spending_tx]);
Hmm, it used to be the case that we'd first persist, wait for that to finish, then broadcast. I don't think it's critical, but it does seem like we should retain that behavior.
Changed to first await the persist future, and then broadcast.
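The persist-then-broadcast ordering can be sketched roughly as follows. The `persist`, `broadcast` and `persist_then_broadcast` functions and the event log are hypothetical; in particular, skipping the broadcast when persistence fails is an assumption of this sketch, not something the thread states.

```rust
use std::future::Future;
use std::io;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type Log = Arc<Mutex<Vec<&'static str>>>;

/// Hypothetical async persist step; `fail` simulates a write error.
async fn persist(log: Log, fail: bool) -> Result<(), io::Error> {
    log.lock().unwrap().push("persist");
    if fail {
        return Err(io::Error::new(io::ErrorKind::Other, "simulated write failure"));
    }
    Ok(())
}

/// Hypothetical synchronous broadcast step.
fn broadcast(log: &Log) {
    log.lock().unwrap().push("broadcast");
}

async fn persist_then_broadcast(log: Log, fail: bool) -> Result<(), io::Error> {
    // Wait for persistence to finish before the transaction leaves the node.
    persist(Arc::clone(&log), fail).await?;
    broadcast(&log);
    Ok(())
}

/// Toy executor: busy-polls, only suitable for futures that complete immediately.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo(fail: bool) -> Vec<&'static str> {
    let log: Log = Arc::new(Mutex::new(Vec::new()));
    let _ = block_on(persist_then_broadcast(Arc::clone(&log), fail));
    let events = log.lock().unwrap().clone();
    events
}
```

With `fail == false` the log records the persist step followed by the broadcast; with `fail == true` the sketch returns early and nothing is broadcast.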
One question about the requirements we want, but figuring out the answer doesn't have to block landing this PR as-is.
-) -> Result<Vec<u8>, io::Error>;
-/// Persists the given data under the given `key`.
+) -> Pin<Box<dyn Future<Output = Result<Vec<u8>, io::Error>> + 'static + Send>>;
+/// Persists the given data under the given `key`. Note that the order of multiple writes calls needs to be retained
Oh actually, do we want this to be the restriction, or do we want "the order of multiple writes to the same key needs to be retained"? I imagine the second, we don't currently have a need inside LDK to require a strict total order, and it could definitely substantially slow down async persist. cc @tnull
One related thing I've been thinking about is whether it is okay to skip a stale write? If two consecutive same-key writes are executed out of order, is it fine to simply drop the first write? Or could it be that we do need to read that first written data at some point?
I don't see how it could not be okay - writes overwrite, so if there's two writes to the same key we're required to eventually end up with the second one on disk. Only question, I guess, is whether we're allowed to complete the second future first, then the first future later, and still end up with the second future's write. I think that's something we should accept (and document?) but that's the only caller-observable question, I think.
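One way to picture "per-key ordering with stale writes dropped" is a per-key version counter. The sketch below is purely illustrative: `VersionedStore`, `PendingWrite`, `begin_write` and `complete_write` are hypothetical names, and the split into a "begin" and a "complete" step stands in for creating and finishing a write future. Versions are assigned when the write is requested, and a completion that arrives after a newer one for the same key is skipped.

```rust
use std::collections::HashMap;

/// Hypothetical backend where completions may arrive in any order.
#[derive(Default)]
struct VersionedStore {
    next_version: HashMap<String, u64>,
    disk: HashMap<String, (u64, Vec<u8>)>,
}

/// A write that has been requested but not yet applied.
struct PendingWrite {
    key: String,
    version: u64,
    data: Vec<u8>,
}

impl VersionedStore {
    /// Called when the write is requested: the per-key version is fixed here.
    fn begin_write(&mut self, key: &str, data: &[u8]) -> PendingWrite {
        let counter = self.next_version.entry(key.to_owned()).or_insert(0);
        *counter += 1;
        PendingWrite { key: key.to_owned(), version: *counter, data: data.to_vec() }
    }

    /// Called when the backend finishes the write.
    /// Returns `false` if a newer write to the same key already landed.
    fn complete_write(&mut self, w: PendingWrite) -> bool {
        let is_stale = self.disk.get(&w.key).map_or(false, |entry| entry.0 >= w.version);
        if is_stale {
            return false;
        }
        self.disk.insert(w.key, (w.version, w.data));
        true
    }

    fn read(&self, key: &str) -> Option<&[u8]> {
        self.disk.get(key).map(|entry| entry.1.as_slice())
    }
}

fn demo() -> (bool, bool, Vec<u8>) {
    let mut store = VersionedStore::default();
    let first = store.begin_write("monitor", b"old");
    let second = store.begin_write("monitor", b"new");
    // Completions arrive out of order: the newer write lands first.
    let second_stored = store.complete_write(second);
    let first_stored = store.complete_write(first);
    (second_stored, first_stored, store.read("monitor").unwrap().to_vec())
}
```

Because counters are kept per key, writes to different keys impose no ordering on each other, which matches the weaker "same key" requirement discussed above rather than a strict total order.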
I was thinking of a write -> read -> write pattern, but I believe we already established that that isn't happening in LDK. We weren't going to do ordering for reads anyway.
> I was thinking of a write -> read -> write pattern,
Hmm, that's indeed a good question, i.e., whether we'd need to deal with interleaving reads also, otherwise we may end up reading data that was written later, actually?
> but I believe we already established that that isn't happening in LDK.
I'm not sure where we established that, but for LDK that def. won't be the case for much longer, as we'll want to migrate to stores that are not completely held in-memory, and we'll read data on-demand on cache failures.
> > I was thinking of a write -> read -> write pattern, but I believe we already established that that isn't happening in LDK. We weren't going to do ordering for reads anyway.
> Hmm, that's indeed a good question, i.e., whether we'd need to deal with interleaving reads also, otherwise we may end up reading data that was written later, actually?
I don't see an issue here - after the storer calls `write`, the data may be in place (ie returned by a call to `read`) and after `write`'s future completes it will be in place. That is implicit in the API, and is in fact required by any similar-looking API - you cannot know what is happening after you start the `write` call, so relying on anything other than the above would obviously be race-y. The same holds for multiple calls to write to the same key.
Changes look mostly good to me, some minor comments.
In preparation for the addition of an async KVStore, we here remove the Persister pseudo-wrapper. The wrapper is thin, would need to be duplicated for async, and KVStore isn't fully abstracted anyway anymore because the sweeper takes it directly.
Fixups look mostly good to me, two nits/comments. Feel free to squash from my side.
lightning/src/util/sweep.rs
Outdated
let (fut, res) = {
    let mut state_lock = self.sweeper_state.lock().unwrap();

    let (res, persist_if_dirty) = callback(&mut state_lock)?;
nit: Maybe calling this `skip_persist` might be more intuitive? I also wonder if adding another `update_state_skipping_persist` method would be cleaner than having the secondary return value on the callback that is only used in one place. But no hard blocker.
`skip_persist` is indeed nicer, made the change.
Another method `update_state_skipping_persist` doesn't work, because the callback may or may not want to skip persist. Also the duplication, or extra abstraction...
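For readers following the `skip_persist` discussion, here is a simplified sketch of the pattern. It is not the PR's code: `Sweeper`, its `u32` state, the `persist` helper and the toy `block_on` are hypothetical, and the callback here returns a plain tuple rather than a `Result`. The callback runs under the lock and decides per call whether to skip persistence; the persist future is created while the lock is held and awaited only after the lock is released.

```rust
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type WriteFuture = Pin<Box<dyn Future<Output = Result<(), io::Error>> + Send + 'static>>;

/// Hypothetical sweeper with a single integer as its state.
struct Sweeper {
    state: Mutex<u32>,
    persisted: Arc<Mutex<Vec<u32>>>,
}

impl Sweeper {
    /// Hypothetical persist step: records the snapshot it was given.
    fn persist(&self, snapshot: u32) -> WriteFuture {
        let persisted = Arc::clone(&self.persisted);
        Box::pin(async move {
            persisted.lock().unwrap().push(snapshot);
            Ok::<(), io::Error>(())
        })
    }

    async fn update_state<R, F>(&self, callback: F) -> Result<R, io::Error>
    where
        F: FnOnce(&mut u32) -> (R, bool),
    {
        let (fut, res) = {
            let mut state_lock = self.state.lock().unwrap();
            let (res, skip_persist) = callback(&mut *state_lock);
            if skip_persist {
                return Ok(res);
            }
            // The snapshot is taken while the lock is still held.
            (self.persist(*state_lock), res)
        }; // Lock is released here, before awaiting.
        fut.await?;
        Ok(res)
    }
}

/// Toy executor: busy-polls, only suitable for futures that complete immediately.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo() -> (u32, Vec<u32>) {
    let sweeper = Sweeper { state: Mutex::new(0), persisted: Arc::new(Mutex::new(Vec::new())) };
    // Mutates the state and asks for persistence.
    block_on(sweeper.update_state(|s| { *s = 7; ((), false) })).unwrap();
    // Only inspects the state, so it skips persistence.
    let seen = block_on(sweeper.update_state(|s| (*s, true))).unwrap();
    let persisted = sweeper.persisted.lock().unwrap().clone();
    (seen, persisted)
}
```

A single method covers both cases because the callback itself decides whether to skip, which is the reason given above for not adding a separate `update_state_skipping_persist`.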
Stripped down version of #3778. It allows background persistence to be async, but channel monitor persistence remains sync. This means that for the time being, users wanting async background persistence would be required to implement both the sync and the async `KVStore` trait. This model is available through `process_events_full_async`. `process_events_async` still takes a synchronous kv store to remain backwards compatible.

Usage in ldk-node: lightningdevkit/ldk-node@main...joostjager:ldk-node:upgrade-to-async-kvstore
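As a rough illustration of what "implement both the sync and the async trait" could look like for a user, here is a sketch with two hypothetical traits, `SyncStore` and `AsyncStore`. These are not LDK's trait definitions (the real traits take namespaces and have more methods); the point is only that one backing store can serve both interfaces, with the async side capturing its data when the future is created.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type WriteFuture = Pin<Box<dyn Future<Output = Result<(), io::Error>> + Send + 'static>>;

/// Hypothetical synchronous interface (e.g. for channel monitor persistence).
trait SyncStore {
    fn write(&self, key: &str, buf: &[u8]) -> Result<(), io::Error>;
    fn read(&self, key: &str) -> Result<Vec<u8>, io::Error>;
}

/// Hypothetical asynchronous interface (e.g. for background persistence).
trait AsyncStore {
    fn write_async(&self, key: &str, buf: &[u8]) -> WriteFuture;
}

/// In-memory store shared by both interfaces.
#[derive(Clone, Default)]
struct MemStore {
    map: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl SyncStore for MemStore {
    fn write(&self, key: &str, buf: &[u8]) -> Result<(), io::Error> {
        self.map.lock().unwrap().insert(key.to_owned(), buf.to_vec());
        Ok(())
    }

    fn read(&self, key: &str) -> Result<Vec<u8>, io::Error> {
        let map = self.map.lock().unwrap();
        map.get(key).cloned().ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such key"))
    }
}

impl AsyncStore for MemStore {
    fn write_async(&self, key: &str, buf: &[u8]) -> WriteFuture {
        // Copy the inputs now so the returned future owns everything it needs.
        let store = self.clone();
        let (key, buf) = (key.to_owned(), buf.to_vec());
        Box::pin(async move { SyncStore::write(&store, &key, &buf) })
    }
}

/// Toy executor: busy-polls, only suitable for futures that complete immediately.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo() -> (Vec<u8>, Vec<u8>) {
    let store = MemStore::default();
    // Sync path and async path write to the same backing map.
    store.write("monitor", b"m1").unwrap();
    block_on(store.write_async("sweeper", b"s1")).unwrap();
    (store.read("monitor").unwrap(), store.read("sweeper").unwrap())
}
```

Delegating the async write to the sync one keeps the sketch short; a real async backend would typically perform non-blocking I/O inside the future instead.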