Sync head and tail using Sparkscan API #270
Conversation
@dangeross that looks good. TBH I am still worried about moving the sync logic to be totally dependent on Sparkscan at the moment.

spark-sdk/crates/breez-sdk/core/src/sdk.rs Line 399 in c331e1e

Now that we have both implementations available to review, can we do a comparison and list the tradeoffs here, so we can make the right decision?
> Now that we have both implementations available to review, can we do a comparison and list the tradeoffs here, so we can make the right decision?

@roeierez the advantages (of using Sparkscan) I see are:

disadvantages:

Maybe there are some I missed; I'll keep thinking about it. But Daniel's implementation looks really nice, I think.
I agree. One more advantage in Daniel's approach is that:

I tend to think we should keep the Sparkscan implementation inactive for now, at least for the first tokens release. Perhaps we will integrate it as an optional sync strategy in later versions?
Force-pushed from 27dbbe3 to 92dc1cb
Force-pushed from b755861 to c73a1ad
Looks good! Left one comment for better clarity.
crates/breez-sdk/core/src/sdk.rs
Outdated
```rust
if SPARKSCAN_ENABLED {
    self.sync_pending_payments().await?;
    self.sync_payments_head_to_storage(&object_repository)
        .await?;
} else {
    self.sync_bitcoin_payments_to_storage(&object_repository)
        .await?;
    self.sync_token_payments_to_storage(&object_repository)
        .await?;
}
```
It would be nice to move these into separate objects outside of sdk.rs, as sync strategies. That seems like it would make the process easier to follow.
I didn't review the PR in detail or in its entirety. Just dropped here to raise the concern about how pending token payments are dealt with when syncing them :)
```rust
const TX_CACHE_KEY: &str = "tx_cache";
const STATIC_DEPOSIT_ADDRESS_CACHE_KEY: &str = "static_deposit_address";

// Old keys (avoid using them)
```
Nit: we can get rid of this
```rust
async fn sync_token_payments_to_storage(
    &self,
    object_repository: &ObjectCacheRepository,
) -> Result<(), SdkError> {
```
Aren't the kind of issues raised here still present in this implementation? I just skimmed through it, and I didn't see any obvious changes that would address them.
Might be missing something though.
The issue wasn't addressed; this sync strategy was just reintroduced. I'll take a look at it today.
The Spark sync strategy for token transactions now uses the payment id of the synced payment to check whether we already have that payment stored in its final state (completed/failed). If it's final, we can assume it's in a fixed position in the token transaction list and, as it's now final, it was previously synced and we've caught up.

Also fixed an issue when sending a token payment: the id was set to the txid:vout, causing the payment to never be updated by the sync and to stay forever pending. Instead we use the TokenOutput id for the payment id by reusing
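The stop-on-finalized check described above can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: `Payment`, `PaymentStatus`, and `stored_status_of` are stand-ins for the SDK's real types and storage lookup.

```rust
// Illustrative stand-ins for the SDK's payment types (not actual SDK code).
#[derive(Clone, Copy, PartialEq, Debug)]
enum PaymentStatus {
    Pending,
    Completed,
    Failed,
}

struct Payment {
    id: String,
    status: PaymentStatus,
}

fn is_final(status: PaymentStatus) -> bool {
    matches!(status, PaymentStatus::Completed | PaymentStatus::Failed)
}

/// Returns the payments from a fetched page that still need syncing.
/// Stops at the first payment already stored in a final state: final
/// payments sit at a fixed position in the token transaction list, so
/// everything older was synced previously and we have caught up.
fn take_until_finalized(
    page: Vec<Payment>,
    stored_status_of: impl Fn(&str) -> Option<PaymentStatus>,
) -> Vec<Payment> {
    let mut to_sync = Vec::new();
    for payment in page {
        if let Some(status) = stored_status_of(&payment.id) {
            if is_final(status) {
                break;
            }
        }
        to_sync.push(payment);
    }
    to_sync
}
```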
crates/breez-sdk/core/src/sdk.rs
Outdated
```rust
    last_sync_time = SystemTime::now();
}

if let Err(e) = sdk.sync_service.sync_historical_payments().await {
```
I see sync_historical_payments is called anyway whenever sync_wallet_internal is called. Perhaps it should be an implementation detail of the sync_service_sync_payments() function, implemented this way only in the sparkscan strategy?
```rust
    "Encountered already finalized payment {}, stopping sync",
    payment.id
);
break 'page_loop;
```
I think there might be a problem here if we start a sync, get 1-2 pages, and then restart. If some payments were made before the restart, the sync will fetch payments that already exist in the db and, according to this logic, may exit early and not sync all the way back.

One way to solve this would be:
- Maintain a payment id (in a separate cached item) that is the newest id we should sync up to (stop when we reach this payment).
- Whenever we reach that payment, update it to be the newest final payment id in the db.
- On first sync it is None, so we should sync until we have no more pages, and then update it to the newest final one.
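The watermark idea suggested above could be sketched like this. All names are hypothetical; `Cache` stands in for the SDK's cached-item storage.

```rust
// Hypothetical stand-in for the separate cached item holding the watermark.
struct Cache {
    sync_up_to: Option<String>,
}

/// Stop paging only when we reach the cached watermark payment,
/// not merely any payment that already exists in the db. With
/// `None` (first sync) we never stop early and page until exhausted.
fn should_stop(cache: &Cache, payment_id: &str) -> bool {
    cache.sync_up_to.as_deref() == Some(payment_id)
}

/// Once the watermark is reached (or pages are exhausted on first
/// sync), advance it to the newest final payment id in the db.
fn advance_watermark(cache: &mut Cache, newest_final_id: Option<String>) {
    cache.sync_up_to = newest_final_id;
}
```

This avoids the restart problem: a partially completed sync cannot move the watermark forward, so the next cycle still pages back far enough.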
Force-pushed from 09078a4 to cc795ce
Left a comment but regardless LGTM
```rust
    async fn sync_payments(&self) -> Result<(), SdkError>;
}

pub enum SyncStrategy {
```
What is the advantage of the SyncStrategy enum vs. just using the SyncService trait?
I believe this gives static dispatch instead of dynamic dispatch.
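For context, the two dispatch styles being compared could look roughly like this. The types are illustrative, not the SDK's actual code.

```rust
trait SyncService {
    fn sync_payments(&self) -> Result<(), String>;
}

struct SparkscanSync;
struct BitcoinTokenSync;

impl SyncService for SparkscanSync {
    fn sync_payments(&self) -> Result<(), String> { Ok(()) }
}
impl SyncService for BitcoinTokenSync {
    fn sync_payments(&self) -> Result<(), String> { Ok(()) }
}

// With an enum, the set of strategies is closed and each call site is
// statically dispatched through the match (no vtable, inlinable), at the
// cost of listing every variant here.
enum SyncStrategy {
    Sparkscan(SparkscanSync),
    BitcoinToken(BitcoinTokenSync),
}

impl SyncStrategy {
    fn sync_payments(&self) -> Result<(), String> {
        match self {
            SyncStrategy::Sparkscan(s) => s.sync_payments(),
            SyncStrategy::BitcoinToken(s) => s.sync_payments(),
        }
    }
}
```

The trait-object alternative (`Box<dyn SyncService>`) keeps the set open to new implementations but pays a vtable indirection per call.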
Nice! I mainly have a concern about uniformity between the sync implementations.
```rust
sync_type_res = sync_trigger_receiver.recv() => {
    if let Ok(sync_type) = sync_type_res {
        info!("Sync trigger changed: {:?}", &sync_type);

        if let Err(e) = sdk.sync_wallet_internal(sync_type.clone()).await {
            error!("Failed to sync wallet: {e:?}");
        } else if matches!(sync_type, SyncType::Full) {
            last_sync_time = SystemTime::now();
        }
    }
}
```
These changes make the awaited sync_wallet_internal future uncancellable by shutdown_receiver. Is that intentional? It can lead to memory leak issues on Node.js.
```rust
// Group outputs by owner public key
let mut outputs_by_owner = std::collections::HashMap::new();
for output in &transaction.outputs {
    outputs_by_owner
        .entry(output.owner_public_key)
        .or_insert_with(Vec::new)
        .push(output);
}
```
The sparkscan implementation doesn't group outputs and will create a payment for each output, even if it pays the same recipient. I decided to make that simplification when switching to sparkscan, but now that we want to keep both implementations, we should make this behavior uniform.

Also, for sparkscan we don't have access to token output ids to use as payment ids as we do here. I think it may be fine not to fix this for now if we document it appropriately as a TODO. Until then, one can't switch sync implementations within the same instance, as it would lead to duplicate payments.
Good catch. If we keep both implementations we need them to use the same id.
Updated to not group outputs and to unify the payment id as the token tx hash. It's only appended with the ":vout" if there are multiple outputs to/from the user (depending on direction): 6c6e075
Good idea. That way we have compatible ids in most cases. There will still be edge cases where the ids may not match (multiple outputs to same recipient) because sparkscan doesn't expose the vout.
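The id scheme agreed on above could be sketched as follows. The helper name and shape are hypothetical, not the PR's actual code; `user_vouts` is assumed to hold the vouts of the user's outputs in the relevant direction.

```rust
/// Illustrative sketch: the token tx hash alone is the payment id when
/// the user has a single relevant output (keeping ids compatible across
/// both sync implementations); with multiple outputs we disambiguate by
/// appending ":vout".
fn payment_ids(tx_hash: &str, user_vouts: &[u32]) -> Vec<String> {
    match user_vouts {
        [_single] => vec![tx_hash.to_string()],
        many => many.iter().map(|v| format!("{tx_hash}:{v}")).collect(),
    }
}
```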
Force-pushed from 782945b to 11a7a8f
I just have a bunch of questions, and I will ask more in the future. I first need to understand what head sync and tail sync are, I think.
```rust
    address_transactions,
    ssp_user_requests,
}) = self
    .fetch_address_transactions_with_ssp_user_requests(
```
In my mind, whenever there was a request to the server, the resulting payments should be stored so that you never have to request them again. What is the reason we fetch transactions in batches, accumulate them, and then store them? Is it to make it easier to reason about head sync? Either you completed it entirely or you haven't completed it at all?
Perhaps it could be improved. I think it ensures we don't overwrite the last_synced_payment_id with an older payment id, so we collect them first, then order them. If we hit a server issue/API rate limit, we just move on to processing what we have.

> Either you completed it entirely or you haven't completed it at all?

Do you mean in terms of the whole sync or the sync of a single payment?
> Do you mean in terms of the whole sync or the sync of a single payment?

I mean in terms of sync progress. You have either synced it entirely or not at all, according to last_synced_payment_id.
```rust
}

// Insert what synced payments we have into storage from oldest to newest
payments_to_sync.sort_by_key(|p| p.timestamp);
```
Does the order we insert payments matter?
We at least need to know the newest synced payment at the end for the last_synced_payment_id, and they are not necessarily in order (see the fn comment). The loop below inserts the payments chronologically, also storing the last_synced_payment_id for each inserted payment.
```rust
}) = self
    .fetch_address_transactions_with_ssp_user_requests(
        &legacy_spark_address,
        next_offset,
```
About sparkscan: So a 0 offset gets you the latest transactions? Isn't that racy? You must always assume the index of the latest payment you've fetched could have increased in the meantime. Do we know why it's like this?
```rust
let mut next_offset = self
    .storage
    .list_payments(None, None, Some(PaymentStatus::Completed))
    .await?
    .len() as u64;
```
Is this correct? Because you could have synced completed payments from the head as well as the tail? Should it be a cached number instead?
Co-authored-by: Daniel Granhão <[email protected]>
Force-pushed from c4df9b9 to 64c384c
Force-pushed from 9182f6b to 9c70b27
Closing this in favor of the base #199 PR, which was also changed to roll back to the bitcoin/token scan, but without keeping the sparkscan implementation. The sparkscan implementation from here can be found in
Splits the `sync_payments_to_storage()` fn into `sync_payments_head_to_storage()` and `sync_payments_tail_to_storage()` so it can sync both the head and tail.

The first head sync syncs 1 page, then lets the tail sync do the rest. The following head syncs will attempt to sync until the `last_synced_payment_id` is found. If we get an error from the Sparkscan API (maybe we hit a request limit), we store what payments we have and cache the `next_head_offset` to start the head sync from in the next cycle.

The tail sync syncs a maximum of 5 pages per cycle, starting at an offset counting the stored completed payments. Once there are no more pages available, we set `tail_synced` and won't try to sync it again.

Update: This also temporarily disables using Sparkscan to sync and reverts to using @danielgranhao's bitcoin/token sync.
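The head/tail split described above can be sketched roughly as follows. This is a hypothetical illustration under stated assumptions, not the PR's actual code: names, the `pages_fetched` counter, and the page shape are all stand-ins.

```rust
const MAX_TAIL_PAGES: usize = 5;

// Illustrative stand-in for the cached sync state.
struct SyncState {
    last_synced_payment_id: Option<String>,
    tail_synced: bool,
}

/// Head sync stops once the last synced payment id shows up in a fetched
/// page. On the very first sync there is no marker yet, so it stops after
/// a single page and lets the tail sync catch up on history.
fn head_sync_done(state: &SyncState, page_ids: &[String], pages_fetched: usize) -> bool {
    match &state.last_synced_payment_id {
        None => pages_fetched >= 1,
        Some(id) => page_ids.iter().any(|p| p == id),
    }
}

/// Tail sync runs at most MAX_TAIL_PAGES pages per cycle and stops for
/// good once tail_synced is set.
fn tail_page_budget(state: &SyncState) -> usize {
    if state.tail_synced { 0 } else { MAX_TAIL_PAGES }
}
```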