Conversation

@dangeross (Collaborator) commented Sep 23, 2025

Splits the sync_payments_to_storage() fn into sync_payments_head_to_storage() and sync_payments_tail_to_storage() so that both the head and the tail of the payment history can be synced.

The first head sync syncs one page, then lets the tail sync do the rest. Subsequent head syncs will attempt to sync until the last_synced_payment_id is found. If we get an error from the Sparkscan API (e.g. we hit a request limit), we store what payments we have and cache the next_head_offset to start the head sync from in the next cycle.

The tail sync syncs a maximum of 5 pages per cycle, starting at an offset equal to the number of stored completed payments. Once there are no more pages available, we set tail_synced and won't try to sync the tail again.
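The cycle described above can be sketched roughly as follows. This is a minimal model, not the SDK's code: payments are bare ids, pages come back newest-first, and the error/next_head_offset caching path is omitted; all names (`SyncState`, `fetch_page`, `MAX_TAIL_PAGES`) are hypothetical.

```rust
const PAGE_SIZE: usize = 3;
const MAX_TAIL_PAGES: usize = 5;

#[derive(Default)]
struct SyncState {
    last_synced_payment_id: Option<u32>,
    tail_synced: bool,
    stored: Vec<u32>, // payments we have persisted
}

// Newest-first pagination over the full remote history.
fn fetch_page(remote: &[u32], offset: usize) -> &[u32] {
    let end = (offset + PAGE_SIZE).min(remote.len());
    remote.get(offset..end).unwrap_or(&[])
}

// Head sync: the first run takes one page only; later runs page forward
// until last_synced_payment_id is found.
fn sync_head(remote: &[u32], state: &mut SyncState) {
    let first_run = state.last_synced_payment_id.is_none();
    let mut offset = 0;
    let mut new = Vec::new();
    'pages: loop {
        let page = fetch_page(remote, offset);
        if page.is_empty() {
            break;
        }
        for &id in page {
            if state.last_synced_payment_id == Some(id) {
                break 'pages; // caught up with the previous head sync
            }
            new.push(id);
        }
        offset += page.len();
        if first_run {
            break; // one page only; the tail sync does the rest
        }
    }
    if let Some(&newest) = new.first() {
        state.last_synced_payment_id = Some(newest);
    }
    state.stored.extend(new);
}

// Tail sync: at most MAX_TAIL_PAGES per cycle, offset counts stored
// payments; an empty page means we reached the end of history.
fn sync_tail(remote: &[u32], state: &mut SyncState) {
    for _ in 0..MAX_TAIL_PAGES {
        let page = fetch_page(remote, state.stored.len());
        if page.is_empty() {
            state.tail_synced = true;
            return;
        }
        state.stored.extend_from_slice(page);
    }
}
```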

Update: this also temporarily disables using Sparkscan to sync and reverts to using @danielgranhao's bitcoin/token sync.

@roeierez (Member) commented

@dangeross that looks good. TBH I'm still worried about making the sync logic totally dependent on Sparkscan ATM.
I went back to look at Daniel's sync version (before Sparkscan was introduced) here:

async fn sync_token_payments_to_storage(

Now that we have both implementations available to review, can we do a comparison and list the tradeoffs here so we can make the right decision?

@dangeross (Collaborator, Author) commented

> Now that we have both implementations available to review, can we do a comparison and list the tradeoffs here so we can make the right decision?

@roeierez the advantages of using Sparkscan that I see are:

  • Single sync for both bitcoin/token payments
  • Token payment sync doesn't require a second parent tx query per result (includes all needed data)
  • Ordering of results doesn't change

Disadvantages:

  • Rate limits
  • Dependent on an external service

Maybe there are some I missed; I'll keep thinking about it. But Daniel's implementation looks really nice, I think.

@roeierez (Member) commented

I agree. Two more advantages of Daniel's approach:

  1. We are able to keep the full payments history.
  2. Those who don't use tokens don't run into the rate limit issues, so UX isn't compromised.

I tend to think we should keep the Sparkscan implementation inactive for now, at least for the first tokens release. Perhaps we can integrate it as an optional sync strategy in later versions?

@dangeross dangeross force-pushed the daniel-breez-sdk-token-support branch from 27dbbe3 to 92dc1cb on September 24, 2025 12:21
@dangeross dangeross force-pushed the savage-sparkscan-bidirectional-sync branch 2 times, most recently from b755861 to c73a1ad on September 24, 2025 13:03
@roeierez (Member) left a comment


Looks good! Left one comment for better clarity.

Comment on lines 346 to 355
if SPARKSCAN_ENABLED {
    self.sync_pending_payments().await?;
    self.sync_payments_head_to_storage(&object_repository)
        .await?;
} else {
    self.sync_bitcoin_payments_to_storage(&object_repository)
        .await?;
    self.sync_token_payments_to_storage(&object_repository)
        .await?;
}
@roeierez (Member)

It would be nice to have these in separate objects outside of sdk.rs, as sync strategies. It seems that would make the process easier to follow.

@dangeross dangeross marked this pull request as ready for review September 24, 2025 17:57
@danielgranhao (Collaborator) left a comment

I didn't review the PR in detail or in its entirety. Just dropped here to raise the concern about how pending token payments are dealt with when syncing them :)

const TX_CACHE_KEY: &str = "tx_cache";
const STATIC_DEPOSIT_ADDRESS_CACHE_KEY: &str = "static_deposit_address";

// Old keys (avoid using them)
Collaborator

Nit: we can get rid of this

Comment on lines +110 to +112
async fn sync_token_payments_to_storage(
    &self,
    object_repository: &ObjectCacheRepository,
) -> Result<(), SdkError> {
Collaborator

Aren't the kinds of issues raised here still present in this implementation? I just skimmed through it, and I didn't see any obvious changes that would address them.

Might be missing something though.

@dangeross (Collaborator, Author)

The issue wasn't addressed; this sync strategy was just reintroduced. I'll take a look at it today.

@dangeross (Collaborator, Author) commented

The Spark sync strategy for token transactions now uses the payment id of the synced payment to check whether we already have that payment stored in its final state (completed/failed). If it's final, we can assume it's in a fixed position in the token transaction list, and since it's final it was previously synced, so we've caught up.

Fixed an issue where, when sending a token payment, the id was set to the txid:vout, causing the payment to never be updated in the sync and to stay pending forever. Instead we use the TokenOutput id as the payment id, reusing token_transaction_to_payments() in the same way as the sync does.
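The stop rule described above can be sketched as follows (hypothetical names, assuming newest-first paging; `Status` and `ids_to_sync` are illustrative, not SDK types):

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Status {
    Pending,
    Completed,
    Failed,
}

fn is_final(s: Status) -> bool {
    matches!(s, Status::Completed | Status::Failed)
}

// Walk newest-first ids; seeing a payment we already stored in a final
// state means everything older was synced before, so we can stop there.
fn ids_to_sync(incoming: &[&str], stored: &HashMap<&str, Status>) -> Vec<String> {
    let mut out = Vec::new();
    for &id in incoming {
        if stored.get(id).copied().map_or(false, is_final) {
            break; // already final: fixed position, caught up
        }
        out.push(id.to_string());
    }
    out
}
```

Note a still-pending stored payment does not stop the walk, so it gets re-synced and can pick up its final state.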

last_sync_time = SystemTime::now();
}

if let Err(e) = sdk.sync_service.sync_historical_payments().await {
@roeierez (Member)

I see sync_historical_payments is called anyway whenever sync_wallet_internal is called. Perhaps it should be an implementation detail of the sync_service_sync_payments() function, implemented this way only in the Sparkscan strategy?

"Encountered already finalized payment {}, stopping sync",
payment.id
);
break 'page_loop;
@roeierez (Member)

I think there might be a problem here if we started a sync, got 1-2 pages, and then restarted.
If some payments were made before we restarted, the sync will fetch some payments that already exist in the db and, according to this logic, may exit early and not sync all the way back.
One way to solve this is to track the newest payment id that we should sync up to (and then stop):

  • Maintain a payment id (in a separate cached item) that is the newest id we should sync up to (stop when we reach this payment).
  • Whenever we reach that payment, update it to be the newest final payment id in the db.
  • On first sync it is None, so we should sync until we have no more pages, then update it to the newest final one.
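A minimal sketch of the suggested marker (illustrative names, not SDK code; the paging loop itself is elided):

```rust
struct HeadSync {
    // Cached separately from last_synced_payment_id: the newest id we
    // must reach before stopping.
    sync_up_to: Option<String>,
}

impl HeadSync {
    // Checked per fetched payment (newest first): stop only when the
    // marker is reached, never just because a payment already exists.
    fn should_stop_at(&self, id: &str) -> bool {
        self.sync_up_to.as_deref() == Some(id)
    }

    // Called when the marker is reached (or, on first sync, when we run
    // out of pages): advance it to the newest final payment id in the db.
    fn complete(&mut self, newest_final_in_db: Option<String>) {
        self.sync_up_to = newest_final_in_db;
    }
}
```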

@dangeross dangeross force-pushed the savage-sparkscan-bidirectional-sync branch 2 times, most recently from 09078a4 to cc795ce on September 26, 2025 15:20
@roeierez (Member) left a comment


Left a comment, but regardless LGTM.

async fn sync_payments(&self) -> Result<(), SdkError>;
}

pub enum SyncStrategy {
@roeierez (Member)

What is the advantage of the SyncStrategy enum vs. just using the SyncService trait?

@dangeross (Collaborator, Author)

I believe this gives us static instead of dynamic dispatch.
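For context, a minimal illustration of the difference (not the SDK's actual types): an enum dispatches statically through a match, while a trait object goes through a vtable at runtime.

```rust
trait SyncService {
    fn name(&self) -> &'static str;
}

struct SparkscanSync;
struct NativeSync;

impl SyncService for SparkscanSync {
    fn name(&self) -> &'static str { "sparkscan" }
}
impl SyncService for NativeSync {
    fn name(&self) -> &'static str { "native" }
}

// Enum: the concrete type is known at each match arm, so calls can be
// resolved at compile time and inlined; no heap allocation or vtable.
enum SyncStrategy {
    Sparkscan(SparkscanSync),
    Native(NativeSync),
}

impl SyncStrategy {
    fn name(&self) -> &'static str {
        match self {
            SyncStrategy::Sparkscan(s) => s.name(),
            SyncStrategy::Native(s) => s.name(),
        }
    }
}

// Trait object: a single code path, but each call goes through a vtable.
fn dyn_name(s: &dyn SyncService) -> &'static str {
    s.name()
}
```

The trade-off: the enum must enumerate all strategies up front, while the trait object stays open to new implementations.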

@danielgranhao (Collaborator) left a comment

Nice! I mainly have a concern about uniformity between the sync implementations.

Comment on lines +260 to +270
sync_type_res = sync_trigger_receiver.recv() => {
    if let Ok(sync_type) = sync_type_res {
        info!("Sync trigger changed: {:?}", &sync_type);

        if let Err(e) = sdk.sync_wallet_internal(sync_type.clone()).await {
            error!("Failed to sync wallet: {e:?}");
        } else if matches!(sync_type, SyncType::Full) {
            last_sync_time = SystemTime::now();
        }
    }
}
Collaborator

These changes make the awaited sync_wallet_internal future uncancellable by shutdown_receiver. Is that intentional? It can lead to memory leak issues on Node.js.

Comment on lines 50 to 57
// Group outputs by owner public key
let mut outputs_by_owner = std::collections::HashMap::new();
for output in &transaction.outputs {
    outputs_by_owner
        .entry(output.owner_public_key)
        .or_insert_with(Vec::new)
        .push(output);
}
Collaborator

The Sparkscan implementation doesn't group outputs and will create a payment for each output, even if they pay to the same recipient. I decided to make that simplification when switching to Sparkscan, but now that we want to keep both implementations, we should make this behavior uniform.

Also, for Sparkscan we don't have access to token output ids to use as payment ids as we do here. I think it may be fine not to fix this for now if we document it appropriately as a TODO. Until then, one can't switch sync implementations in the same instance, as doing so will lead to duplicate payments.

@roeierez (Member)

Good catch. If we keep both implementations we need them to use the same id.

@dangeross (Collaborator, Author)

Updated to not group outputs and to unify the payment id as the token tx hash. The ":vout" suffix is only appended if there are multiple outputs to/from the user (depending on direction): 6c6e075
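A sketch of the id scheme as described (the `payment_ids` helper and its inputs are hypothetical; the real code derives the relevant vouts from the transaction):

```rust
// One payment per user-relevant output: the bare tx hash when there is a
// single output, "<hash>:<vout>" per output when there are several.
fn payment_ids(tx_hash: &str, user_vouts: &[u32]) -> Vec<String> {
    match user_vouts {
        [_single] => vec![tx_hash.to_string()],
        many => many.iter().map(|v| format!("{tx_hash}:{v}")).collect(),
    }
}
```

This keeps the single-output case (the common one) compatible with ids derived from sources that don't expose the vout.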

Collaborator

Good idea. That way we have compatible ids in most cases. There will still be edge cases where the ids may not match (multiple outputs to the same recipient) because Sparkscan doesn't expose the vout.

@danielgranhao danielgranhao force-pushed the daniel-breez-sdk-token-support branch from 782945b to 11a7a8f on October 2, 2025 08:37
@JssDWt (Collaborator) left a comment

I just have a bunch of questions; I will ask more in the future. I first need to understand what head sync and tail sync are, I think.

address_transactions,
ssp_user_requests,
}) = self
.fetch_address_transactions_with_ssp_user_requests(
@JssDWt (Collaborator)

In my mind, whenever there was a request to the server, the resulting payments should be stored so that you never have to request them again. What is the reason we fetch transactions in batches, accumulate them, and then store them? Is it to make it easier to reason about head sync? Either you completed it entirely or you haven't completed it at all?

@dangeross (Collaborator, Author)

Perhaps it could be improved. I think it ensures we don't overwrite the last_synced_payment_id with an older payment id: we collect the payments first, then order them. If we hit a server issue/API rate limit, we just move on to processing what we have.

Either you completed it entirely or you haven't completed it at all?

Do you mean in terms of the whole sync or the sync of a single payment?

@JssDWt (Collaborator)

Do you mean in terms of the whole sync or the sync of a single payment?

I mean in terms of sync progress. You have either synced it entirely or none at all, according to last_synced_payment_id.

}

// Insert what synced payments we have into storage from oldest to newest
payments_to_sync.sort_by_key(|p| p.timestamp);
@JssDWt (Collaborator)

Does the order we insert payments matter?

@dangeross (Collaborator, Author)

We at least need to know the newest synced payment at the end, for the last_synced_payment_id, and the payments are not necessarily in order (see the fn comment). The loop below inserts the payments chronologically, also storing the last_synced_payment_id for each inserted payment.
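A minimal sketch of that ordering logic (illustrative names, with the storage insertion stubbed out):

```rust
struct Payment {
    id: String,
    timestamp: u64,
}

// Sort the collected payments by timestamp, insert oldest to newest, and
// record each inserted id as the running last_synced_payment_id; it ends
// as the newest payment's id.
fn insert_chronologically(mut payments: Vec<Payment>) -> Option<String> {
    payments.sort_by_key(|p| p.timestamp);
    let mut last_synced_payment_id = None;
    for p in payments {
        // storage.insert_payment(&p) would go here in the real code
        last_synced_payment_id = Some(p.id);
    }
    last_synced_payment_id
}
```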

}) = self
.fetch_address_transactions_with_ssp_user_requests(
&legacy_spark_address,
next_offset,
@JssDWt (Collaborator)

About Sparkscan: so a 0 offset gets you the latest transactions? Isn't that racy? You must always assume the index of the latest payment you've fetched could have increased in the meantime. Do we know why it's like this?

Comment on lines +357 to +361
let mut next_offset = self
    .storage
    .list_payments(None, None, Some(PaymentStatus::Completed))
    .await?
    .len() as u64;
@JssDWt (Collaborator)

Is this correct? You could have synced completed payments from the head as well as the tail. Should it be a cached number instead?

@dangeross dangeross force-pushed the savage-sparkscan-bidirectional-sync branch from c4df9b9 to 64c384c on October 6, 2025 07:42
@danielgranhao danielgranhao force-pushed the daniel-breez-sdk-token-support branch from 9182f6b to 9c70b27 on October 6, 2025 12:08
@danielgranhao (Collaborator) commented

Closing this in favor of the base #199 PR, which was also changed to roll back to bitcoin/token scan, but without keeping the Sparkscan implementation. The Sparkscan implementation from here can be found in sparkscan-sync in case it's helpful as a reference in the future.

@danielgranhao danielgranhao deleted the savage-sparkscan-bidirectional-sync branch October 6, 2025 17:16