Related to #275.
This is an extension of the original refactor (#135) that gave us contexts, with the intent of making our APIs support multiple SPs in single calls.
Currently we:
```ts
const context = await synapse.storage.createContext()
// and then
await synapse.storage.upload(blob, { context })
```
In `synapse.storage` (`StorageManager`), the `upload()`, `download()` and even `terminateDataSet()` (which should be renamed to `terminate()`) functions should be extended so that their `context` argument can be an array of contexts. Then the operations will be performed on all contexts in ~parallel.
```ts
const context1 = await synapse.storage.createContext()
const context2 = await synapse.storage.createContext()
// and then
await synapse.storage.upload(blob, { context: [context1, context2] })
```
Further, we should make it easy to create multiple contexts, and document this as the golden path for getting set up:
```ts
const contexts = await synapse.storage.createContexts({ count: 2 }) // we can make the default `2` so the option is not even necessary
await synapse.storage.upload(blob, { contexts })
```
This becomes our default way of setting up and uploading. `createContext()` (singular) becomes an advanced operation; devs are encouraged to just `createContexts()` and they're up and going, with duplicate operations built in.
There are some challenges here:
### Context creation
`createContexts()` could have the following options object:
```ts
export interface CreateContextsOptions {
  /** Number of contexts to create (optional, defaults to 2) */
  count?: number
  /**
   * Specific provider IDs to use (if not using providerAddresses)
   * Must be no longer than count (optional)
   */
  providerIds?: number[]
  /**
   * Specific provider addresses to use (optional)
   * If not using providerIds
   * Must be no longer than count
   */
  providerAddresses?: string[]
  /**
   * Specific data set IDs to use (optional)
   * Cannot be used with provider options
   * Must be no longer than count
   */
  dataSetIds?: number[]
  /**
   * Custom metadata for the data sets (key-value pairs)
   * When smart-selecting data sets, this metadata will be used to match.
   */
  metadata?: Record<string, string>
  /** Force creation of new data sets, even if candidates exist */
  forceCreateDataSets?: boolean
  /** Callbacks for creation process (will need to change to handle multiples) */
  callbacks?: StorageCreationCallbacks
  /** Maximum number of uploads to process in a single batch (default: 32, minimum: 1) */
  uploadBatchSize?: number
}
```
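For illustration, usage could look something like this, assuming `createContexts()` lands with the shape above (the values are made up):

```ts
// Hypothetical usage of the proposed options; values are illustrative only
const contexts = await synapse.storage.createContexts({
  count: 2,
  metadata: { app: 'my-app' },  // used to match existing data sets (or tag new ones)
  forceCreateDataSets: false    // reuse matching data sets where possible
})
```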
- By default we can use the smart provider selection, but we have to be able to land on separate providers when we create multiple contexts. This could involve extending the smart selection to `exclude` a list of providers so that it smart-selects any but the one(s) you've listed. So, in the case of a fresh wallet with no data sets, we run it once, get a provider, run it again with an `exclude` for that provider, and we now have two separate providers (a sketch of this loop follows the list).
- We have to support the various cases of existing data sets - there may be enough that match metadata requirements, or there may be fewer than `count`, so we need to make more.
- We currently have a limitation of having to pipeline data set creation due to the `clientDataSetId` nonce in the contract. @wjmelements is working on an EIP-3009-style non-sequential nonce solution for us to make this parallel, but for now, we can just queue them up and run them one by one.
- Callbacks for context creation will need tweaking to handle multiples (`StorageCreationCallbacks` above).
- How to handle failures - do we fall back automatically to trying again, excluding the provider that failed?
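As a rough sketch of how these pieces could fit together, here is one possible shape for the `createContexts()` loop. `selectProvider()` and `createContextForProvider()` are hypothetical helper names, not existing SDK APIs, and a real implementation would also need to handle existing data sets, metadata matching, callbacks and failure fallback:

```ts
// Sketch only: selectProvider() and createContextForProvider() are assumed
// helper names, not existing SDK APIs.
async function createContexts (
  synapse: Synapse,
  { count = 2, metadata }: { count?: number, metadata?: Record<string, string> } = {}
): Promise<StorageContext[]> {
  const contexts: StorageContext[] = []
  const exclude: number[] = [] // provider IDs we've already landed on

  for (let i = 0; i < count; i++) {
    // Smart-select any provider except those already chosen
    const provider = await selectProvider(synapse, { exclude, metadata })
    exclude.push(provider.id)

    // Data set creation stays pipelined (one at a time) because of the
    // sequential clientDataSetId nonce in the contract
    contexts.push(await createContextForProvider(synapse, provider, { metadata }))
  }

  return contexts
}
```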
### Upload
Uploads can be done in parallel without any problems, and we only have per-data-set nonces, so we can even submit AddRoots in parallel for different providers. But there are some nuances:
- Callbacks may need to be adjusted to account for multiples - we could present the operations as if they were singular, but exposing the multiples would give users the opportunity to be more granular in their UX feedback.
- Error conditions will need some care - we could just start off with basic failure notifications, but how do we communicate partial failure, and is there any remedial action we can/should take? We could at least advise the user to try again with only the context(s) that failed, but it'll be a bit knotty.
- CommP should only be calculated once! The streaming calculation in #280 (feat(pdp): new PieceCID-last flow, internal only) only needs to be done for one of the upload streams, and the resulting CommP can be used to finalize all of the uploads.
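A hedged sketch of what the multi-context upload path could look like; `calculatePieceCid()` and a per-context `uploadTo()` method are illustrative names, not the real API:

```ts
// Illustrative only: calculatePieceCid() and context.uploadTo() are assumed names
async function uploadToAll (blob: Uint8Array, contexts: StorageContext[]): Promise<void> {
  // Calculate CommP (PieceCID) once, rather than once per upload stream
  const pieceCid = await calculatePieceCid(blob)

  // Nonces are per data set, so uploads for different providers can run in parallel
  const results = await Promise.allSettled(
    contexts.map(context => context.uploadTo(blob, { pieceCid }))
  )

  // Surface partial failure so the caller can retry with only the failed contexts
  const failed = contexts.filter((_, i) => results[i].status === 'rejected')
  if (failed.length > 0) {
    throw new Error(`upload failed for ${failed.length} of ${contexts.length} contexts`)
  }
}
```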
### Download
This is nice because you're simply saying "try downloading from any of these providers", and that's an easy operation of just racing to find the first to return a response body and cancelling the rest. We have code that does most of this already.
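For illustration, the race could look something like this; a `context.download()` method that accepts an abort signal is an assumed shape, not necessarily the existing API:

```ts
// Sketch only: context.download(pieceCid, { signal }) is an assumed shape
async function downloadFromAny (pieceCid: string, contexts: StorageContext[]): Promise<Uint8Array> {
  const controllers = contexts.map(() => new AbortController())
  try {
    // Resolve with the first provider to return a response body
    return await Promise.any(
      contexts.map((context, i) =>
        context.download(pieceCid, { signal: controllers[i].signal })
      )
    )
  } finally {
    // Cancel whatever is still in flight once we have a winner (or all have failed)
    controllers.forEach(c => c.abort())
  }
}
```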