This repository was archived by the owner on Mar 28, 2023. It is now read-only.


feat: Call provide endpoint in batches. #64

Closed
ajnavarro wants to merge 1 commit into main from feature/provide-batch

Conversation

@ajnavarro
Member

This will avoid huge provide payloads.

It closes #55

Signed-off-by: Antonio Navarro Perez <antnavper@gmail.com>
@ajnavarro ajnavarro self-assigned this Nov 8, 2022
@ajnavarro ajnavarro changed the title Call provide endpoint in batches. feat: Call provide endpoint in batches. Nov 8, 2022
Member

@lidel lidel left a comment


Quick nits:

  1. A batch size based on item count is not enough; we may need a second limit based on byte size (see comment below for why).
  2. The server side does not enforce batch size limits, which is a DoS vector.
  3. Limits should be documented in the specs (ipfs/specs#337).

}

if c.ProvideBatchSize == 0 {
c.ProvideBatchSize = 30000 // this will generate payloads of ~1MB in size
Member


nit (1): this comes with various assumptions about the average size of a single item, which will quickly get outdated when we have WebTransport enabled by default (/webtransport multiaddrs with two /certhash segments will balloon the size of the batch beyond the initial estimate), or when we add more transports in the future.

In other places, such as UnixFS autosharding, we've moved away from ballpark counting items assuming they are of some arbitrary average size, and switched to calculating the total size of the final block.

Thoughts on switching to byte size, or having two limits? ProvideBatchSize (the current one) and ProvideBatchByteSize (a new one).
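The two-limit idea suggested here could be sketched roughly as follows. This is a minimal illustration, not the actual client code: `batchDual`, `maxItems`, and `maxBytes` are hypothetical names, echoing the proposed ProvideBatchSize / ProvideBatchByteSize pair.

```go
package main

import "fmt"

// batchDual cuts a new batch when adding the next key would exceed
// either the item-count limit or the byte-size limit.
// Hypothetical sketch of the dual-limit idea; not the real client API.
func batchDual(keys [][]byte, maxItems, maxBytes int) [][][]byte {
	var batches [][][]byte
	var cur [][]byte
	curBytes := 0
	for _, k := range keys {
		if len(cur) > 0 && (len(cur) >= maxItems || curBytes+len(k) > maxBytes) {
			batches = append(batches, cur)
			cur, curBytes = nil, 0
		}
		cur = append(cur, k)
		curBytes += len(k)
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// Three 40-byte keys with a 100-byte cap: the third key would push
	// the first batch to 120 bytes, so it starts a second batch.
	keys := [][]byte{make([]byte, 40), make([]byte, 40), make([]byte, 40)}
	fmt.Println(len(batchDual(keys, 10, 100))) // prints 2
}
```

The byte limit guards against oversized payloads even when multiaddrs grow, while the count limit keeps a hard upper bound on items per request.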

Member Author


nit (1): this comes with various assumptions about the average size of a single item, which will quickly get outdated when we have WebTransport enabled by default (/webtransport multiaddrs with two /certhash segments will balloon the size of the batch beyond the initial estimate), or when we add more transports in the future.

Then we would need to add a specific limit on the number of multiaddrs allowed. But that is not the problem right now.

Checking the raw byte size of the payload would make the code harder to read and understand. In my opinion, it is not worth it (having a payload of ~2MB instead of ~900KB because we added more multiaddresses).
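The count-based batching defended here amounts to slicing the key list into fixed-size chunks. A minimal sketch, with hypothetical names (`chunk` is illustrative, not the actual client function):

```go
package main

import "fmt"

// chunk splits keys into batches of at most batchSize items each,
// mirroring the count-based ProvideBatchSize limit discussed in this PR.
// batchSize must be > 0. Illustrative sketch, not the real client code.
func chunk(keys []string, batchSize int) [][]string {
	var batches [][]string
	for len(keys) > 0 {
		n := batchSize
		if len(keys) < n {
			n = len(keys)
		}
		batches = append(batches, keys[:n])
		keys = keys[n:]
	}
	return batches
}

func main() {
	// 7 keys with a batch size of 3 produce batches of 3, 3, and 1.
	fmt.Println(len(chunk(make([]string, 7), 3))) // prints 3
}
```

Each resulting batch would then be sent as a separate provide request, so no single payload grows beyond the per-batch estimate.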

Contributor

@ischasny ischasny left a comment


LGTM. Thanks for implementing.

@ajnavarro
Member Author

@lidel:

  1. batch size based on item count is not enough, we may need a second one based on byte size (see comment below why)

See my comment above on why I think checking byte size is not worth it in this case.

  2. server side does not enforce batch size limits, which is a DoS vector

It is up to the implementation to limit it.
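If an implementation does want to enforce the limit server-side, the check is a small guard before processing. A sketch under assumed names (`validateProvide` and `maxProvideBatch` are hypothetical; the real limit would come from the spec or server config):

```go
package main

import (
	"errors"
	"fmt"
)

// maxProvideBatch is a hypothetical server-side cap on items per provide
// request; rejecting oversized batches up front addresses the DoS concern
// raised above.
const maxProvideBatch = 30000

// validateProvide rejects a provide request whose batch exceeds the cap.
func validateProvide(keys []string) error {
	if len(keys) > maxProvideBatch {
		return errors.New("provide batch exceeds server limit")
	}
	return nil
}

func main() {
	err := validateProvide(make([]string, maxProvideBatch+1))
	fmt.Println(err != nil) // prints true: oversized batch is rejected
}
```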

  3. limits should be documented in specs

They are here: https://github.com/ipfs/specs/blob/main/reframe/REFRAME_KNOWN_METHODS.md#provide



Development

Successfully merging this pull request may close these issues.

Split provide messages (message size limits)

4 participants