I'm confused about EmptyOutboard
#59
Comments
Can you explain a little bit why you're reaching for this very low-level crate instead of using iroh-blobs? If iroh-blobs does too much for your taste, and bao-tree provides too many features you don't need (such as streaming sub-ranges), have you tried looking into using the bao crate?
Outboards are for caching the implicit merkle tree from blake3-hashing a blob.
Thanks for the response!
I was not considering using blobs, as it seems too high level for me. I have a one-off need to embed some larger data in an RPC, and I'd like to have it incrementally verified. I started with
Half of my point is to give feedback about the API. When I look at bao documentation it says:
which sounds to me like "an outboard" is the verification data about the data being sent, and not some cache/database/provider. So the name and description of this trait confuse me. E.g. why do I need it at all while writing the data out? Isn't it the job of the encoding to calculate these? And how can "decoding" work with
Now that I understand what it does, maybe it should say "not return any known parts of the outboard"? I think I have some understanding why now, but the documentation did not make it clear. Maybe
"Encoding and decoding will query the implementation of the trait for already known parts of the outboard to avoid recalculating it"?
Seems like on the sending side, the outboard needs to be initialized and can't be the empty one, otherwise the receiving side will reject the message, but only if it's larger than a block.
Hi. Sorry, have been traveling and am only now catching up. So EmptyOutboard is just a black hole implementation that you can pass in to e.g. decode_ranges if you don't want to store an outboard at all. E.g. you are a pure receiver and have no interest in storing the outboard because you don't want to share the data, or you are going to download the entire data anyway, so you can recompute the outboard at any time. The reason why EmptyOutboard has to have the hash and the tree is that decode_ranges needs to know the hash to be able to verify that the incoming data is correct, and the size to know what chunks to expect from the stream. As you have discovered yourself, if you use an EmptyOutboard on the send side things will just fail. You need both outboard and data to encode a verified stream. There is one exception: for anything shorter than a chunk group there is no need for an outboard, so it will work.
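The small-blob exception follows from counting tree nodes: an outboard only holds parent nodes, and a blob that fits into a single chunk group has none. Here is a minimal sketch of that arithmetic. It is not the bao-tree API; `leaf_count` and `outboard_len` are hypothetical helper names, and it assumes 1024-byte BLAKE3 chunks, 64 bytes per stored parent (two 32-byte child hashes), and no length header in the outboard.

```rust
/// Size of one BLAKE3 chunk in bytes.
const CHUNK_LEN: u64 = 1024;
/// Assumed size of one stored parent node: two 32-byte child hashes.
const PARENT_LEN: u64 = 64;

/// Number of leaves when the data is split into chunk groups of
/// 2^chunk_log chunks each. Hypothetical helper, not part of bao-tree.
fn leaf_count(data_len: u64, chunk_log: u8) -> u64 {
    let group_len = CHUNK_LEN << chunk_log;
    let full = data_len / group_len;
    let partial = u64::from(data_len % group_len != 0);
    // Empty data still counts as one (empty) leaf.
    (full + partial).max(1)
}

/// Bytes of parent hashes an outboard would have to hold. A binary tree
/// with n leaves has n - 1 parents. Hypothetical helper, not part of bao-tree.
fn outboard_len(data_len: u64, chunk_log: u8) -> u64 {
    (leaf_count(data_len, chunk_log) - 1) * PARENT_LEN
}
```

Assuming 16 KiB chunk groups (`chunk_log = 4`), `outboard_len(16 * 1024, 4)` is 0 while `outboard_len(16 * 1024 + 1, 4)` is 64. That is consistent with an empty outboard on the send side only failing once the message is larger than one block.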
Ha. Quite relevant: https://www.iroh.computer/blog/blob-store-design-challenges
The documentation says an Outboard:
OK, so it's like a small database for the purpose of tracking things?
It confuses me because I thought "outboard" was the "extra data being written to verify chunks". So the relationship between "that data" and "the implementation of how this data is being tracked" is conflated(?) here.
But then what is EmptyOutboard? How does it work then? How can anything get done when all hashes are 0? Does it mean extra btree information is not recorded at all during sending, and nothing gets verified during receiving?
Why does it even need hash then? Isn't it going to discard everything anyway? And why can't it just calculate it from the data? Is it because the data is given whole, but sending can be done only partially?
I guess maybe what I'm asking for is some nicer utility API.
I want to send a bunch of data over iroh (whole). The other side expecting it has a hash, and it would be great if it stopped receiving any data as soon as it realizes that it's being fed lies. That's the whole point of using verified incremental hashing.
I don't want to send chunks, and I don't want to worry about the outboard. I would suspect this usage is going to be the most common, so it might deserve a helper API.
For reference, right now I have the following for sending and receiving data:
and I came up with it using the current documentation. It would be good to know if I shot myself in the foot somewhere, or whether I am doing the right thing. How is the verification information (is this the/determined by the outboard trait?) transmitted here BTW?
Is this the same with transmitting data over network stream? Always first?