Hi, author of rdedup here. I'm currently using rsroll, and it looks like your crate is quite well polished compared to rsroll. I wonder whether hash-roll has any benefits that are not immediately obvious.
I am currently on an optimization mission for rdedup, and the rolling checksum is sometimes a bottleneck that is not easily parallelizable.
I have filed some PRs optimizing performance: https://github.com/aidanhs/rsroll/pulls . First, I think some of that work could be redone for hash-roll (especially Gear). Second, are you planning to add benchmarks?
I am planning to implement FastCDC soon, as it seems to be the state of the art: https://www.usenix.org/node/196197 . If you are aware of anything better (especially faster), please let me know. If you're interested in helping, e.g. by implementing this stuff in hash-roll or rdedup (I am actually not yet sure at which layer some of it will live), that would be awesome, and please let me know. Just asking and trying to be welcoming, no pressure though.