hash-roll vs rsroll, benchmarks, gear CDC, and some other questions #1

@dpc

Hi, author of rdedup here. I'm currently using rsroll, and it looks like your crate is quite well polished compared to rsroll. I wonder if there are any not-immediately-obvious benefits of hash-roll.

I am currently on an optimization mission for rdedup, and the rolling checksum is sometimes a bottleneck that is not easily parallelizable.

I have filed some PRs optimizing the performance: https://github.com/aidanhs/rsroll/pulls . First, I think some of that work could be redone for hash-roll (especially Gear). Second, are you planning to add benchmarks?

I am planning to implement FastCDC soon, as it seems to be the state of the art: https://www.usenix.org/node/196197 . If you are aware of anything better (especially faster), please let me know. If you're interested in helping, e.g. by implementing this stuff in hash-roll or rdedup (I am actually not yet sure at which layer some of this stuff will live), that would be awesome; please let me know. Just asking and trying to be welcoming, no pressure though.
