Add API so that Repository write functions only compute hash once #2516

@jonmeow

Summary 💡

write_blob looks like it computes the hash twice. Could a variant of write_buf/write_stream be added that avoids the second hash computation?

In detail, on Repository, I believe write_blob will:

  1. Call compute_hash to produce a hash.
  2. Call exists with the hash for a possible early exit.
  3. Call write_buf, which calls write_stream, which again computes a hash.

The hashes computed by compute_hash (in write_blob) and write_stream should be identical, implying write_stream could avoid a hash computation and instead rely on the earlier value.
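
To make the idea concrete, here is a minimal sketch of the flow described above. It uses a toy in-memory store, not gitoxide's actual types or signatures. `ToyStore`, `Id`, `write_buf_with_known_id`, and both `write_blob_*` helpers are hypothetical names made up for illustration, and `DefaultHasher` stands in for the real object hash. It only shows that passing the already-computed id down removes the second hash computation when the object is new.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hypothetical stand-in for an object id; a real store would use a SHA-1/SHA-256 id.
type Id = u64;

/// Toy in-memory object store (an assumption, not gitoxide's API).
/// It counts how often a hash is computed.
#[derive(Default)]
struct ToyStore {
    objects: HashMap<Id, Vec<u8>>,
    hash_calls: usize,
}

impl ToyStore {
    /// Hash the data and record that a hash computation happened.
    fn compute_hash(&mut self, data: &[u8]) -> Id {
        self.hash_calls += 1;
        let mut hasher = DefaultHasher::new();
        hasher.write(data);
        hasher.finish()
    }

    fn exists(&self, id: Id) -> bool {
        self.objects.contains_key(&id)
    }

    /// Mirrors step 3 of the list above: the write path hashes the data itself.
    fn write_buf(&mut self, data: &[u8]) -> Id {
        let id = self.compute_hash(data);
        self.write_buf_with_known_id(data, id)
    }

    /// Hypothetical variant: trusts the caller-supplied id and skips hashing.
    fn write_buf_with_known_id(&mut self, data: &[u8], id: Id) -> Id {
        self.objects.entry(id).or_insert_with(|| data.to_vec());
        id
    }

    /// Flow as described in the issue: hash, check existence, then hash again on write.
    fn write_blob_twice_hashed(&mut self, data: &[u8]) -> Id {
        let id = self.compute_hash(data);
        if self.exists(id) {
            return id;
        }
        self.write_buf(data)
    }

    /// Proposed flow: hash once and hand the id to the write path.
    fn write_blob_once_hashed(&mut self, data: &[u8]) -> Id {
        let id = self.compute_hash(data);
        if self.exists(id) {
            return id;
        }
        self.write_buf_with_known_id(data, id)
    }
}

/// Writes one new blob into a fresh store and reports how many hashes were computed.
fn hash_calls_for_new_blob(reuse_hash: bool) -> usize {
    let mut store = ToyStore::default();
    if reuse_hash {
        store.write_blob_once_hashed(b"hello");
    } else {
        store.write_blob_twice_hashed(b"hello");
    }
    store.hash_calls
}

fn main() {
    println!("current flow: {} hash computations", hash_calls_for_new_blob(false));
    println!("reusing the id: {} hash computation", hash_calls_for_new_blob(true));
}
```

One trade-off of a variant like this is that the write path has to trust the caller's id, so it would probably need to be documented as requiring an id produced by compute_hash over the same bytes.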

Motivation 🔦

I was looking at this while examining the write_blob implementation as part of digging into performance for jj-vcs/jj#9304, and I'd probably use it in the related code. I think it'd be an incremental performance improvement. If you're supportive of this idea, I'd be happy to contribute a PR for it.

Labels: enhancement
