Summary 💡
write_blob looks like it computes the hash twice. Could a variant of write_buf/write_stream be added that avoids the second hash computation?
In more detail: on Repository, I believe write_blob will:
- Call compute_hash to produce a hash.
- Call exists with the hash for a possible early exit.
- Call write_buf, which calls write_stream, which again computes a hash.
The hashes computed by compute_hash (in write_blob) and write_stream should be identical, implying write_stream could avoid a hash computation and instead rely on the earlier value.
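To make the shape of the request concrete, here is a minimal sketch using a toy in-memory store rather than the real object database. Everything in it is hypothetical: `ToyStore`, `write_buf_with_known_id`, the two `write_blob_*` helpers, and the use of the standard library's `DefaultHasher` in place of a real object hash are all assumptions for illustration, not the library's actual API or implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy stand-in for an object id; a real store would use SHA-1 or SHA-256.
type Id = u64;

/// Hypothetical in-memory object store that counts hash computations.
#[derive(Default)]
struct ToyStore {
    objects: HashMap<Id, Vec<u8>>,
    hash_calls: usize,
}

impl ToyStore {
    /// Hash the data and record that a hash computation happened.
    fn compute_hash(&mut self, data: &[u8]) -> Id {
        self.hash_calls += 1;
        let mut hasher = DefaultHasher::new();
        data.hash(&mut hasher);
        hasher.finish()
    }

    fn exists(&self, id: Id) -> bool {
        self.objects.contains_key(&id)
    }

    /// Mirrors the flow described above: the write path hashes internally.
    fn write_buf(&mut self, data: &[u8]) -> Id {
        let id = self.compute_hash(data);
        self.objects.insert(id, data.to_vec());
        id
    }

    /// Hypothetical variant: trusts the caller-supplied id and skips hashing.
    fn write_buf_with_known_id(&mut self, id: Id, data: &[u8]) -> Id {
        self.objects.insert(id, data.to_vec());
        id
    }

    /// Flow as described in this issue: hash, check, then hash again on write.
    fn write_blob_twice_hashed(&mut self, data: &[u8]) -> Id {
        let id = self.compute_hash(data);
        if self.exists(id) {
            return id;
        }
        self.write_buf(data)
    }

    /// Proposed flow: hash once, check, then reuse the id for the write.
    fn write_blob_once_hashed(&mut self, data: &[u8]) -> Id {
        let id = self.compute_hash(data);
        if self.exists(id) {
            return id;
        }
        self.write_buf_with_known_id(id, data)
    }
}
```

In this toy model, writing a new blob through `write_blob_twice_hashed` bumps `hash_calls` by two, while `write_blob_once_hashed` bumps it by one and yields the same id. The trade-off is that the variant has to trust the caller to pass an id that really matches the data.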
Motivation 🔦
I was looking at this while examining the write_blob implementation as part of digging into performance for jj-vcs/jj#9304, and I'd probably use it in the related code. I think it'd be an incremental performance improvement. If you're supportive of this idea, I'd be happy to contribute a PR for it.