Skip to content

Conversation

@ahfuzhang
Copy link
Contributor

@ahfuzhang ahfuzhang commented Oct 28, 2025

Describe Your Changes

  1. Use bitwise division code
  2. Remove array out-of-bounds checks
  3. Use bitwise division to speed up when the bitmap is a power of 2

Checklist

The following checks are mandatory:

@ahfuzhang ahfuzhang marked this pull request as draft October 29, 2025 01:20
@ahfuzhang ahfuzhang marked this pull request as ready for review October 29, 2025 01:21
@ahfuzhang
Copy link
Contributor Author

@valyala Could you review this PR ? thank you.

@valyala
Copy link
Contributor

valyala commented Oct 30, 2025

@ahfuzhang , could you provide benchmark results for these performance improvements? Please add missing benchmarks in a separate pull request if needed, so software engineers could validate performance improvements by running go test -bench=... command. This concerns all the pull requests with performance optimizations.


import (
"fmt"
bitutil "math/bits"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't a good idea to use aliases for packages, since it complicates investigating and reading the code. It is better to rename local variables, which clash with the package names instead.

bf.bits = bits
}

func roundUpPowerOf2(x int64) int64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function isn't used, so it must be removed. Also, it might be better from the readability and performance PoV to use bits.Len64() instead of the manual calculations (if this function will be used in the future):

func roundUpPowerOf(x int64) int64 {
  return int64(1 << bits.Len(uint64(x)))
}


func initBloomFilter(bits, hashes []uint64) {
maxBits := uint64(len(bits)) * 64
maxBits := uint64(len(bits)) << 6 // x<<6 means x*64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization is performed by Go compiler, so it isn't needed - it just complicates reading and maintaining the code.

maxBits := uint64(len(bits)) * 64
maxBits := uint64(len(bits)) << 6 // x<<6 means x*64
ptr := unsafe.Pointer(unsafe.SliceData(bits))
if bitutil.OnesCount64(maxBits) == 1 { // Determine whether it is a power of 2
Copy link
Contributor

@valyala valyala Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fast path will be triggered in very rare cases, so its' usefulness is questionable. Suppose the average size of the bloom filter is 32*128 bits. Then this optimization will be triggered only 6 out of 31 times - for (64*1, 64*2, 64*4, 64*8, 64*16, 64*32) bit lengths, or in 6/31=19% of cases. It isn't worth complicating the code for such cases. Additionally, more code in hot paths means the CPU must fit more code into instruction cache. This may lead to slowdowns if the needed instructions cannot fit the instruction cache.

Comment on lines +150 to +151
i := idx >> 6
j := idx & 63
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These optimizations aren't needed - Go compiler performs them on itself.

@ahfuzhang
Copy link
Contributor Author

Thank you very much.

My optimization wasn't rigorous enough, so I'll cancel it for now.

Thank you for your time.

@ahfuzhang ahfuzhang closed this Nov 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants