Performance and resource usage; some tips are not strictly Go related

Consider using a time.Sleep in the default case of a select statement (one with multiple channel read/write cases) to avoid burning CPU re-checking the cases in a busy loop
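
A minimal sketch of that pattern, assuming a polling loop where work may not always be ready (the channel names and the sleep duration are illustrative):

func worker(jobs <-chan int, results chan<- int, done <-chan struct{}) {
	for {
		select {
		case j := <-jobs:
			results <- j * 2
		case <-done:
			return
		default:
			// nothing ready: sleep briefly instead of spinning at full CPU
			time.Sleep(10 * time.Millisecond)
		}
	}
}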

Substring and memory leaks (100 Go Mistakes #41)

If a substring needs to be extracted and saved, make a copy with strings.Clone so that the resulting substring does not keep the full original string's backing array alive (do not just keep subString := originalHugeString[:10])
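
A minimal sketch, assuming only the first 10 bytes of a large string need to be kept:

func firstTen(originalHugeString string) string {
	sub := originalHugeString[:10] // still shares the huge backing array
	return strings.Clone(sub)      // copies just the 10 bytes that are kept
}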

Nil Channel (100 Go Mistakes #66)

Once a channel has been closed and no longer needs to be read, assign nil to it so that its case in a select statement is never chosen again: a receive from a merely closed channel keeps succeeding immediately, yielding the zero value and false.

	// merge loop: forward values from ch1 and ch2 into ch until both are closed
	for ch1 != nil || ch2 != nil {
		select {
		case v, open := <-ch1:
			if !open {
				ch1 = nil // disable this case: a nil channel blocks forever
				break
			}
			ch <- v
		case v, open := <-ch2:
			if !open {
				ch2 = nil // disable this case: a nil channel blocks forever
				break
			}
			ch <- v
		}
	}

time.After memory leaks (100 Go Mistakes #76)

func consumer(ch <-chan Event) {
	for {
		select {
		case event := <-ch:
			handle(event)
		case <-time.After(time.Hour):
			log.Println("warning: no messages received")
		}
	}
}

On each loop iteration time.After returns a new channel and starts a new timer; its resources are released only when the timeout expires, so if many events arrive before that, these timers pile up and can cause a memory leak.

A solution is to instantiate a single *time.Timer with time.NewTimer and reuse it with the Reset method:

func consumer(ch <-chan Event) {
	timer := time.NewTimer(time.Hour)
	defer timer.Stop()
	for {
		timer.Reset(time.Hour)
		select {
		case event := <-ch:
			handle(event)
		case <-timer.C:
			log.Println("warning: no messages received")
		}
	}
}

Reduce allocations (100 Go Mistakes #96 and go-perfbook)

  • Prefer the share-down approach (the caller passes memory down for the callee to fill, as in the first Reader below) to prevent values from automatically escaping to the heap
// Share down
type Reader interface {
	Read(p []byte) (n int, err error)
}

// Share up
type Reader interface {
	Read(n int) (p []byte, err error)
}
  • Allow passing in buffers so the caller can reuse them and the slice can be modified in place (see the append-style sketch after this list)
  • Use error variables instead of errors.New() / fmt.Errorf() at the call site (performance or style? the error interface stores a pointer, so it escapes to the heap anyway)
  • Use strconv instead of fmt if possible
  • Use strings.EqualFold(str1, str2) instead of strings.ToLower(str1) == strings.ToLower(str2) or strings.ToUpper(str1) == strings.ToUpper(str2) to efficiently compare strings if possible.
  • Use m[string(yourByteSlice)] directly when accessing a map[string]any (or any string-keyed map): the compiler avoids allocating for the inline conversion
  • Use sync.Pool to reuse already allocated memory
	var pool = sync.Pool{
		New: func() any {
			return make([]byte, 1024)
		},
	}

	write := func(w io.Writer) {
		buffer := pool.Get().([]byte) // reuse a previously allocated slice when available
		buffer = buffer[:0]           // reset length, keep capacity
		defer pool.Put(buffer)        // hand the slice back to the pool when done
		// append to buffer and write it to w
	}
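
A sketch of the buffer-passing idea from the list above: the caller owns and reuses the buffer, the callee only appends to it. The AppendUint32 function is illustrative, not a standard API:

// AppendUint32 appends the big-endian encoding of v to dst and returns
// the extended slice, so the caller can reuse one buffer across calls.
func AppendUint32(dst []byte, v uint32) []byte {
	return append(dst, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

func encodeAll(values []uint32) {
	buf := make([]byte, 0, 64)
	for _, v := range values {
		buf = AppendUint32(buf[:0], v) // same backing array on every iteration
		// write or consume buf here
	}
}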

No Allocation (Go Optimizations 101)

  • If the input slice is allowed to be mutated, filter it in place and avoid allocations.
func check(v int) bool {
	return v%2 == 0
}

func FilterOneAllocation(data []int) []int {
	var r = make([]int, 0, len(data))
	for _, v := range data {
		if check(v) {
			r = append(r, v)
		}
	}
	return r
}

func FilterNoAllocations(data []int) []int {
	var k = 0
	for i, v := range data {
		if check(v) {
			data[i] = data[k]
			data[k] = v
			k++
		}
	}
	return data[:k]
}
  • strings.EqualFold(a, b) is more performant than strings.ToLower(a) == strings.ToLower(b)

Index tables vs maps (Go Optimizations 101)

MapSwitch is around 10 times slower than IfElse and IndexTable.

func f() {} // placeholder targets for the dispatch examples below
func g() {}

func IfElse(x bool) func() {
	if x {
		return f
	} else {
		return g
	}
}

var m = map[bool]func(){true: f, false: g}

func MapSwitch(x bool) func() {
	return m[x]
}

func b2i(b bool) int {
	if b {
		return 1
	}
	return 0
}

var a = [2]func(){g, f}

func IndexTable(x bool) func() {
	return a[b2i(x)]
}

Select Read with channels (Go Optimizations 101)

If possible, reduce the number of channel-read cases in a select statement by merging the channels into one and differentiating messages by a field. See select_test.go.

# go version go1.21.2 linux/amd64
go test -bench=. select_test.go

Benchmark_Select_OneCase-12             57681339                20.64 ns/op
Benchmark_Select_TwoCases-12            23804878                49.43 ns/op
Benchmark_Select_OneNil-12              35653648                33.27 ns/op
Benchmark_TwoChannels-12                 7241138               165.4 ns/op
Benchmark_OneChannel_Interface-12       10793074               110.0 ns/op
Benchmark_OneChannel_Struct-12          10981497               112.5 ns/op
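
A sketch of the merged-channel idea (the event type and its kind field are illustrative; the real benchmarks live in select_test.go):

type event struct {
	kind  int // distinguishes what used to arrive on separate channels
	value int
}

func consume(ch <-chan event) {
	for e := range ch {
		switch e.kind {
		case 1:
			// handle what would have been read from the first channel
		case 2:
			// handle what would have been read from the second channel
		}
	}
}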

Calling interface methods (Go Optimizations 101)

Calling an interface method incurs some extra cost (dynamic dispatch usually prevents inlining). See interface_test.go.

# go version go1.21.2 linux/amd64
go test -bench=. interface_test.go

Benchmark_Add_Inline-12         1000000000               0.3437 ns/op
Benchmark_Add_NotInlined-12     775364438                1.545 ns/op
Benchmark_Add_Interface-12      771482826                1.557 ns/op
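
A sketch of the kind of comparison behind these numbers (the Adder interface and adder struct are illustrative, not the actual contents of interface_test.go):

type Adder interface {
	Add(a, b int) int
}

type adder struct{}

func (adder) Add(a, b int) int { return a + b }

func directCall(a adder) int {
	return a.Add(1, 2) // static dispatch: can be inlined
}

func interfaceCall(a Adder) int {
	return a.Add(1, 2) // dynamic dispatch through the itab: usually not inlined
}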

Pre-Allocate (Efficient Go, Chapter 11)

If the size of a slice is known, allocate the needed memory up front. append re-allocates (and copies) whenever the length would exceed the slice's capacity.

func ReadAll1(r io.Reader, size int) ([]byte, error) {
	buf := bytes.Buffer{}
	buf.Grow(size)
	n, err := io.Copy(&buf, r)
	return buf.Bytes()[:n], err
}

func ReadAll2(r io.Reader, size int) ([]byte, error) {
	buf := make([]byte, size)
	n, err := io.ReadFull(r, buf)
	if err == io.EOF {
		err = nil
	}
	return buf[:n], err
}

Both ReadAll1 and ReadAll2 are faster than io.ReadAll from the standard library, but the byte slice size must be known in advance.

An example is extracting the Content-Length header from an HTTP response to preallocate the slice needed to read the body.
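
A sketch of that idea, assuming a *http.Response whose ContentLength is known (unknown lengths fall back to io.ReadAll):

func readBody(resp *http.Response) ([]byte, error) {
	defer resp.Body.Close()
	if resp.ContentLength > 0 {
		buf := make([]byte, resp.ContentLength)
		n, err := io.ReadFull(resp.Body, buf)
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			err = nil
		}
		return buf[:n], err
	}
	return io.ReadAll(resp.Body) // length unknown: fall back
}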

For selection problems (e.g. picking the least-loaded of many candidates), make two random picks and choose the better of the two ("power of two random choices") instead of scanning the whole list
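
A minimal sketch of the two-random-choices idea, here picking the less-loaded of two randomly chosen workers (the loads slice is illustrative):

func pickWorker(loads []int) int {
	a := rand.Intn(len(loads)) // two random candidates instead of
	b := rand.Intn(len(loads)) // scanning the whole slice
	if loads[b] < loads[a] {
		return b
	}
	return a
}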

Instead of looping once over a huge list, divide it into smaller batches / buffers and loop over them in two levels: each batch / buffer (covering a limited index range) fits into cache / faster memory

// bitmap is assumed to be a bitset such as *bitset.BitSet
// (github.com/bits-and-blooms/bitset); NextSetMany fills buffer with the
// indexes of the next set bits starting at j.
buffer := make([]uint, 256)
j := uint(0)
j, buffer = bitmap.NextSetMany(j, buffer)
for ; len(buffer) > 0; j, buffer = bitmap.NextSetMany(j, buffer) {
	for k := range buffer {
		// do something with buffer[k]
	}
	j += 1
}

Scan a slice by keeping track of an offset
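
One possible reading of this tip, as a sketch: advance an offset while parsing instead of repeatedly copying sub-slices (the record layout, one length byte followed by a payload, is illustrative):

func scanRecords(data []byte, handle func([]byte)) {
	off := 0
	for off < len(data) {
		n := int(data[off])
		off++
		if off+n > len(data) {
			return // truncated record
		}
		handle(data[off : off+n]) // a view into data, no copy
		off += n
	}
}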

Reading materials

Tools

Examples