-
Notifications
You must be signed in to change notification settings - Fork 129
[tstate] inline byte to string conversions #174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Sam Batschelet <[email protected]>
Signed-off-by: Sam Batschelet <[email protected]>
fe7f852
to
3002fd4
Compare
Signed-off-by: Sam Batschelet <[email protected]>
Signed-off-by: Sam Batschelet <[email protected]>
ts.scopeStorage = storage | ||
} | ||
|
||
// checkScope returns whether [k] is in ts.readScope. | ||
func (ts *TState) checkScope(_ context.Context, k []byte) bool { | ||
for _, s := range ts.scope { | ||
// TODO: benchmark and see if creating map is worth overhead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this very much depends on the size of the set. since lookup is O(n) its reasonable to have a map if we might have 8+ keys. I added the change for ref but we can revert the commit if we feel the set size would remain similar to tokenvm (4).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although the lookup is much faster for map it does increase the alloc in FetchAndSetScope
by a fair amount. As this set grows into possibly hundreds the time gained on lookup will be probably overshadowed by memory bloat/cpu time.
before
goos: linux
goarch: amd64
pkg: github.com/ava-labs/hypersdk/tstate
cpu: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
BenchmarkFetchAndSetScope/fetch_and_set_scope_4_keys_with_length_65-12 894080 1408 ns/op 800 B/op 7 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_8_keys_with_length_65-12 687463 2001 ns/op 1473 B/op 11 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_16_keys_with_length_65-12 372249 3031 ns/op 2833 B/op 19 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_32_keys_with_length_65-12 243076 5262 ns/op 5424 B/op 35 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_64_keys_with_length_65-12 110935 14345 ns/op 11416 B/op 68 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_128_keys_with_length_65-12 42150 26776 ns/op 22680 B/op 132 allocs/op
after
goos: linux
goarch: amd64
pkg: github.com/ava-labs/hypersdk/tstate
cpu: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
BenchmarkFetchAndSetScope/fetch_and_set_scope_4_keys_with_length_65-12 716866 1802 ns/op 1392 B/op 14 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_8_keys_with_length_65-12 404772 2923 ns/op 2529 B/op 22 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_16_keys_with_length_65-12 282474 4318 ns/op 4824 B/op 38 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_32_keys_with_length_65-12 157590 7921 ns/op 9284 B/op 70 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_64_keys_with_length_65-12 83341 15155 ns/op 19376 B/op 136 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_128_keys_with_length_65-12 42186 40291 ns/op 38448 B/op 264 allocs/op
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverting for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see the cost of lookup as the set increases.
BenchmarkCheckScope/set_scope_4_keys_with_length_65-12 55952859 21.66 ns/op 0 B/op 0 allocs/op
BenchmarkCheckScope/set_scope_8_keys_with_length_65-12 30695019 39.45 ns/op 0 B/op 0 allocs/op
BenchmarkCheckScope/set_scope_16_keys_with_length_65-12 14245750 84.93 ns/op 0 B/op 0 allocs/op
BenchmarkCheckScope/set_scope_32_keys_with_length_65-12 7016282 160.0 ns/op 0 B/op 0 allocs/op
BenchmarkCheckScope/set_scope_64_keys_with_length_65-12 3868280 286.7 ns/op 0 B/op 0 allocs/op
BenchmarkCheckScope/set_scope_128_keys_with_length_65-12 2166471 547.1 ns/op 0 B/op 0 allocs/op
8f89b1a
to
f687764
Compare
Signed-off-by: Sam Batschelet <[email protected]>
Signed-off-by: Sam Batschelet <[email protected]>
We should do the same for: https://github.com/ava-labs/hypersdk/blob/main/chain/processor.go |
Signed-off-by: Sam Batschelet <[email protected]>
This PR has become stale because it has been open for 30 days with no activity. Adding the |
I explored a few alternative approaches and generally found this PR does provide some small gains that could be considered as an incremental performance improvement. During research I found that using a 64bit hash such as
xxhash64
for the map key was a large performance improvement this is a the approach taken with for examplefast cache
. But since we also need to handle hash collisions the complexity reduced performance until it was not worth the effort at small scale.This approach is documented here https://pthevenet.com/posts/programming/go/bytesliceindexedmaps/#optimization local results from the benchmark confirm.
before
after
benchvset-after.cpu.txt
[1] https://github.com/VictoriaMetrics/fastcache