VM performance improvements in function calls #832
Conversation
Force-pushed from 93986dc to d030d1e.
func estimateFnArgsCount(program *Program) int {
    // Implementation note: a program will not necessarily go through all
    // operations, but this is just an estimation
    var count int
    for _, op := range program.Bytecode {
        if int(op) < len(opArgLenEstimation) {
            count += opArgLenEstimation[op]
        }
    }
    return count
}

var opArgLenEstimation = [...]int{
    OpCall1: 1,
    OpCall2: 2,
    OpCall3: 3,
    // we don't know exactly but we know at least 4, so be conservative as this
    // is only an optimization and we also want to avoid excessive preallocation
    OpCallN: 4,
    // here we don't know either, but we can guess it could be common to receive
    // up to 3 arguments in a function
    OpCallFast: 3,
    OpCallSafe: 3,
}
I initially used a switch in estimateFnArgsCount, but then tried this table and got a 4% improvement in speed.
However, as you can see, the array has 56 elements and only 6 of them are used. I preferred it this way because I think the code looks clearer. If you would rather have the table hold exactly the number of entries it needs, the following change is all that is required (you can apply this suggestion as-is if you want; I tested this exact code and it is also correctly formatted with spaces):
Suggested change:

func estimateFnArgsCount(program *Program) int {
    // Implementation note: a program will not necessarily go through all
    // operations, but this is just an estimation
    var count int
    for _, op := range program.Bytecode {
        op -= OpCall1 // if it underflows, the value only becomes bigger, so the bounds check below still rejects it
        if int(op) < len(opArgLenEstimation) {
            count += opArgLenEstimation[op]
        }
    }
    return count
}

var opArgLenEstimation = [...]int{
    OpCall1 - OpCall1: 1,
    OpCall2 - OpCall1: 2,
    OpCall3 - OpCall1: 3,
    // we don't know exactly but we know at least 4, so be conservative as this
    // is only an optimization and we also want to avoid excessive preallocation
    OpCallN - OpCall1: 4,
    // here we don't know either, but we can guess it could be common to receive
    // up to 3 arguments in a function
    OpCallFast - OpCall1: 3,
    OpCallSafe - OpCall1: 3,
}
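
For reference on the 4% figure mentioned above, the two lookup strategies can be compared with a standalone Go micro-benchmark along these lines. This is only a sketch: the Opcode type, the constant values, the 56-opcode space, and the synthetic bytecode below are made-up stand-ins for the real expr definitions, so its numbers will not match the PR's measurements.

package lookupbench

import (
    "math/rand"
    "testing"
)

// Opcode and the constants below are hypothetical stand-ins for the real
// expr opcodes; only their relative layout matters for this comparison.
type Opcode byte

const (
    OpCall1 Opcode = 40 + iota
    OpCall2
    OpCall3
    OpCallN
    OpCallFast
    OpCallSafe
)

var opArgLenTable = [...]int{
    OpCall1:    1,
    OpCall2:    2,
    OpCall3:    3,
    OpCallN:    4,
    OpCallFast: 3,
    OpCallSafe: 3,
}

// Synthetic bytecode spread over a 56-opcode space, with a fixed seed so
// both benchmarks see the same input.
var bytecode = func() []Opcode {
    r := rand.New(rand.NewSource(1))
    b := make([]Opcode, 4096)
    for i := range b {
        b[i] = Opcode(r.Intn(56))
    }
    return b
}()

var sink int // keeps the compiler from optimizing the loops away

func BenchmarkSwitchEstimate(b *testing.B) {
    for i := 0; i < b.N; i++ {
        count := 0
        for _, op := range bytecode {
            switch op {
            case OpCall1:
                count++
            case OpCall2:
                count += 2
            case OpCall3, OpCallFast, OpCallSafe:
                count += 3
            case OpCallN:
                count += 4
            }
        }
        sink = count
    }
}

func BenchmarkTableEstimate(b *testing.B) {
    for i := 0; i < b.N; i++ {
        count := 0
        for _, op := range bytecode {
            if int(op) < len(opArgLenTable) {
                count += opArgLenTable[op]
            }
        }
        sink = count
    }
}

Running go test -bench=Estimate then reports ns/op for the two variants side by side on the synthetic input.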
Force-pushed from 1e5e3d8 to 2fb1f53.
Improve memory handling of function arguments in vm.VM by preallocating a single slice to hold all the arguments for all the function calls. This is based on an estimation made by inspecting the program's bytecode.
There are several points to argue here, of course:

- One could use program.Arguments to get the exact number of arguments being passed. Answer: while this is true, it adds a little more computation, and an estimation works fairly well for most cases. We gain ~5% speed by making an estimation, and it is likely to be good enough in many situations.

In general, this optimization works well for many simple and common use cases and doesn't affect other cases.
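
To make the preallocation idea concrete, here is a minimal sketch of the general mechanism. It is not the PR's actual code: argsBuffer, take, the fallback allocation, and the hard-coded estimate are all illustrative choices; the real change derives its estimate from estimateFnArgsCount shown above.

package main

import "fmt"

// argsBuffer carves per-call argument slices out of one preallocated backing
// slice instead of allocating a fresh []any for every function call.
type argsBuffer struct {
    buf  []any
    next int
}

func newArgsBuffer(estimate int) *argsBuffer {
    return &argsBuffer{buf: make([]any, estimate)}
}

// take returns a window of length n for the next call's arguments. While the
// estimate holds, the window reuses the preallocated backing array; if the
// estimate turns out to be too small, it falls back to a normal allocation.
func (a *argsBuffer) take(n int) []any {
    if a.next+n > len(a.buf) {
        return make([]any, n) // estimate exhausted: plain allocation
    }
    // Full slice expression: the window's capacity equals n, so appending to
    // one call's arguments cannot spill into the next call's window.
    s := a.buf[a.next : a.next+n : a.next+n]
    a.next += n
    return s
}

func main() {
    // Suppose the bytecode scan estimated 5 argument slots in total.
    args := newArgsBuffer(5)

    a := args.take(2) // e.g. an OpCall2
    a[0], a[1] = "x", 1

    b := args.take(3) // e.g. an OpCall3
    b[0], b[1], b[2] = 1, 2, 3

    fmt.Println(a, b) // [x 1] [1 2 3]
}

The single make call is the point of the change: with a good estimate, all call arguments in a program run share one allocation.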
Benchmark results: