Optimize GDScriptInstance::notification for better performance #94118
|
I see |
|
@CrazyRoka Can you share your test project? I'm curious to run the profiler on my machine before approving your PR. |
I've just created a new repository with my test project. Have a look: https://github.com/CrazyRoka/godot-benchmark-moving-cubes/tree/main |
adamscott
left a comment
This is quite an improvement. I tested out @CrazyRoka's test repo and I've seen an improvement of ~61% over one minute of benchmark testing. I've only tested once, so don't take my results too seriously.
I would wait for @RandomShaper's input, though.
|
First, as a disclaimer, note that I haven't reviewed whether the logic is exactly equivalent, but it has been reviewed by others, and I guess any failure would be pretty noticeable. Now, I think the changes are very good. If anything, I'd use |
|
Yeah this should use LocalVector. |
|
I'd be interested in how much each of the changes contributes to the difference. Specifically, I'm somewhat skeptical of the thread_local use. The rest is harmless, I suppose, but if thread_local contributes a significant share of the speedup, it's of course worth it. |
aaronp64
left a comment
I don't think we can use thread_local here, as one notification could cause another to be called before it's finished. The second call would overwrite the values in the thread_local vector from the first call while the first call is still iterating through them. For example, the code below prints "Base1 Base2 Child2 Child1" on master, but "Base1 Base2 Child2 Child2" with this change: the second inheritance chain overwrites the first between the Base1 and Child1 calls, so Child1 is replaced with Child2:
func _ready():
    var child1 := Child1.new()
    child1.notification(NOTIFICATION_ENABLED)
class Base1 extends Node:
    func _notification(what: int) -> void:
        if what == NOTIFICATION_ENABLED:
            print("Base1")
            var child2 := Child2.new()
            child2.notification(NOTIFICATION_ENABLED)
class Child1 extends Base1:
    func _notification(what: int) -> void:
        if what == NOTIFICATION_ENABLED:
            print("Child1")
class Base2 extends Node:
    func _notification(what: int) -> void:
        if what == NOTIFICATION_ENABLED:
            print("Base2")
class Child2 extends Base2:
    func _notification(what: int) -> void:
        if what == NOTIFICATION_ENABLED:
            print("Child2")
Maybe we could loop through the _base pointers to get the count, then use alloca instead of the List to get a similar speedup?
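The reentrancy problem described above can be reproduced outside the engine. The following is a minimal C++ sketch, not Godot code: `Script`, `notify_shared`, `notify_local`, and `run_demo` are hypothetical names, and `std::vector` stands in for the engine's container. It models a script chain linked by base pointers and compares a `thread_local` buffer with a per-call buffer.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a script class in an inheritance chain.
struct Script {
    std::string name;
    const Script *base;              // parent script, nullptr at the root
    std::function<void()> on_notify; // stand-in for a _notification body
};

static std::vector<std::string> g_log; // records the order of calls

// Walk the base pointers, most-derived script first.
static void collect_chain(const Script *s, std::vector<const Script *> &chain) {
    chain.clear();
    for (const Script *p = s; p != nullptr; p = p->base) {
        chain.push_back(p);
    }
}

// Call the chain base-first. Each element is re-read from the buffer,
// so a nested call that rewrites a shared buffer changes what we see.
static void dispatch(const std::vector<const Script *> &chain) {
    for (std::size_t i = chain.size(); i-- > 0;) {
        const Script *sc = chain[i];
        g_log.push_back(sc->name);
        if (sc->on_notify) {
            sc->on_notify();
        }
    }
}

// Buggy variant: nested calls on the same thread share one buffer.
void notify_shared(const Script *s) {
    thread_local std::vector<const Script *> chain;
    collect_chain(s, chain);
    dispatch(chain);
}

// Safe variant: every call owns its buffer.
void notify_local(const Script *s) {
    std::vector<const Script *> chain;
    collect_chain(s, chain);
    dispatch(chain);
}

// Mirrors the GDScript example: Base1's handler notifies a Child2 instance.
std::vector<std::string> run_demo(bool use_shared) {
    g_log.clear();
    Script base2{"Base2", nullptr, nullptr};
    Script child2{"Child2", &base2, nullptr};
    Script base1{"Base1", nullptr, nullptr};
    Script child1{"Child1", &base1, nullptr};
    base1.on_notify = [&child2, use_shared] {
        if (use_shared) {
            notify_shared(&child2);
        } else {
            notify_local(&child2);
        }
    };
    if (use_shared) {
        notify_shared(&child1);
    } else {
        notify_local(&child1);
    }
    return g_log;
}
```

With `use_shared` set, the outer loop resumes on a buffer that now holds the Child2 chain, which is the same failure mode as the GDScript example; the per-call buffer keeps the two chains separate.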
|
Out of curiosity I tested both removing
This was my first time using a profiler, so please take this with a grain of salt. Edit: tested again using the test project above:
Given the issue with |
Force-pushed from afaa94c to c3f8998.
|
@DeeJayLSP Thanks for the advice and deep investigations. I applied your suggestion and used LocalVector. |
Changes still don't address Aaron's concerns (#94118 (review)). This needs to be fixed by removing static.
While not using a static might limit the performance somewhat, replacing List is the most important part, so this should still be faster than master in either case.
If we wanted to improve this further, we should use reserve or alloca, but that would require a cached depth on the GDScript, so this is for another time.
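As an illustration of the "count first, then allocate once" idea mentioned here and in Aaron's earlier comment, the sketch below walks the base pointers twice: once to get the depth, once to fill a buffer of exactly that size. It is a sketch under assumptions, not the PR's code: `ScriptNode`, `chain_depth`, and `build_chain` are hypothetical names, `std::vector` stands in for `LocalVector`, and the non-reversed order is assumed to be base-first, matching the "Base1" before "Child1" output in the example above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a script with a pointer to its base script.
struct ScriptNode {
    int id;
    const ScriptNode *base; // nullptr at the root of the hierarchy
};

// First pass: count how many scripts are in the inheritance chain.
std::size_t chain_depth(const ScriptNode *s) {
    std::size_t depth = 0;
    for (const ScriptNode *p = s; p != nullptr; p = p->base) {
        ++depth;
    }
    return depth;
}

// Second pass: fill a buffer that was sized exactly once.
// reversed == false gives base-first order (assumed default);
// reversed == true gives most-derived-first order.
std::vector<const ScriptNode *> build_chain(const ScriptNode *s, bool reversed) {
    const std::size_t depth = chain_depth(s);
    std::vector<const ScriptNode *> chain(depth); // single allocation
    std::size_t i = 0;
    for (const ScriptNode *p = s; p != nullptr; p = p->base, ++i) {
        chain[reversed ? i : depth - 1 - i] = p;
    }
    return chain;
}
```

Because the buffer is local to each call, this approach does not have the reentrancy problem of a shared static buffer. A cached depth on the script, as suggested above, would remove the first pass.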
Force-pushed from c3f8998 to bcc0922.
Ivorforce
left a comment
While there are more ways to improve this, this should be a straight upgrade on all fronts. Thanks for keeping up with the reviews! Let's finally get this merged.
|
Thanks! |


Optimize GDScriptInstance::notification for better performance
This PR optimizes the GDScriptInstance::notification function to improve performance and reduce CPU usage.
Changes
- List<GDScript *> with a LocalVector<GDScript *> to reduce allocations and deallocations.
- push_front() and push_back() operations by reusing the LocalVector across calls.
- unlikely() hint for vector growth condition to optimize the common case.
Performance Impact
- List operations, which were causing significant CPU usage.
Testing
Profiling data with current mainline
Profiling data with my changes
This optimization should significantly reduce the CPU time spent in GDScriptInstance::notification, particularly for projects with frequent notifications or deep GDScript inheritance hierarchies.