Skip to content

Conversation

@chocola-mint
Copy link
Contributor

@chocola-mint chocola-mint commented Feb 26, 2025

Closes godotengine/godot-proposals#9947

This PR implements the proposal above, making Sprite3Ds (SpriteBase3Ds to be precise) reuse meshes whenever possible.

There's still some overhead associated with this reuse system, (on top of Sprite3D's existing overhead) so the performance isn't completely identical to using MeshInstance3Ds, but the performance gain is still very noticeable. See the results attached below.

Benchmarking is performed using the project provided in the proposal. Though in this case the Renderdoc captures should be enough to confirm the performance increase.

10000 Quad MeshInstance3Ds (which already use instancing) (40~ FPS)

mesh_instance_3d

mesh_instance_3d_renderdoc

10000 Sprite3Ds without Instancing (< 1 FPS)

sprite_3d

sprite_3d_renderdoc

10000 Sprite3Ds with Instancing (30~ FPS)

sprite_3d_instancing

sprite_3d_instancing_renderdoc

Implementation details:

  • To avoid having to create and destroy meshes frequently, this PR does not create any new mesh.
  • A SpriteBase3D will try to use the mesh of an existing SpriteBase3D with the same SpriteMeshKey.
    • SpriteMeshKey being a small POD struct that identifies a Sprite mesh configuration.
    • This key is defined a bit differently than the hash proposed in the proposal above, but should be more accurate and probably a bit more memory-efficient.
    • SpriteMeshKeys are generated and recorded every time draw_texture_rect is called.
  • If no such SpriteBase3D can be found, it will fallback to using its own mesh, and registering itself to the static HashMap shared_sprites so other SpriteBase3Ds can share its mesh.
  • When a SpriteBase3D wants to share another SpriteBase3D's mesh, it adds itself to its users vector.
  • When a SpriteBase3D that's sharing its mesh wants to update its own mesh, it will pick a "successor" and copy its mesh data to the successor, then let every other user share the successor's mesh instead. Afterwards, the users vector is cleared.
  • When a SpriteBase3D that's sharing someone else's mesh wants to update its own mesh, it will not bother removing itself from the previous sprite's users vector, as there are mechanisms in place that allow a SpriteBase3D to ignore invalid users.
    • See the source code for more detail.

Remaining issues:

  • Instancing isn't supported in the Compatibility renderer, (at the moment) in which case this optimization would add additional overhead without performance gains. However, currently there doesn't seem to be a fast way to check if mesh instancing is supported. (outside of checking RS::get_singleton()->get_current_rendering_method() == "gl_compatibility", which would add significant overhead on its own)
    • Perhaps there should be an option to turn off Sprite3D instancing altogether, and this optimization can be configured in the project's Rendering settings.
    • In this case, what should the option be called? Maybe rendering/3d/optimization/automatic_sprite_3d_instancing?

@chocola-mint chocola-mint requested a review from a team as a code owner February 26, 2025 08:11
@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch 2 times, most recently from 94d626e to 10fcacf Compare February 26, 2025 09:47
@AThousandShips AThousandShips added this to the 4.x milestone Feb 26, 2025
@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch 4 times, most recently from 8e33549 to fa159f1 Compare March 4, 2025 04:48
@TokageItLab TokageItLab requested a review from a team March 4, 2025 05:56
@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch from fa159f1 to c0a09f7 Compare March 5, 2025 05:08
@Calinou
Copy link
Member

Calinou commented Mar 5, 2025

However, currently there doesn't seem to be a fast way to check if mesh instancing is supported. (outside of checking RS::get_singleton()->get_current_rendering_method() == "gl_compatibility", which would add significant overhead on its own)

Can we check this the first time a Sprite3D is instanced and cache it in a static variable? The rendering method can't change at runtime.

Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works as expected.

Testing project: https://github.com/Calinou/godot-sprite3d-vs-meshinstance3d

PC specifications
  • CPU: Intel Core i9-13900K
  • GPU: NVIDIA GeForce RTX 4090
  • RAM: 64 GB (2×32 GB DDR5-5800 C30)
  • SSD: Solidigm P44 Pro 2 TB
  • OS: Linux (Fedora 41)

Using an optimized release export template build (production=yes lto=full) and the default window size from the testing project.

Amount Mesh Sprite before Sprite after
0 8480 FPS (0.12 mspf) 8480 FPS (0.12 mspf) 8480 FPS (0.12 mspf)
1,000 2896 FPS (0.35 mspf) 979 FPS (1.02 mspf) 2790 FPS (0.36 mspf)
5,000 762 FPS (1.31 mspf) 143 FPS (6.99 mspf) 709 FPS (1.41 mspf)
10,000 465 FPS (2.15 mspf) 64 FPS (15.63 mspf) 442 FPS (2.26 mspf)

I also quickly tested the Mobile rendering method and the speedup is more modest there, but MeshInstance3D is already significantly slower than it is in Forward+. From 64 FPS with 10,000 Sprite3Ds, you go to 266 FPS with this PR. For comparison, 10,000 MeshInstance3Ds is 281 FPS. This may be chalked up to the lack of depth prepass in Mobile though, since meshes are opaque in this project (and Sprite3D uses alpha-cut transparency).

@chocola-mint
Copy link
Contributor Author

Can we check this the first time a Sprite3D is instanced and cache it in a static variable? The rendering method can't change at runtime.

Got it.

The new version also included some extra fixes to handle what happens when a sprite gets deleted while using another sprite. (Basically making the users vector hold weak pointers that get invalidated automatically)

@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch from fac5770 to 97ff301 Compare March 6, 2025 12:36
@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch from 97ff301 to d18405d Compare May 14, 2025 15:17
@chocola-mint
Copy link
Contributor Author

chocola-mint commented May 14, 2025

Rebased to fix minor conflict with #105785

@clayjohn
Copy link
Member

Just as a heads up. I think this is something that is wanted, I've added it to the rendering team meeting agenda to get more eyes on it

Copy link
Contributor

@Ansraer Ansraer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I took a glance at this PR just now. I like the idea, but am not sure I am happy with the implementation.

As far as I can tell right now one of the Sprite3Ds (called A for the sake of this discussion) stores the entire mesh and all the other Sprite3Ds that want to use the same mesh access Sprite3D's mesh.

This means we have special cases for Sprite3D's sharing their own mesh, accessing another Sprite3D's mesh and what happens when the the Sprite3D that owns the mesh gets deleted (with some kind of handover procedure to let another Sprite3D "own" the mesh).

IMO it would be far simpler if the mesh management was pulled out of Sprite3D entirely (maybe into "Sprite3DMeshes"), which would do simple counting for how many Sprite3Ds need a given mesh.
This way we would remove the seperation between mesh owning and mesh subscribing Sprite3Ds.

@ydeltastar
Copy link
Contributor

ydeltastar commented May 28, 2025

Billboards don't work well. These all happen after enabling and disabling the billboard flag:

  • The Sprite3D stays in billboard mode after I disable it.
  • Reloading the scene (Scene > Reload Saved Scene) sometimes loads the mesh and texture from another Sprite3D in the scene.
  • Sometimes the sprite disappears completely after reloading. ERROR: C:\godot\godot\servers/rendering/renderer_scene_cull.cpp:691 - unimplemented base type encountered in renderer scene cull
  • I get access violation crashes sometimes when the previous issues mix.

@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch from d18405d to 45174a4 Compare May 29, 2025 14:35
@chocola-mint
Copy link
Contributor Author

chocola-mint commented May 29, 2025

@Ansraer

As far as I can tell right now one of the Sprite3Ds (called A for the sake of this discussion) stores the entire mesh and all the other Sprite3Ds that want to use the same mesh access Sprite3D's mesh.

Yep, that's basically it.

I think the alternative is actually flawed in practice. I assume that you're suggesting something like this:

  • There is a Sprite3DMeshManager that manages every mesh for Sprite3Ds.
  • Every Sprite3D asks Sprite3DMeshManager for a mesh that fits its SpriteMeshKey when draw_texture_rect is called.
    • If it doesn't exist, a new mesh is created and assigned to it. The previous mesh it is assigned to gets its reference count reduced by 1.
    • Otherwise, it shares an existing mesh and increases its reference count by 1.
    • When a mesh's reference count hits 0, it is freed.
    • When the Sprite3D gets freed, the mesh it is currently using also gets its reference count reduced by 1.

While it is simpler in terms of implementation, I believe it will perform badly in practice.

A common use case for Sprite3Ds is to change its color dynamically with modulate. Every time this happens, it is most certainly not going to be able to find an existing mesh to reuse.

  • Imagine if we're fading a Sprite3D out with modulate.a over 1 second - This is going to cause the Sprite3D to allocate and free 60 different mesh instances over that 1 second. (assuming 60 FPS)

Other frequent changes to the mesh data (UVs for AnimatedSprite3Ds for example) will also have similar issues, but modulate blending is likely the most serious case here.

Reusing the mesh already created in Sprite3D's constructor (as demonstrated in this implementation) avoids this pitfall by basically providing a perfect upper bound for allocated meshes. A Sprite3D can keep sharing its mesh with others until it wants to change its mesh, in which case it picks a successor, copy its mesh data over, and tell everyone else to share with that successor instead. No new meshes need to be allocated and no old meshes need to be freed aggressively.


@ydeltastar

Billboards don't work well. These all happen after enabling and disabling the billboard flag:

  • The Sprite3D stays in billboard mode after I disable it.
  • Reloading the scene (Scene > Reload Saved Scene) sometimes loads the mesh and texture from another Sprite3D in the scene.
  • Sometimes the sprite disappears completely after reloading. ERROR: C:\godot\godot\servers/rendering/renderer_scene_cull.cpp:691 - unimplemented base type encountered in renderer scene cull
  • I get access violation crashes sometimes when the previous issues mix.

Just noticing now that the material (including the shader and the texture used) also needs to be batched alongside the mesh. This is an oversight on my part, sorry! I've updated the implementation to reflect this.

Changes
  • In draw_texture_rect, material update is reordered to happen before mesh data is updated, allowing us to use the updated shader RID in hashing.
  • Material-related info (including shader and texture RIDs) is added into SpriteMeshKey. This inflates the struct to 112 bytes unfortunately.

@ydeltastar
Copy link
Contributor

It's much better now. A problem remaining is that two nodes, once they share the same configuration, become quantum entangled. Changing material flags on one will also affect the other.

To reproduce:

  • Create a Sprite3D with a texture.
  • Duplicate the node.
  • Enable billboard in the duplicated node.
  • Then enable billboard in the original too.
  • Now changing any material flag affects both.

@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch from 45174a4 to 7b973bb Compare May 30, 2025 14:46
@chocola-mint
Copy link
Contributor Author

@ydeltastar

It's much better now. A problem remaining is that two nodes, once they share the same configuration, become quantum entangled. Changing material flags on one will also affect the other.

Fixed. It was a very sneaky bug.

SpriteBase3D::_stop_sharing_sprite before:

if (last_sprite_mesh_key.alpha_cut_disabled) {
	RS::get_singleton()->material_set_render_priority(get_material(), get_render_priority());
	RS::get_singleton()->mesh_surface_set_material(successor->mesh, 0, get_material());
}

SpriteBase3D::_stop_sharing_sprite after:

if (last_sprite_mesh_key.alpha_cut_disabled) {
	RS::get_singleton()->material_set_render_priority(successor->get_material(), get_render_priority());
	RS::get_singleton()->mesh_surface_set_material(successor->mesh, 0, successor->get_material());
}

(Basically, I accidentally assigned the same material to the successor's mesh, causing the material to be shared between two meshes)

@chocola-mint
Copy link
Contributor Author

I had similar issues to the previous but while toggling shaded between three nodes. After toggling it in two nodes, toggling it in the third node affects the other two instead.

You may need to decrease the number of bookkeeping variables as having to keep multiple places updated increases the complexity. Most of them do the same things as the hash map. For example, this on top of your last changes seems to fix the issues with flags, and everything seems to be working:

-		if (sharing_own_mesh) {
+		if (shared_sprites.has(last_sprite_mesh_key)) {
			if (last_sprite_mesh_key != sprite_mesh_key) {
				// Sprite mesh data changed.
				_stop_sharing_sprite();

using_sprite and using_sprite_user_index can probably be replaced with shared_sprites.getptr(current_key) too.

I think having bookkeeping variables here is worth it for minimizing overhead. That is, avoiding shared_sprites.getptr() calls as much as possible.

The actual oversight here is forgetting that every shared_sprites.insert should be paired with setting sharing_own_mesh to true. I had forgotten to set this in the successor logic.

			if (users[i] && users[i]->using_sprite == this) {
				successor = users[i];
				shared_sprites.insert(last_sprite_mesh_key, successor);
+				successor->sharing_own_mesh = true;

This logic has been refactored into _start_sharing_sprite(), ensuring that it will always be set correctly.

void SpriteBase3D::_start_sharing_sprite() {
	shared_sprites.insert(last_sprite_mesh_key, this);
	sharing_own_mesh = true;
}

With some extra refactoring I also added the _start_using_sprite(SpriteBase3D *p_using_sprite) function. This eliminates the unnecessary code duplication in the previous version and should make the logic a lot more readable and maintainable.

void SpriteBase3D::_start_using_sprite(SpriteBase3D *p_using_sprite) {
	using_sprite = p_using_sprite;
	using_sprite_user_index = using_sprite->users.size();
	set_base(using_sprite->mesh);
	set_aabb(using_sprite->aabb);
	using_sprite->users.push_back(this);
	// We don't need to remove this sprite from the previous shared sprite's users list,
	// as setting using_sprite means they will be detected and filtered out later.
}

@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch 2 times, most recently from f05c085 to 85de885 Compare June 7, 2025 05:52
@chocola-mint chocola-mint requested a review from a team as a code owner June 7, 2025 05:52
@chocola-mint
Copy link
Contributor Author

Update: Added unit tests for completeness.

@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch 8 times, most recently from f70f54b to 523b1e7 Compare June 7, 2025 10:29
@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch 3 times, most recently from 0d1ca8c to cb4d9d1 Compare June 8, 2025 00:10
@chocola-mint
Copy link
Contributor Author

Finally got the unit tests passing on every CI build. Using memcmp for SpriteMeshKey was having issues with precision=double due to uninitialized trailing padding bytes. (The solution was to make SpriteMeshKey tightly packed with #pragma pack(1))

I've also changed the hashing algorithm to be identical to MaterialKey's (hash_djb2_buffer) for consistency.

@akien-mga akien-mga requested review from a team and Ansraer June 9, 2025 09:00
@Repiteo Repiteo modified the milestones: 4.x, 4.6 Jun 16, 2025
@chocola-mint chocola-mint force-pushed the sprite-3d-instancing branch from cb4d9d1 to c0b9bd0 Compare June 18, 2025 14:10
@chocola-mint
Copy link
Contributor Author

Rebased due to RenderingServer::free() being deprecated here: 9fbf580

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reuse Sprite3D meshes across nodes when possible to benefit from automatic instancing

7 participants