Skip to content

Conversation

@dsnopek
Copy link
Contributor

@dsnopek dsnopek commented Jul 29, 2025

Currently, setting "Thread Model" set to "Separate" won't work on Android: it'll just render a black screen and give numerous errors to the logs

This PR gets it working via two primary changes:

  1. Setting egl_current_window_id to INVALID_WINDOW_ID on release_rendering_thread(), so that we'll do eglMakeCurrent() on the main window when Godot's rendering thread calls gl_window_make_current(). (Without this change it won't do anything, because it thinks the main window is already current.)
  2. Making the Java render thread ignore errors from eglMakeCurrent() when creating a surface, if Godot is using a separate render thread. This is necessary in the case where the EGL surface is being re-created, because eglMakeCurrent() will fail on the Java render thread because Godot's render thread has already taken ownership of the EGL context -- but this is fine, since Godot's render thread will call eglMakeCurrent() later and succeed, and that's where all the rendering will happen anyway

Additionally, since we're using eglGetCurrent*() to get the various EGL resources, we will stash them on local variables whenever we can, since they'll only be available on the thread that called eglMakeCurrent().

This PR builds on top of PR #105529, and so it'll be marked as DRAFT until that one is merged

Original description (before 2025-08-04)

Currently, setting "Thread Model" set to "Separate" won't work on Android: it'll just render a black screen and give numerous errors to the logs

This PR gets it working via two primary changes:

  1. Allow Godot's rendering thread to take ownership of the EGL display, surface and context via DisplayServerAndroid::gl_window_make_current() (and later relinquish it via DisplayServerAndroid::release_rendering_thread())
  2. Prevent the Java code from calling eglSwapBuffers() because its render thread no longer owns the EGL resources, and allow Godot's render thread to handle that on the C++ side (via DisplayServerAndroid::swap_buffers())

The implementation is kind of messy because of the way the Java-side code is structured. If we refactored that so that less was done on the Java-side, and more responsibility was given to the C++ side (as well as switching to android.opengl so we can pass native handles back and forth), then it could be done more cleanly.

However, I've taken the approach of making the smallest amount of changes to get this feature working. Given that the diff isn't that big, I think this could be fine for now?

This depends on PR #109057 and will be marked as DRAFT until it is merged

@dsnopek dsnopek added this to the 4.6 milestone Jul 29, 2025
@dsnopek dsnopek requested review from a team as code owners July 29, 2025 14:10
@dsnopek dsnopek force-pushed the opengl-working-separate-threading-model branch from d70e694 to e376535 Compare July 29, 2025 14:59
@dsnopek dsnopek marked this pull request as draft July 29, 2025 15:48
Copy link
Contributor

@m4gr3d m4gr3d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look at #105529, where as part of the unification/clean up logic, I'm adding support for the native code to control the egl render loop.

You should be able to rebase this PR on top of it.

@dsnopek dsnopek force-pushed the opengl-working-separate-threading-model branch from e376535 to bda1c35 Compare July 31, 2025 21:00
@dsnopek
Copy link
Contributor Author

dsnopek commented Jul 31, 2025

Take a look at #105529, where as part of the unification/clean up logic, I'm adding support for the native code to control the egl render loop.

You should be able to rebase this PR on top of it.

Thanks! With that refactor, it does make the code here a lot cleaner. I've rebased this PR on that one, and added the changes that I think should be necessary, however, it doesn't actually work.

The challenge is that with PR #105529, it seems to start by creating a (0, 0) surface, and then a little bit later creating the "real" surface with the real dimensions. This causes problems because Godot's render thread will have already taken ownership of the EGL resources after the first surface was created, so the Java render thread can't use them to make the second surface.

So, I'll need to figure out a way to prevent creating the first surface, I guess? Although, it seems like it'd still be possible for the surface to change or be resized, which would lead to the exact same problem again...

@dsnopek dsnopek force-pushed the opengl-working-separate-threading-model branch from bda1c35 to 022e3c1 Compare July 31, 2025 21:16
@m4gr3d
Copy link
Contributor

m4gr3d commented Aug 2, 2025

The challenge is that with PR #105529, it seems to start by creating a (0, 0) surface, and then a little bit later creating the "real" surface with the real dimensions.

The logic in the PR starts by creating a off screen pixel buffer surface first in to allow the native logic to start executing as soon as possible. Then when the Android surface is available, we switch from the off screen pixel buffer surface to the Android surface.
For prototyping I can add a flag to disable that new behavior.

because Godot's render thread will have already taken ownership of the EGL resources after the first surface was created, so the Java render thread can't use them to make the second surface.

I'm not sure what you mean; only the egl surface should be destroyed and recreated when the transition happens, and as you mention the same behavior should trigger when the surface is resized.

@dsnopek
Copy link
Contributor Author

dsnopek commented Aug 4, 2025

because Godot's render thread will have already taken ownership of the EGL resources after the first surface was created, so the Java render thread can't use them to make the second surface.

I'm not sure what you mean; only the egl surface should be destroyed and recreated when the transition happens

I mean that when it creates the 2nd surface, eglMakeCurrent() returns EGL_BAD_ACCESS. I think this is because the EGL context still belongs to the Godot render thread, so the Java thread can't use it. I need to do some more investigation, though.

and as you mention the same behavior should trigger when the surface is resized.

Yeah, the whole concept of re-creating the EGL surface is tricky together with Godot's multithreaded rendering. It seems like we'll need to synchronize both threads somehow in this case

@dsnopek
Copy link
Contributor Author

dsnopek commented Aug 4, 2025

I mean that when it creates the 2nd surface, eglMakeCurrent() returns EGL_BAD_ACCESS. I think this is because the EGL context still belongs to the Godot render thread, so the Java thread can't use it. I need to do some more investigation, though.

Heh, actually, just ignoring the eglMakeCurrent() error seems to allow multi-threaded rendering to start! That call fails on the Java side, but then when the Godot render thread does it later, it succeeds, and since that's where the rendering actually takes place, things do seem to work.

I need to figure out how to do this in a non-hacky way, though :-)

@dsnopek dsnopek force-pushed the opengl-working-separate-threading-model branch from 022e3c1 to e9b1c4d Compare August 4, 2025 21:30
@dsnopek
Copy link
Contributor Author

dsnopek commented Aug 4, 2025

Alright, the code in my latest push is actually working! I've updated the PR description to reflect its current state

@dsnopek dsnopek requested a review from m4gr3d August 4, 2025 21:41
Comment on lines 1006 to 1010
// These may have changed if the surface was changed or resized, so grab them again.
egl_display = eglGetCurrentDisplay();
egl_surface = eglGetCurrentSurface(EGL_DRAW);
egl_context = eglGetCurrentContext();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notify_surface_changed is invoked when the surface changes so these can be updated there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reset_window() is also invoked, and may be a more suited location to update these variables.

Copy link
Contributor Author

@dsnopek dsnopek Aug 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we need to call these on the same thread as eglMakeCurrent() was successfully run on. I think those will be invoked on the Java render thread, whereas we need to do this on Godot's render thread, which this function will be run on (and a couple lines up, it should have successfully done the eglMakeCurrent() so it should have the correct handles).

Anyway, I'll double check and see if notify_surface_changed() or reset_window() run on the correct thread or not.

(Overall, it would be nice if the Java side could just pass the native EGL handles to us, rather than having to indirectly pass them via eglMakeCurrent() and eglGetCurrent*(), but we're still using the javax.microedition.khronos.egl.* versions of the EGL objects, which don't provide a way to get the native handles out. It looks like the objects on android.opengl.* do have a way to do that, though)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they do run on different thread, we could set a flag in reset_window and update the variables here based on that flag.
I'll take a look at the android.opengl.* API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a look at the android.opengl.* API.

For example, android.opengl.EGLSurface extends android.opengl.EGLObjectHandle which provides getNativeHandle(). I think we could pass that value to the C++ side as a uint64_t and reinterpret_cast(...) it to EGLSurface

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsnopek I've updated #105529 and added a commit that switches from javax.microedition.khronos.* to android.opengl.*

Copy link
Contributor Author

@dsnopek dsnopek Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

In my latest push, I've updated to the latest version of #105529 and changed it to directly pass the native EGL handles from the Java side to the C++ side. But I'm not sure if how I integrated this into various classes/interfaces on the Java side is the right way - please let me know if you'd like any changes to it

@m4gr3d
Copy link
Contributor

m4gr3d commented Aug 6, 2025

The rest of the code looks good! Do you have a specific project you're using to test it?

@m4gr3d
Copy link
Contributor

m4gr3d commented Aug 6, 2025

@syntaxerror247 @Alex2782 Along with #105529, this PR should be prioritized once we enter the 4.6 dev cycle given the scope of the changes and to ensure we don't introduce regressions.

@dsnopek
Copy link
Contributor Author

dsnopek commented Aug 6, 2025

The rest of the code looks good! Do you have a specific project you're using to test it?

I've been testing with a modified version of the GDQuest TPS demo (it can run both flat screen or in VR on Quest):

https://github.com/dsnopek/gdquest-tps-demo/tree/standalone-vr-slush-multithreaded

NOTE: it is fairly easy to get it to crash, but I think that's due to other non-thread-safe bugs in the renderer and OpenXR. It will run for quite a while without crashing, though!

@Alex2782
Copy link
Member

Alex2782 commented Aug 6, 2025

Unfortunately, I can't test Android VR stuff, but I could check later what other complex demos there are that are optimized for Android.

Only this configuration needs to be tested?

rendering/driver/threads/thread_model = Separate (2)

Experimental: This setting has several known bugs which can lead to crashing, especially when using particles or resizing the window. Not recommended for use in production at this stage.

The thread model to use for rendering. Rendering on a thread may improve performance, but synchronizing to the main thread can cause a bit more jitter.

@dsnopek
Copy link
Contributor Author

dsnopek commented Aug 6, 2025

Only this configuration needs to be tested?

rendering/driver/threads/thread_model = Separate (2)

Yep!

However, as I mentioned above, and as stated in the note from Project Settings that you shared, there are other Godot bugs with using this thread model that may lead to crashes. These should be unrelated to the Android-specific issues that this PR attempts to address, though. This PR just aims to get this mode working at all on Android with OpenGL (it already works with Vulkan, also with some bugs :-))

@dsnopek dsnopek force-pushed the opengl-working-separate-threading-model branch from e9b1c4d to d8eea1e Compare August 8, 2025 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants