Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a variable sized thread pool in AssetManager #5151

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

fishface60
Copy link
Contributor

@fishface60 fishface60 commented Jan 26, 2025

This fix is targeted against 1.17, for the version targeting 1.16 see #5153

Identify the Bug or Feature request

fixes #4962

Description of the Change

AssetManager has a thread pool for loading assets that prevents the network handler thread from blocking when it receives a message that requires an asset load.

It is a single thread, which effectively serialises loading.

Unfortunately this means that if the appropriate action after loading an asset is to load a subsequent asset (such as AddOns loading the onInit script in order to run them when the AddOn has been loaded) then this deadlocks.

If the asset is not already cached then instead of the load callbacks being called in the loader thread the assetLoader class' thread pool handles the callbacks, so it should be safe to use a larger thread pool to load assets.

This patch switches the single thread pool with a reference to the ForkJoinPool.commonPool() which is used by CompletableFutures as the "default" thread pool.

Possible Drawbacks

Supporting parallel loading of assets could potentially have observable changes that cause other bugs, but there are circumstances already that allow assets to load in parallel, so these code paths aren't untested.

Release Notes

  • Fixed a deadlock when joining a server if it included an AddOn that had an onInit script and the AddOn was already cached.

This change is Reviewable

@cwisniew
Copy link
Member

I don't think removing the thread pool in getAssetAsynchronously() is the correct way to tackle this. The method is called in the network code and the map rendering code (as well as other GUI code), pausing to load files in these locations can cause problems

@fishface60
Copy link
Contributor Author

I don't think removing the thread pool in getAssetAsynchronously() is the correct way to tackle this. The method is called in the network code and the map rendering code (as well as other GUI code), pausing to load files in these locations can cause problems

Ah, I see, I missed that getAsset reads the asset from disk since it only uses that to check whether the asset exists, and it's pretty fast to read files on Linux with an NVMe drive so I didn't notice it doing anything.

I'll change this to a previous approach of replacing that thread pool with a ForkJoinPool.commonPool()

@fishface60
Copy link
Contributor Author

On second thoughts, that's an observable behaviour change that could have repercussions if suddenly all load callbacks are being executed in parallel, which could cause bugs in a patch release.

Fundamentally the problem is that it needs to run the asset load of the script inside the asset load of the addon, but I might be able to make it work like a recursive lock and if it detects it's already being called from a thread in that pool then load the asset from the thread instead of queueing a new task.

@fishface60
Copy link
Contributor Author

On second thoughts, that's an observable behaviour change that could have repercussions if suddenly all load callbacks are being executed in parallel, which could cause bugs in a patch release.

Fundamentally the problem is that it needs to run the asset load of the script inside the asset load of the addon, but I might be able to make it work like a recursive lock and if it detects it's already being called from a thread in that pool then load the asset from the thread instead of queueing a new task.

This approach doesn't work because it's not the loader thread itself that gets blocked calling getAssetAsynchronous.
The AddOnLibrary readFile function uses CompletableFuture.supplyAsync which spawns a new thread in the common thread pool.

The getAssetAsynchronous method uses a thread pool so that requests to
load assets don't block networking threads etc.

It has been a single thread pool through the existing git history
but it is assumed that it is like this because it used to have a thread
and then was converted to use a single thread fixed thread pool.

Unfortunately having a single thread only means that if an asset load
callback indirectly loads another asset then this can deadlock
since the asset loader thread needs to be released before the load
request can start, which it can't because that thread is waiting for the
result of loading the new asset, thus it deadlocks if an AddOn includes
a JavaScript onInit script since it deadlocks waiting to load that
script.

Changing the thread pool so that the threads load in parallel is an
observable change that could cause problems,
but if the AddOn isn't already loaded the addonLoader's thread pool
loads it instead and that works fine,
so MapTool should be alright with addons and other assets loading in
parallel.
@fishface60 fishface60 force-pushed the develop-addon-load-hang branch from e683c6c to 7f7ff1a Compare January 28, 2025 21:55
@fishface60 fishface60 changed the title Remove asset loader thread pool Use a variable sized thread pool in AssetManager Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants