RPLE and OptiFine compatibility #43

Merged
11 commits merged on Jan 8, 2024

basdxz
Contributor

@basdxz basdxz commented Jan 7, 2024

Summary

We have implemented full compatibility for OptiFine Shaders and RPLE colored light.

The following configurations have been tested:

  • Neodymium
  • Neodymium + OptiFine (No Shaders)
  • Neodymium + OptiFine (Latest BSL Shader)
  • Neodymium + RPLE
  • Neodymium + RPLE + OptiFine (No Shaders)
  • Neodymium + RPLE + OptiFine (Latest BSL Shader)

Any bugs we could find have been patched, although some external validation is advised.
Flying far off to the world border still works as expected with no noticeable jitter.

Both RPLE & OptiFine extend the vertex stride, with OptiFine shaders having the most significant impact:

  • Neodymium:
    4 attributes, 28 bytes (24 bytes with short UVs)
  • Neodymium + OptiFine (With Shaders)
    9 attributes, 72 bytes
  • Neodymium + RPLE
    6 attributes, 36 bytes (32 bytes with short UVs)
  • Neodymium + OptiFine (With Shaders) + RPLE
    12 attributes, 88 bytes

(Note that short UVs are currently not supported with OptiFine shaders)
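To put the strides listed above side by side, here is a minimal sketch that computes how much each configuration grows relative to plain Neodymium. The byte counts are copied from the list (float UVs); the class and method names are made up for illustration and are not part of Neodymium.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Compares the vertex strides quoted in the PR against the plain Neodymium baseline. */
final class StrideComparison {
    /** Bytes per vertex, copied from the list in the PR description (float UVs). */
    static Map<String, Integer> strides() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("Neodymium", 28);
        m.put("Neodymium + OptiFine (With Shaders)", 72);
        m.put("Neodymium + RPLE", 36);
        m.put("Neodymium + OptiFine (With Shaders) + RPLE", 88);
        return m;
    }

    /** How many times larger a stride is than the baseline stride. */
    static double growthFactor(int strideBytes, int baselineBytes) {
        return (double) strideBytes / baselineBytes;
    }

    public static void main(String[] args) {
        int baseline = strides().get("Neodymium");
        for (Map.Entry<String, Integer> e : strides().entrySet()) {
            System.out.printf("%-45s %2d bytes  x%.2f%n",
                    e.getKey(), e.getValue(), growthFactor(e.getValue(), baseline));
        }
    }
}
```

With these numbers, shaders plus RPLE need a little over three times the per-vertex memory of plain Neodymium (88 / 28).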

This leads to a significant increase in VRAM usage, so we may need to increase the default VRAM size depending on the current configuration. Currently, around 1 GiB was needed to load a 16-chunk radius. With the maximum possible allocation being a little under 2 GiB, dynamic memory allocation may also be needed in the future.
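As a rough illustration of why the stride matters for the buffer size, the sketch below estimates how many quads fit into a fixed budget at each stride. It assumes every quad is stored as four full vertices with no padding or sharing, which may not match how Neodymium actually packs its buffer; the names are hypothetical.

```java
/** Rough capacity estimate for a fixed-size vertex buffer. */
final class VramBudget {
    static final long GIB = 1L << 30;

    /** ASSUMPTION: each quad occupies 4 full vertices, with no padding or vertex sharing. */
    static final int VERTICES_PER_QUAD = 4;

    /** Number of whole quads that fit into the budget at the given vertex stride. */
    static long maxQuads(long budgetBytes, int strideBytes) {
        return budgetBytes / ((long) strideBytes * VERTICES_PER_QUAD);
    }

    public static void main(String[] args) {
        // Strides from the PR description: plain, RPLE, shaders, shaders + RPLE.
        int[] strides = {28, 36, 72, 88};
        for (int stride : strides) {
            System.out.println(stride + "-byte stride: " + maxQuads(GIB, stride) + " quads per GiB");
        }
    }
}
```

Under that assumption, the same budget holds roughly a third as many quads with shaders and RPLE as it does with plain Neodymium, which is the motivation for a configuration-dependent default.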

Technical

  • Compat has been extended to allow detection of OptiFine Shaders, RPLE and FalseTweaks. OptiFine shaders will also no longer disable any functions of Neodymium.
  • MeshQuad now has fields to store the additional data needed when rendering chunks with RPLE/OptiFine Shaders, alongside code to copy the mesh data from the Tessellator as needed.
  • NeoRenderer includes additional logic when setting VBOs with OptiFine shaders or when using Neodymium with just RPLE. Additional transformations are also applied to the matrix state to keep the jitter fix compatible with OptiFine.
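The idea of picking a vertex layout based on the detected mods can be sketched as follows. Only the attribute counts and byte totals come from this PR; the per-attribute breakdown (names, types, order) is a guess that happens to add up to those totals, and none of these classes exist in Neodymium. The OptiFine shader layout is left out because the PR does not say which attributes it adds.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds a hypothetical vertex layout and tracks attribute offsets and total stride. */
final class VertexLayoutSketch {
    /** One vertex attribute: a label, its size in bytes, and its offset inside the vertex. */
    static final class Attribute {
        final String name;
        final int sizeBytes;
        final int offsetBytes;

        Attribute(String name, int sizeBytes, int offsetBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.offsetBytes = offsetBytes;
        }
    }

    final List<Attribute> attributes = new ArrayList<>();
    int strideBytes = 0;

    /** Appends an attribute at the current end of the vertex. */
    VertexLayoutSketch add(String name, int sizeBytes) {
        attributes.add(new Attribute(name, sizeBytes, strideBytes));
        strideBytes += sizeBytes;
        return this;
    }

    /** ASSUMPTION: attribute names and sizes are illustrative, chosen to match the PR's totals. */
    static VertexLayoutSketch build(boolean rple, boolean shortUV) {
        VertexLayoutSketch layout = new VertexLayoutSketch()
                .add("position", 3 * Float.BYTES)
                .add("texCoord", shortUV ? 2 * Short.BYTES : 2 * Float.BYTES)
                .add("brightness", 2 * Short.BYTES)
                .add("color", 4 * Byte.BYTES);
        if (rple) {
            // Two extra light channels, one attribute each (hypothetical names).
            layout.add("brightnessGreen", 2 * Short.BYTES)
                  .add("brightnessBlue", 2 * Short.BYTES);
        }
        return layout;
    }

    public static void main(String[] args) {
        VertexLayoutSketch layout = build(true, false);
        for (Attribute a : layout.attributes) {
            System.out.println(a.name + " @ " + a.offsetBytes + " (" + a.sizeBytes + " bytes)");
        }
        System.out.println("stride = " + layout.strideBytes);
    }
}
```

Computing offsets from a single attribute list keeps the stride and the per-attribute pointers in sync, which is one way to avoid hand-maintaining a separate set of constants for every compat variant.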

@FalsePattern
Contributor

Also, the renderSortedRenderers mixin now cancels the method when Neodymium is rendering.

This bypasses all the redundant logic that FalseTweaks does in that method, and lets us avoid any glGenLists/glBeginList/glEndList calls when both Neodymium and the FT occlusion engine are enabled.

FalseTweaks will have the other half of this specific optimization added in 2.8.0, but it depends on the changes added to Neodymium in this PR, otherwise it spams GL errors in the log.

@makamys
Owner

makamys commented Jan 8, 2024

Some more remarks:

  • Every time shaders are switched in-game, and sometimes on world load with shaders enabled, the subchunk the player is inside becomes invisible. F3+A or (usually) relogging fixes it. This should get addressed before this is merged.
  • The value returned by preRenderSortedRenderers is technically wrong when face culling is enabled, because it counts the number of meshes rendered (which each subchunk can have 0~7 of) rather than the number of subchunks. The counting would have to be done in initIndexBuffers instead. However, the value returned seems to be unused by vanilla so it probably doesn't matter anyway.
  • shortUV's config comment should mention that it does nothing with shaders/RPLE, but I can add that myself.
  • I'll probably add a config that re-enables the GL list logic if any issues are discovered from disabling it.
  • MeshQuad and NeoRenderer are getting huge now, I'll have to think about organizing the compat extensions into separate classes.
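To illustrate the second remark about the return value, here is a small sketch of the difference between counting meshes and counting subchunks when each subchunk can contribute 0 to 7 meshes. The class, method names, and sample data are hypothetical and do not reflect the actual preRenderSortedRenderers or initIndexBuffers code.

```java
/** Contrasts "meshes rendered" with "subchunks rendered" for the same frame. */
final class RenderCountSketch {
    /** Counts every mesh drawn, which is what the remark says is currently returned. */
    static int countMeshes(int[] meshesPerSubchunk) {
        int total = 0;
        for (int meshes : meshesPerSubchunk) {
            total += meshes;
        }
        return total;
    }

    /** Counts subchunks that drew at least one mesh. */
    static int countSubchunks(int[] meshesPerSubchunk) {
        int total = 0;
        for (int meshes : meshesPerSubchunk) {
            if (meshes > 0) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical frame: four subchunks, each with 0 to 7 visible meshes.
        int[] meshesPerSubchunk = {7, 0, 3, 1};
        System.out.println("meshes:    " + countMeshes(meshesPerSubchunk));
        System.out.println("subchunks: " + countSubchunks(meshesPerSubchunk));
    }
}
```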

But overall it seems good, thanks for implementing this!

PS: I thought I'd let you know that the gains I get on my setup (GTX1050Ti, 1920x1080, Linux with proprietary drivers) are somewhat modest (48->54 FPS [+12.5%] - for comparison, with shaders disabled it's 490->620 FPS [+26.5%]). According to VisualVM, ~80% of time is spent inside nglGetError with Neodymium, and ~88% without.

For testing I used OF E7 + Nd + Sildurs Vibrant Shaders v1.23 Lite.

@couleurm

couleurm commented Jan 8, 2024

hi whats RPLE

@FalsePattern
Contributor

I thought I'd let you know that the gains I get on my setup (GTX1050Ti, 1920x1080, Linux with proprietary drivers) are somewhat modest (48->54 FPS [+12.5%]

Yes, the main difference we noticed during testing is a huge decrease in microstuttering. This compat is mainly useful to reduce those spikes in frametimes instead of getting a huge FPS boost.

the subchunk the player is inside becomes invisible.

This seems to be an odd issue in the vanilla culling logic; FalseTweaks' occlusion engine (with the mentioned patches planned for 2.8.0) seems to fix it. We couldn't pinpoint what exactly causes this unfortunately.

MeshQuad and NeoRenderer are getting huge now

One solution I can propose for this is moving the 4 implementation variants into utility classes. I will add this to the PR shortly; it should make the compat variants somewhat less spammy.

@basdxz
Contributor Author

basdxz commented Jan 8, 2024

PS: I thought I'd let you know that the gains I get on my setup (GTX1050Ti, 1920x1080, Linux with proprietary drivers) are somewhat modest (48->54 FPS [+12.5%] - for comparison, with shaders disabled it's 490->620 FPS [+26.5%]). According to VisualVM, ~80% of time is spent inside nglGetError with Neodymium, and ~88% without.

OptiFine has a few internal bugs (some fixed by RPLE or FalseTweaks, with all eventually being migrated to the latter). Ideally, with no FPS limit, 50%+ of the time should be spent waiting on Display.update(), which is what I have seen in my testing when running with ND/RPLE/OF (RPLE hard-depends on FT).

Owner

@makamys makamys left a comment

We couldn't pinpoint what exactly causes this unfortunately.

No worries then, I was just hoping it was a simple issue.

One solution I can propose for this is moving the 4 implementation variants into utility classes.

Nice, it's a lot better now.

some fixed by RPLE or FalseTweaks

I see, I probably should've tried with a less minimal setup too then.

@makamys makamys merged commit 091d817 into makamys:master Jan 8, 2024
1 check passed
makamys added a commit that referenced this pull request Jan 8, 2024
- Make `shortUV` config comment mention interaction with RPLE/OF
- Fix NeoRenderer#init implSpec javadoc
- Update OF shader status in readme
@FalsePattern FalsePattern deleted the optifine-compat branch January 8, 2024 23:26