Add 2-component vector constructor #1569

Merged
merged 12 commits into luau-lang:master
Jan 17, 2025

Conversation

petrihakkinen
Contributor

Implement RFC: 2-component vector constructor. This includes a 2-component overload for vector.create, the associated fastcall function, and its type definition. These features are controlled by a new feature flag, LuauVector2Constructor. Additionally, constant folding now supports two components when the LuauVector2Constants feature flag is set.

Note: this work does not include changes to CodeGen. Thus calls to vector.create with only two arguments are not natively compiled currently. This is left for future work.
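
For readers unfamiliar with the change, here is a minimal C++ sketch of what a 2-or-3-component constructor amounts to. It is not the code from this PR: `Vec3` and `createVector` are hypothetical names, the sketch covers 3-wide vectors only, and it assumes that a missing third component defaults to 0.

```cpp
// Hypothetical stand-in for the VM's vector value; not the actual Luau type.
struct Vec3
{
    float x;
    float y;
    float z;
};

// Hypothetical helper that builds a vector from an argument array.
// Assumption: when only two components are given, z defaults to 0.
// Returns false when the argument count is not 2 or 3.
bool createVector(const double* args, int nparams, Vec3& out)
{
    if (nparams < 2 || nparams > 3)
        return false;

    out.x = static_cast<float>(args[0]);
    out.y = static_cast<float>(args[1]);

    // This is the extra branch that accepting two components introduces.
    out.z = (nparams == 3) ? static_cast<float>(args[2]) : 0.0f;
    return true;
}
```

The single conditional on `nparams` is the kind of added work that the benchmark discussion in this thread is measuring.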

@petrihakkinen
Contributor Author

petrihakkinen commented Dec 17, 2024

Thanks for the comments. Pushed a revision that addresses all the points mentioned.

Also added a new micro benchmark for the vector library. Test results:

LuauVector2Constructor false:

| Test | Min | Average | StdDev% | Driver |
|---|---|---|---|---|
| vector: create | 71.122ms | 72.708ms | 1.727% | luau.exe |

LuauVector2Constructor true:

| Test | Min | Average | StdDev% | Driver |
|---|---|---|---|---|
| vector: create | 71.517ms | 73.315ms | 1.447% | luau.exe |

Command line used:
python bench.py --folder micro_tests --run-test test_vector_lib --extra-loops 50

Tested on: Windows 10, MSVC 2022, Intel i5-3570K

The difference is a bit less than 1% in favor of the old version (about 0.8% on the average times), which is within the measured standard deviation. I don't think this will make a big difference in practice.
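
The percentage can be checked directly from the reported numbers. The helper below is only an illustration of the arithmetic; `percentChange` is a hypothetical name and not part of the benchmark tooling.

```cpp
// Relative change of `candidate` against `baseline`, in percent.
// Positive values mean the candidate is slower than the baseline.
double percentChange(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}
```

Applied to the averages (72.708ms vs 73.315ms) this gives roughly 0.8%, and applied to the minimums (71.122ms vs 71.517ms) roughly 0.6%.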

I also ran the test with only two arguments (the new LuauVector2Constructor path):

| Test | Min | Average | StdDev% | Driver |
|---|---|---|---|---|
| vector: create2 | 51.412ms | 52.479ms | 1.487% | luau.exe |

Creating vectors from two components is now much faster, as expected (previously ~218ms on average with a user-defined 2D constructor without the fastcall).

An idea for future work: we could consider adding a new built-in for the 2D case (luauF_vector2). This would get rid of the extra branch in luauF_vector and get back that 1% and make luauF_vector2 even faster. This would require detecting the number of arguments in the compiler and choosing the correct builtin. Not sure if it makes sense to add complexity for such a small gain though.
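
To make the idea concrete, the compile-time selection could look roughly like the sketch below. Everything here is hypothetical: the `Builtin` enum, its members, and `selectVectorBuiltin` are illustrative names rather than Luau's actual compiler API, and routing an unknown argument count to the general builtin is an assumption.

```cpp
// Hypothetical builtin identifiers; illustrative only, not Luau's real enum.
enum class Builtin
{
    Vector,  // general fastcall that branches on the argument count at runtime
    Vector2, // proposed dedicated 2-component fastcall
};

// Picks a builtin for a vector.create call site.
// argCount is the number of arguments when the compiler knows it,
// or a negative value when it cannot be determined statically
// (for example, when the last argument expands to multiple values).
Builtin selectVectorBuiltin(int argCount)
{
    if (argCount == 2)
        return Builtin::Vector2;

    // Assumption: unknown or larger counts go to the general builtin.
    return Builtin::Vector;
}
```

The cost of this approach is the extra builtin plus the selection logic in the compiler, which is the complexity being weighed against the roughly 1% gain.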

@vegorov-rbx
Collaborator

Sorry for the delay.
We had a long production pause at Roblox, but we're back now and will look at this soon.

@vegorov-rbx
Collaborator

> An idea for future work: we could consider adding a new built-in for the 2D case (luauF_vector2). This would get rid of the extra branch in luauF_vector and get back that 1% and make luauF_vector2 even faster. This would require detecting the number of arguments in the compiler and choosing the correct builtin. Not sure if it makes sense to add complexity for such a small gain though.

I actually kind of expected to see that :D
But it looks like it performs well enough without that complexity (and 3/4-comp fastcall will have to be ready to accept 2 arguments anyway since it will not be 100% detectable statically).

The code looks good, but it needs a merge first. Hopefully the definition conflicts will not cause too much pain.

Collaborator

@vegorov-rbx vegorov-rbx left a comment

Feel free to re-request after merging.

@petrihakkinen
Contributor Author

Thanks! I'll fix the conflicts next.

> 3/4-comp fastcall will have to be ready to accept 2 arguments anyway since it will not be 100% detectable statically

I was actually thinking that the 3/4-comp fastcall would not need to handle 2 arguments and would fall back to the slow path instead. It should be rare that the compiler cannot deduce the number of arguments in practical use cases. But anyway, I agree that it would be a premature optimization at this point.
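
A rough sketch of that fallback idea, with loud caveats: `FastVec` and `fastVector3` are hypothetical names, the signature is heavily simplified compared to a real fastcall, and the convention that a negative return value requests the regular (slow) call path is an assumption of this sketch.

```cpp
// Hypothetical result type; not the VM's actual vector representation.
struct FastVec
{
    float x;
    float y;
    float z;
};

// Hypothetical 3-component fastcall.
// Returns the number of results written, or -1 to ask the caller
// to fall back to the regular (slow) call path.
int fastVector3(const double* args, int nparams, FastVec& res)
{
    // Two arguments would only reach this function when the compiler
    // could not determine the count statically, so no 2-component
    // branch is needed here; the slow path handles that case.
    if (nparams < 3)
        return -1;

    res.x = static_cast<float>(args[0]);
    res.y = static_cast<float>(args[1]);
    res.z = static_cast<float>(args[2]);
    return 1;
}
```

The trade-off is that the rare undetectable 2-argument call pays the full slow-path cost, while the common 3-component path avoids the extra branch.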

@petrihakkinen
Contributor Author

> Feel free to re-request after merging.

Conflicts have now been fixed, could you recheck please?

EmbeddedBuiltinDefinitions.cpp is indeed quite messy when there are multiple feature flags. Two flags per library is sort of doable, but with more it will run into a combinatorial explosion. Maybe there's a better way...

@vegorov-rbx vegorov-rbx merged commit 67e9d85 into luau-lang:master Jan 17, 2025
7 checks passed
@vegorov-rbx
Collaborator

Thank you.

2 participants