If full visibility of x87 ops convert to 64bit with unsafe flags #676

Open
Sonicadvance1 opened this issue Jan 22, 2021 · 6 comments

@Sonicadvance1
Member

If we have full visibility of the scope of an x87 op, we can unsafely convert to FP64.
Only enabled with unsafe flag.
Necessary for some x87 perf gains.

@skmp skmp modified the milestones: 2210, 2211 Aug 10, 2022
@skmp skmp moved this to 🆕 Unplanned in Next Project Milestone Aug 18, 2022
@pmatos
Collaborator

pmatos commented Oct 21, 2024

@Sonicadvance1 does this still make sense now that we have reduced precision?

@Sonicadvance1
Member Author

While it would still be nice, it's definitely less interesting since we can just enable the low precision toggle in most options.

Would still be nice for those games that actually want full precision, but we can cheese out some of the f32 and f64 operations.

@pmatos
Collaborator

pmatos commented Oct 21, 2024

> While it would still be nice, it's definitely less interesting since we can just enable the low precision toggle in most options.
>
> Would still be nice for those games that actually want full precision, but we can cheese out some of the f32 and f64 operations.

I see what you mean now... detect operations in the stack optimization pass that can be lowered to 64 bits and lower them there?

So let's say we have a 32/64-bit store to memory of the result of an add. Instead of performing the 80-bit operation, we can do the add directly in 32/64 bits and avoid going through 128-bit registers...

Nice!
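
To make the idea concrete, here is a rough sketch of such a check. It is written in Python for brevity and uses an entirely hypothetical node type: `Node`, `ARITH` and `can_lower_to_f64` are illustrative names, not FEX IR. It also assumes every value has exactly one consumer, which a real pass would have to verify.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                      # "load", "add", "mul" or "store" (hypothetical ops)
    bits: int = 80               # memory width for load/store; 80 for x87 arithmetic
    args: list = field(default_factory=list)

ARITH = {"add", "mul"}

def can_lower_to_f64(store: Node) -> bool:
    """True if everything feeding `store` could be computed in <= 64 bits."""
    if store.op != "store" or store.bits > 64:
        return False             # an 80-bit store needs the full-precision value

    def visible(n: Node) -> bool:
        if n.op == "load":
            return n.bits <= 64  # 32/64-bit memory operands only
        if n.op in ARITH:
            return all(visible(a) for a in n.args)
        return False             # unknown producer: no full visibility

    return all(visible(a) for a in store.args)
```

With this sketch, an add of two qword loads that is stored as a qword would be a lowering candidate, while the same add with one 80-bit load, or with an 80-bit store, would not.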

@pmatos
Collaborator

pmatos commented Oct 21, 2024

I will add this to my todo list.

@Sonicadvance1
Member Author

> > While it would still be nice, it's definitely less interesting since we can just enable the low precision toggle in most options.
> > Would still be nice for those games that actually want full precision, but we can cheese out some of the f32 and f64 operations.
>
> I see what you mean now... detecting operations in the stack optimization pass that can be lowered to 64bits and do those there?
>
> So lets say we have a 32/64bit store to memory of an add. Instead of performing the 80bit operation, we can do that directly in 32/64 and avoid the need to go through 128bit registers...
>
> Nice!

Yea, since we have the stack tracking now, theoretically that can be bolted on
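
As a loose illustration of why stack tracking helps, the sketch below simulates the x87 stack symbolically over a straight-line sequence and recovers the expression behind each store. It is Python, not FEX code: `build_trees` and the tuple encoding are made up for this example, and the instruction forms are simplified (no explicit `st(i)` operands, no exchanges, no control flow).

```python
def build_trees(ops):
    """Symbolically run a linear x87 sequence.

    Returns (stores, leftover_stack). Each store is
    ("store", width_bits, expression), where expression is a nested tuple.
    """
    stack, stores = [], []
    for op, *arg in ops:
        if op == "fld":                   # push a memory operand of arg[0] bits
            stack.append(("load", arg[0]))
        elif op in ("faddp", "fmulp"):    # pop two values, push the result
            rhs, lhs = stack.pop(), stack.pop()
            stack.append((op[:-1], lhs, rhs))
        elif op == "fstp":                # pop and store as arg[0] bits
            stores.append(("store", arg[0], stack.pop()))
        else:
            raise ValueError(f"unhandled op: {op}")
    return stores, stack
```

If the leftover stack is empty and every leaf and store in a tree is 32 or 64 bits wide, nothing outside the block can observe the 80-bit intermediates, which is the "full visibility" condition from the issue description.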

@pmatos
Collaborator

pmatos commented Oct 25, 2024

> > > While it would still be nice, it's definitely less interesting since we can just enable the low precision toggle in most options.
> > > Would still be nice for those games that actually want full precision, but we can cheese out some of the f32 and f64 operations.
> >
> > I see what you mean now... detecting operations in the stack optimization pass that can be lowered to 64bits and do those there?
> > So lets say we have a 32/64bit store to memory of an add. Instead of performing the 80bit operation, we can do that directly in 32/64 and avoid the need to go through 128bit registers...
> > Nice!
>
> Yea, since we have the stack tracking now, theoretically that can be bolted on

I was wondering whether this is actually worth it.

Let's say that after the stack pass finishes we have:

push   push qword
   \   /
    add    push qword
      \    /
     multiply
        |
      store

We could lower the add and the multiply to 64-bit operations, but does it really make sense? In normal precision mode we would then get floating-point error similar to that of reduced precision mode. We could of course introduce another option for these optimizations, but I am not sure it is worth it.
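
A small worked example of that error, as a sketch rather than a model of what FEX does: the Python below emulates round-to-nearest-even at a chosen significand width using exact fractions (64 bits for the x87 80-bit format, 53 bits for a double), ignoring exponent range, and evaluates `(a + b) * c` followed by a 64-bit store. The helper names `round_to_bits` and `add_mul_store` and the input values are chosen purely for illustration.

```python
from fractions import Fraction

def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round x to `bits` significand bits, ties to even (exponent range ignored)."""
    if x == 0:
        return x
    sign, m = (-1, -x) if x < 0 else (1, x)
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1                               # now 2**e <= m < 2**(e + 1)
    ulp = Fraction(2) ** (e - bits + 1)
    q = m / ulp
    n = q.numerator // q.denominator         # floor(m / ulp)
    rem = q - n
    if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return sign * n * ulp

def add_mul_store(a, b, c, bits):
    """(a + b) * c with each op rounded to `bits`, then stored as a double."""
    s = round_to_bits(a + b, bits)
    p = round_to_bits(s * c, bits)
    return round_to_bits(p, 53)

a, b, c = Fraction(1), Fraction(1, 2**53), Fraction(3)
x87_result = add_mul_store(a, b, c, 64)      # 64-bit significand intermediates
f64_result = add_mul_store(a, b, c, 53)      # 53-bit significand intermediates
```

For these inputs the 64-bit-significand path stores 3 + 2^-51, while the 53-bit path loses `b` in the add and stores exactly 3, so the lowered sequence is off by one ulp from what the x87 sequence would have written.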
