Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add newlib cbrtf #177

Closed
wants to merge 7 commits into from
Closed

Add newlib cbrtf #177

wants to merge 7 commits into from

Conversation

tesuji
Copy link

@tesuji tesuji commented May 28, 2019

@tesuji
Copy link
Author

tesuji commented May 28, 2019

I honestly don't know why it failed. And I need some review!

@alexcrichton
Copy link
Member

Thanks for the PR! I have the same questions as #169 for this though. Why is this change being made relative to what's currently already implemented?

@tesuji
Copy link
Author

tesuji commented May 28, 2019

This implementation does not use double (f64) type. So it could run on
hardwares that have no support for double precision floats.

@alexcrichton
Copy link
Member

Ok that makes sense to me, although does Rust run on any architectures that are like that? (are there examples today of platforms that would benefit?)

Review-wise there's no need to keep the musl code around, we only need one implementation of this function in-tree.

If the purpose here as well is to avoid f64, could some form of CI check be added to assert that f64 isn't used?

@Schultzer
Copy link
Contributor

Schultzer commented Jul 2, 2019

@lzutao can you please rebase your PR's and align them with #169 so we only have one impl. in the tree? We got a benchmark suite so we can evaluate each function now.

@Schultzer Schultzer mentioned this pull request Jul 5, 2019
@tesuji
Copy link
Author

tesuji commented Jul 6, 2019

I don't understand what you mean. This PR doesn't have any relations
with exp2f in #169 .

@tesuji
Copy link
Author

tesuji commented Jul 6, 2019

Ok. I rebased on master and add your commit in #169.

@Schultzer
Copy link
Contributor

Schultzer commented Jul 6, 2019

@lzutao I'm sorry, what I meant was that we only want one implementation in the tree, we don't want to have a musl module or a newlib module. The benchmark suite is to give a rough estimate, ideally this should be tested on hardware.

@tesuji
Copy link
Author

tesuji commented Jul 7, 2019

@Schultzer Do you know what wrong with this PR?
The ported cbrtf has the same generated asm as C version: https://godbolt.org/z/tfeV0c
as far as I understand.

@Schultzer
Copy link
Contributor

Schultzer commented Jul 7, 2019

The errors here are properly expected, I think the system on the ci uses musl and, as far as I know given the different implementation in newlib, we properly want to test this function against newlib or, increase the ULP for this case.

Copy link
Contributor

@gnzlbg gnzlbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this passes an exhaustive check against musl (testing all f32 bit patterns), and the benchmarks say this is faster. I see no reasons against making this change.

@tesuji
Copy link
Author

tesuji commented Jul 7, 2019

benchmarks say this is faster. I see no reasons against making this change.

I don't expect the ported newlib functions could be faster than old one. The reason
for newlib is to run on system have no hardware support for double precision floats.
See more in the issue #134 .

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 7, 2019

@lzutao so will this be a performance regression on systems that do have support for double precision floats ?

@tesuji
Copy link
Author

tesuji commented Jul 7, 2019

At least in x86_64, it doesn't have regression:
This PR:

test cbrtf     ... bench:          19 ns/iter (+/- 0)

Master:

test cbrtf     ... bench:          20 ns/iter (+/- 0)

@alexcrichton
Copy link
Member

It looks like this doesn't change the performance, so is this still necessary?

@burrbull
Copy link
Contributor

burrbull commented Jul 8, 2019

It should be tested on different platform, especially on embedded. With/without f32/f64 FPU.
Benchmark system should support different targets.

@tesuji
Copy link
Author

tesuji commented Nov 25, 2019

Seems like we don't agree the merge this. Closing then.

@tesuji tesuji closed this Nov 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

port newlib's cbrtf
5 participants