- * Split up the polynomial into small independent terms to give
- * opportunities for parallel evaluation. The chosen splitting is
- * micro-optimized for Athlons (XP, X64). It costs 2 multiplications
- * relative to Horner's method on sequential machines.
+ * Split up the polynomial into small independent terms to give opportunities for parallel evaluation. The chosen splitting is a micro-optimization for specific hardware, as originally documented in FreeBSD's fdlibm. The splitting costs 2 multiplications relative to Horner's method on sequential machines.
  *
- * We add the small terms from lowest degree up for efficiency on
- * non-sequential machines (the lowest degree termstend to be ready
- * earlier). Apart from this, we don't care about order of
- * operations, and don't need to care since we have precision to
- * spare. However, the chosen splitting is good for accuracy too,
- * and would give results as accurate as Horner's method if the
- * small terms were added from highest degree down.
+ * We add the small terms from lowest degree up for efficiency on non-sequential machines (the lowest degree terms tend to be ready earlier). Apart from this, we don't care about order of operations, and don't need to care since we have precision to spare. However, the chosen splitting is good for accuracy, too, and would give results as accurate as Horner's method if the small terms were added from highest degree down.
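The trade-off described in this comment can be illustrated with a minimal sketch. It assumes an odd polynomial of the form x + x^3 * P(x^2) with six coefficients. The array `C` and the function names `poly_horner` and `poly_split` are hypothetical and chosen only for illustration; they are not taken from the changed file.

```c
/* Hypothetical coefficients, for illustration only (not real kernel constants). */
static const double C[6] = { 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625 };

/*
 * Horner's method for x + x^3 * P(z), z = x*x, where
 * P(z) = C0 + z*(C1 + z*(C2 + z*(C3 + z*(C4 + z*C5)))).
 * 8 multiplications, but each step depends on the previous one.
 */
double poly_horner(double x)
{
    double z = x * x;
    double p = C[5];
    for (int i = 4; i >= 0; i--) {
        p = C[i] + z * p;
    }
    return x + (z * x) * p;
}

/*
 * Split evaluation of the same polynomial.
 * r, t, u, w, and s depend only on x and z, so a machine with several
 * floating-point units can compute them concurrently.
 * 10 multiplications, i.e. 2 more than Horner's method.
 */
double poly_split(double x)
{
    double z = x * x;
    double r = C[4] + z * C[5];   /* highest-degree pair */
    double t = C[2] + z * C[3];   /* middle pair */
    double w = z * z;
    double s = z * x;
    double u = C[0] + z * C[1];   /* lowest-degree pair */
    /* The low-degree part (x + s*u) is formed first, then the rest is added. */
    return (x + s * u) + (s * w) * (t + w * r);
}
```

Both functions compute the same mathematical value. They differ only in the number of multiplications and in the length of the dependency chain, so their results can differ by rounding error in the last bits.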