Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No way to map FP round currently to lassen. #29

Open
rdaly525 opened this issue Apr 26, 2019 · 13 comments · Fixed by #142
Open

No way to map FP round currently to lassen. #29

rdaly525 opened this issue Apr 26, 2019 · 13 comments · Fixed by #142

Comments

@rdaly525
Copy link
Contributor

Being able to do a FP round will give us round, ceil, and floor which are required from the Halide applications.

@rdaly525
Copy link
Contributor Author

To be very clear here, the round function that we need takes in a float and produces a float. It does not cast it into an int.

@nikhilbhagdikar
Copy link
Contributor

Round can be implemented (f is a bloat number)
fi = int(f)
ff = frac(f)
fi = fi + (ff>=0x80)
fr = int2float(fi)

I have added int2float, so this is now doable.

cdonovick added a commit that referenced this issue May 10, 2019
@steveri
Copy link

steveri commented May 29, 2019

@rdaly525 so...looking at the comments...this is resolved?

@rdaly525
Copy link
Contributor Author

The explicit task is the following:

  • Write 'Round' complex op in lassen/stdlib
  • Write a test for Round using hwtypes.FPVector (please do not use float2bfbin/bfbin2float)
  • Write 'Ceil' complex op in lassen/stdlib
  • Write a test for Ceil using hwtypes.FPVector (please do not use float2bfbin/bfbin2float)
  • Write 'Floor' complex op in lassen/stdlib
  • Write a test for Floor using hwtypes.FPVector (please do not use float2bfbin/bfbin2float)

@rdaly525
Copy link
Contributor Author

@nikhilbhagdikar, if you need an example of how to use FPVector, you can refer to tests/test_pe.py

@cdonovick
Copy link
Contributor

I can try to take on round

@cdonovick
Copy link
Contributor

cdonovick commented Jun 24, 2019

We can partially implement round by cascading FCnvInt2F and FGetFint (int2float and float2int). However it only works for a sub range of floats* also it implements round to 0 (floor on positive, ceil on negative). So round(-.7) == round(.7) == 0 and round(1.2) == round(1.7) == 1

*empirically it works (-BOUND, BOUND) for BOUND=215-26-1

I think we could implement a full range round to 0 with the following:

def round(x):
  if exponent(x) > mantissa_size(x):
     return x #x is already an integer
  else:
     return i2f(f2i(x))

However this will take 5 PEs (extract exponent, compare, mux, i2f, f2i) instead of 2.

To get a proper floor we could use the following algorithm:

def floor(x):
   sign = x & (1 << 15) #gets the sign bit
   a_x = abs(x)
   r_x = round(a_x)
   return r_x | sign #sets the sign bit

which takes 3 additional PEs over round (&, abs, |)
ceil would be 1 PE over floor as we could use basically the same procedure but need the negative absolute value.

round to nearest could also be implementable as 1 PE over floor

def round_nearest(x):
  return floor(x+0.5)

Long story short:
round_towards_zero(x) where x is guaranteed to be in some limited range takes 2 PEs
round_towards_zero(x) where x can be anything takes 5 PEs
floor(x) +3 PEs over round_towards_zero
ceil(x) +4 PEs over round_towards_zero
round_nearest(x) +4 PEs over round_towards_zero

@cdonovick
Copy link
Contributor

@nikhilbhagdikar If you have any ideas on how to do any of this with fewer PEs let me know

@rdaly525
Copy link
Contributor Author

@cdonovick, thanks for doing this analysis. @jeffsetter, could you comment on the range issue for halide applications? Specifically, what sets of applications could we get away with using the Bounded round?

@jeffsetter
Copy link
Contributor

*empirically it works (-BOUND, BOUND) for BOUND=2^15-2^6-1

Since the range of bfloat is approximately +- 2^127, this rounding loses many orders of magnitude of the bfloat range. Essentially this renders the rounding to integers and omits rounding of large floating point numbers entirely.

@jeffsetter
Copy link
Contributor

jeffsetter commented Jun 24, 2019

Sorry, I missed that the range does not have to be bounded, but instead takes 5 PEs. I believe we are most likely to use that mapping in most cases.

It is unclear to me if the above strategy is the only one possible for rounding. The algorithm I use in Halide is below, which may be useful in finding a more efficient mapping.
https://github.com/StanfordAHA/Halide-to-Hardware/blob/handcrafted/src/EmulateFloat16Math.cpp#L220

@cdonovick
Copy link
Contributor

@jeffsetter just to be clear we talking about implementing a round operations bfloat->bfloat (round(1.2) == 1.0, round(12.0) == 12.0) where only fractional parts (mantissa bits that have value < 1) are rounded. It looks like you are rounding ieee 754 singles to bfloat which is a very different task.

@cdonovick
Copy link
Contributor

Github is stupid, partially resolves != resolves

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants