Implement New SP Boosting Rules #1164
Conversation
For permanences, use an EPSILON. For boosting, round the boost factor to a couple decimal places. Floating point differences have a larger effect on boosting because it's multiplicative.
Avoid floating point differences with Python SpatialPooler
By analyzing the blame information on this pull request, we identified @scottpurdy, @rcrowder and @mrcslws to be potential reviewers
{
if (minActiveDutyCycles_[i] <= 0)
vector<Real> targetDensity(numColumns_, 0);
This vector is really only needed in the local inhibition case. Maybe it'd be better to do the boostFactor computation in two places, and avoid creating this vector every time step in the "global inhibition" case?
In other words, there will be two different lines:
// Global inhibition
Real boostFactor = exp(-(activeDutyCycles_[i] - density) * maxBoost_);
// Local inhibition
Real boostFactor = exp(-(activeDutyCycles_[i] - targetDensity[i]) * maxBoost_);
{
continue;
Real density = localAreaDensity_;
I think it'd be more readable (and infinitesimally faster) to handle the two cases in if/else blocks.
Real density;
if (numActiveColumnsPerInhArea_ > 0)
{
// ...
}
else
{
density = localAreaDensity_;
}
{
UInt numNeighbors = 0;
Real localActivityDensity = 0;
for (UInt neighbor : WrappingNeighborhood(i, inhibitionRadius_,
Should this obey the wrapAround_ parameter? I.e., I think there should be one case that uses the Neighborhood and the other that uses the WrappingNeighborhood.
It might be helpful to have a …
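A minimal sketch of what that branch might look like, assuming both Neighborhood and WrappingNeighborhood take the same (center, radius, dimensions) arguments shown in this hunk, and reusing the accumulation implied by the surrounding loop body:

// Sketch only: choose the neighborhood type from the wrapAround_ parameter.
if (wrapAround_)
{
  for (UInt neighbor : WrappingNeighborhood(i, inhibitionRadius_,
                                            columnDimensions_))
  {
    localActivityDensity += activeDutyCycles_[neighbor];
    numNeighbors += 1;
  }
}
else
{
  for (UInt neighbor : Neighborhood(i, inhibitionRadius_,
                                    columnDimensions_))
  {
    localActivityDensity += activeDutyCycles_[neighbor];
    numNeighbors += 1;
  }
}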
@mrcslws I have included your comments and implements …
I think this will be my final feedback. Note that I'm not qualified to review my own EPSILON / round changes. If you merge those, you're my reviewer. :)
boostFactors_[i] = 1.0;
continue;
for (UInt neighbor : Neighborhood(i, inhibitionRadius_,
columnDimensions_))
Nit: spacing. columnDimensions_ should line up with i.
}
boostFactors_[i] = ((1 - maxBoost_) / minActiveDutyCycles_[i] *
activeDutyCycles_[i]) + maxBoost_;
targetDensity[i] = localActivityDensity / numNeighbors;
I'm now realizing that this vector doesn't need to exist here either. We could just combine the 2 loops and avoid having to create this vector.
In other words, we could replace this line with:
Real targetDensity = localActivityDensity / numNeighbors;
Real boostFactor = exp(-(activeDutyCycles_[i] - targetDensity) * maxBoost_);
// Avoid floating point mismatches between implementations.
boostFactors_[i] = round(boostFactor * 100.0) / 100.0;
Note that it makes sense to use the array of targetDensities in Python, since that allows us to do batch numpy operations rather than doing math in Python, which is slow. But in C we don't get any benefit from the vector (that I can see).
Looks good! I'm curious to know what you / others think of this floating point strategy, having a PERMANENCE_EPSILON for permanences and rounding boost factors to the nearest hundredth.
Done with first pass.
@@ -813,7 +815,7 @@ void SpatialPooler::updatePermanencesForColumn_(vector<Real>& perm,
   numConnected = 0;
   for (UInt i = 0; i < perm.size(); ++i)
   {
-    if (perm[i] >= synPermConnected_)
+    if (perm[i] >= synPermConnected_ - PERMANENCE_EPSILON)
I don't understand this. If the two values are exactly equal (the use of >= instead of > makes me think that is significant) then the epsilon will result in the wrong outcome. If there is some case where different platforms have a slight difference then I don't even see how this would help.
As we discussed, you can think of floating point math as creating a bell curve of possible results. If the "correct" answer is 0.5, the floating point math might result in numbers between 0.4999996 and 0.5000004. With this EPSILON, we move the threshold so that it sits on one side of the entire bell curve, so results will be consistent no matter where it landed in the bell curve.
I do think >= makes sense, because then the code still works correctly if PERMANENCE_EPSILON is set to 0.
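A small standalone sketch of that bell-curve argument (the 0.5 threshold and the 0.000001 epsilon below are illustrative values, not the library's actual constants):

#include <cstdio>

int main()
{
  const float synPermConnected = 0.5f;          // illustrative threshold
  const float PERMANENCE_EPSILON = 0.000001f;   // illustrative epsilon

  // Two results that both "mean" 0.5 but sit on opposite sides of it
  // because of platform-dependent floating point noise.
  const float permA = 0.4999996f;
  const float permB = 0.5000004f;

  // Plain comparison: the two platforms disagree about connectedness.
  std::printf("plain:   A=%d B=%d\n",
              permA >= synPermConnected,
              permB >= synPermConnected);

  // Epsilon-shifted threshold: both land on the same (connected) side.
  std::printf("epsilon: A=%d B=%d\n",
              permA >= synPermConnected - PERMANENCE_EPSILON,
              permB >= synPermConnected - PERMANENCE_EPSILON);
  return 0;
}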
void SpatialPooler::updateBoostFactorsGlobal_()
{
Real targetDensity;
I think it is better to use Real32. We don't need 64-bit precision, so it's better to get deterministic results with an explicit number of bits.
A good thing about using Real is that the Python code can then use GetNTAReal() if it wants to use the same precision in its numpy arrays.
Real targetDensity;
if (numActiveColumnsPerInhArea_ > 0)
{
UInt inhibitionArea = pow((Real) (2 * inhibitionRadius_ + 1),
Same, use explicit UInt32.
(Real) columnDimensions_.size());
inhibitionArea = min(inhibitionArea, numColumns_);
targetDensity = ((Real) numActiveColumnsPerInhArea_) / inhibitionArea;
targetDensity = min(targetDensity, (Real) 0.5);
Why the max value? Could use a comment explaining it.
This is inherited from the inhibitColumns_ function. I actually don't have a good explanation for this logic. Would it be better if we do a parameter check during initialization and throw an error if the targetDensity > 0.5?
Hmm, hard to say. @mrcslws?
Here's what I presume. For local inhibition, the inhibitionRadius_ isn't really a parameter; it changes with the statistics of the data. So if you're using the numActiveColumnsPerInhArea_ parameter, you can accidentally wind up in situations where you're activating way more columns than you want, because the inhibition areas are small.
This does not apply to global inhibition unless someone goes in and manually changes the inhibition radius. ::initialize will set this radius to cover the whole space. So in this code the inhibition radius is predictable, and I don't think we should perform the min check here.
Similarly, I would argue that this logic in inhibitColumns_ should be moved to inhibitColumnsLocal_, and inhibitColumnsGlobal_ should just obey the parameters. Though maybe that's out of scope for this change.
Ultimately this is all a hack that exists because the numActiveColumnsPerInhArea parameter is awkward when mixed with local inhibition. With local inhibition it's probably best to use the localAreaDensity parameter instead.
I'm fine with whatever you guys agree on.
I agree with @mrcslws and I think this change is out of scope of this PR. Marcus, can you create a separate issue for this and close this PR?
Cool, I opened numenta/nupic-legacy#3420
for (UInt i = 0; i < numColumns_; ++i)
{
Real boostFactor = exp(-(activeDutyCycles_[i] - targetDensity)
Would it be simpler to replace:
-(activeDutyCycles_[i] - targetDensity)
with:
(targetDensity - activeDutyCycles_[i])
And why the exp? I understand the subtraction (figure out if the duty cycle is higher or lower than the target density) and the multiplication (scale the difference from the target density from a fraction to the magnitude scale specified by max boost), but I don't understand the exp after that.
The exp ensures several things. First, the boost factors are always positive. Second, the boost factor will be one if the activeDutyCycle matches the targetDensity. Third, it is monotonic and continuous, so weak columns are boosted and strong columns are suppressed. There are other functions that satisfy these three properties, but I prefer exp for its simplicity.
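Restating that with the formula from the hunk above, writing $a_i$ for a column's active duty cycle, $d$ for the target density, and $\beta$ for maxBoost_:

$$ b_i = \exp\!\left(-(a_i - d)\,\beta\right) $$

Then $b_i > 0$ always, $b_i = 1$ exactly when $a_i = d$, and $b_i$ falls monotonically as $a_i$ rises, so under-active columns ($a_i < d$) get $b_i > 1$ and over-active columns get $b_i < 1$.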
* maxBoost_);

// Avoid floating point mismatches between implementations.
boostFactors_[i] = round(boostFactor * 100.0) / 100.0;
One way to avoid float precision errors is to just use integers instead. It's a little less readable since you have to read comments to understand what the scale means, but it avoids lines like this. I wouldn't recommend making this fairly large refactor in this PR, just pointing it out as a perhaps cleaner way to implement things.
Also, you might be able to avoid this line if you use the fixed-precision variable types (UInt32, Real32).
Keep in mind that none of these changes from UInt to UInt32 or Real to Real32 will have any effect. It's only when we build with NTA_BIG_INTEGER or NTA_DOUBLE_PRECISION that these become UInt64 / Real64, respectively.
@@ -944,11 +944,11 @@ namespace nupic

     The column is identified by its index, which reflects the row in
     the matrix, and the permanence is given in 'dense' form, i.e. a full
-    arrray containing all the zeros as well as the non-zero values. It is in
+    array containing all the zeros as well as the non-zero values. It is in
I prefer the pirate speak
Fixes #1155