- Not just for unit tests
- Check assumption, raise AssertionError
!python
def normalize_ranges(colname):
# Range of 1-D numpy array we loaded application with
orig_range = get_base_range(colname)
colspan = orig_range['datamax'] - orig_range['datamin']
# User filtered data from GUI
live_data = get_column_data(colname)
live_min = numpy.min(live_data)
live_max = numpy.max(live_data)
ratio = {}
ratio['min'] = (live_min - orig_range['datamin']) / colspan
ratio['max'] = (live_max - orig_range['datamin']) / colspan
return ratio
- Normalize data range to [0 - 1] for a GUI widget
- Read data from model and GUI, return range [0 - 1]
- Incorrect calculation here could go unnoticed, far away from this function
!python
# Base/starting data
>>> age = numpy.array([10.0, 20.0, 30.0, 40.0, 50.0])
>>> height = numpy.array([60.0, 66.0, 72.0, 63.0, 66.0])
>>> normalize_ranges('age')
{'max': 1.0, 'min': 0.0}
# Change current range, simulate user filtering data
>>> age = numpy.array([20.0, 30.0, 40.0, 50.0])
>>> normalize_ranges('age')
{'max': 1.0, 'min': 0.25}
>>> age = numpy.array([-10.0, 20.0, 30.0, 40.0, 50.0])
>>> normalize_ranges('age')
{'max': 1.0, 'min': -0.5}
- Negative numbers, numbers bigger than the actual data
- Clue you into where data is wrong
- Maybe it's correct data, just unexpected
- Maybe it's a specific set of data
.fx: small
!python
def normalize_ranges(colname):
orig_range = get_base_range(colname)
# Max will always be greater than the minimum
colspan = orig_range['datamax'] - orig_range['datamin']
# User filtered data from GUI
live_data = get_column_data(colname)
live_min = numpy.min(live_data)
live_max = numpy.max(live_data)
ratio = {}
# Numbers should always be positive
ratio['min'] = (live_min - orig_range['datamin']) / colspan
ratio['max'] = (live_max - orig_range['datamin']) / colspan
# Should be between [0 - 1]
return ratio
- These comments are a start, but their not executable. If we disobey them nothing happens.
- Too easy to go unnoticed
!python
assert 0.0 <= ratio['min'] <= 1.0, (
'"%s" min (%f) not in [0-1] given (%f) colspan (%f)' % (
colname, ratio['min'], orig_range['datamin'], colspan))
assert 0.0 <= ratio['max'] <= 1.0, (
'"%s" max (%f) not in [0-1] given (%f) colspan (%f)' % (
colname, ratio['max'], orig_range['datamax'], colspan))
- Add asserts to verify we return correct answer, not that user passed in correct data
- Executable documentation
- Close the gap
- Debug info when it counts
- Here's just a high-level view of the benefits we're going to dive into
- Runs alongside production code
- Normal docs are stale
- Higher degree of confidence
- Better than regular docs b/c higher chances of being up to date
- Debugging is hard
- Complain early and often
- Close gap between symptom and cause
- Alert invalid conditions ASAP
- Good messages with parameters/local state
- Invalid assumptions about environment
- Avoid unreproducible bugs
- New use cases
- Documentation oversights
- Add in debug information you'll need BEFORE you need it!
- Remember our assert messages? They showed input params and our state
- Only available in debug mode
- Increased code noise
- No dynamic control
- Use -O to run in optimized mode, turns off asserts for performance
- Default is debug on
- Docs say it's clear to use for debug, not production
- Can't control asserts dynamically, can't assign to debug
- Check return values, not inputs
- Document usage in style guide
- Don't ruin duck-typing
- Checking returns can allow you to complain about inputs if you get wrong answer, so it's a 2 for 1
.fx: small
!python
def normalize_ranges(colname):
assert isinstance(colname, str)
orig_range = get_base_range(colname)
assert orig_range['datamin'] >= 0
assert orig_range['datamax'] >= 0
assert orig_range['datamin'] <= orig_range['datamax']
colspan = orig_range['datamax'] - orig_range['datamin']
assert colspan >= 0, 'Colspan (%f) is negative' % (colspan)
live_data = get_column_data(colname)
assert len(live_data), 'Empty live data'
live_min = numpy.min(live_data)
live_max = numpy.max(live_data)
ratio = {}
ratio['min'] = (live_min - orig_range['datamin']) / colspan
ratio['max'] = (live_max - orig_range['datamin']) / colspan
assert 0.0 <= ratio['min'] <= 1.0
assert 0.0 <= ratio['max'] <= 1.0
return ratio