Currently, loss-based checkpointing uses the training batch loss reported at the end of each epoch.
When training is run with validation, checkpointing should instead use the loss value reported by `validate_batch`. When validation is not used, the current behaviour should be preserved.
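A minimal sketch of the proposed selection logic, assuming hypothetical names (`checkpoint_metric`, `BestLossCheckpointer`, `end_of_epoch`) that stand in for whatever the codebase actually uses:

```python
def checkpoint_metric(train_loss, val_loss=None):
    """Return the loss the checkpointer should compare against.

    If validation ran this epoch (val_loss is not None), prefer the
    validation loss; otherwise fall back to the training loss, which
    matches the current behaviour.
    """
    return val_loss if val_loss is not None else train_loss


class BestLossCheckpointer:
    """Save a checkpoint whenever the chosen loss improves."""

    def __init__(self):
        self.best = float("inf")
        self.saved_epochs = []  # stand-in for actually writing weights

    def end_of_epoch(self, epoch, train_loss, val_loss=None):
        loss = checkpoint_metric(train_loss, val_loss)
        if loss < self.best:
            self.best = loss
            self.saved_epochs.append(epoch)


ckpt = BestLossCheckpointer()
ckpt.end_of_epoch(0, train_loss=1.0, val_loss=0.9)  # val loss improves: save
ckpt.end_of_epoch(1, train_loss=0.5, val_loss=1.2)  # val loss worse: no save
ckpt.end_of_epoch(2, train_loss=0.4)                # no validation: train loss used
```

The key point is that the comparison metric is chosen per epoch, so runs that mix validated and non-validated epochs still behave sensibly.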