How to implement batch norm #89
Comments
Demo implementation:

Because we want the fused op to be used when possible, we need to wrap some RETURNN layer in any case. Because of this, I just wrapped the whole module around the RETURNN layer now. Some of these questions (how to handle/implement custom aux var updates) still remain but are not needed anymore for this case. Also, the remaining case here is basically clear and just needs to be implemented, which can be done once it is needed (maybe for something else). But this is #90. And maybe also #18. So I will close this now.
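As a rough illustration (this is not taken from the actual demo implementation), wrapping the existing layer essentially means the returnn-common module emits a RETURNN layer dict along these lines; the option names are assumptions and would need checking against the BatchNormLayer documentation:

```python
# Illustrative only: roughly what wrapping RETURNN's BatchNormLayer amounts to,
# i.e. the returnn-common module would emit a layer dict like this. The option
# names (momentum, epsilon, ...) are assumptions and need checking against the
# BatchNormLayer documentation.
network = {
  "bn": {
    "class": "batch_norm",  # RETURNN BatchNormLayer
    "from": "data",
    "momentum": 0.99,       # running-statistics update factor (assumed option name)
    "epsilon": 1e-3,        # numerical stability constant (assumed option name)
  },
  "output": {"class": "copy", "from": "bn"},
}
```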
There are a couple of open questions regarding how to implement batch norm using the building blocks of returnn-common. Of course we could also wrap the existing BatchNormLayer in RETURNN (which needs rwth-i6/returnn#891 though), but even if this were the implementation of BatchNorm in returnn-common, the question still remains how to implement it from scratch using the building blocks of returnn-common. In any case, this should be possible, and preferably also in a straightforward way.

One question is how to handle the train flag. This is #18.
Another question is how to do custom updates for the running statistics variables. This is #90.
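To make these two questions a bit more concrete, here is a minimal plain-TensorFlow sketch (not returnn-common code; all names are illustrative) of batch norm from scratch, with the train-flag branching and the explicit running-statistics updates spelled out:

```python
# Minimal TensorFlow sketch (not returnn-common code) of batch norm from scratch.
# It needs a train flag to switch between batch statistics and running statistics,
# and explicit update ops for the running-statistics variables.
# All names here (running_mean, momentum, ...) are illustrative.
import tensorflow as tf


def batch_norm_from_scratch(x, gamma, beta, running_mean, running_var,
                            train_flag, momentum=0.9, epsilon=1e-3):
  """x: e.g. [B, T, F]; normalizes over all axes except the last (feature) axis."""
  reduce_axes = list(range(x.shape.rank - 1))

  def _train():
    mean, var = tf.nn.moments(x, axes=reduce_axes)
    # Custom updates for the running-statistics variables (this is issue #90):
    upd_mean = running_mean.assign(momentum * running_mean + (1. - momentum) * mean)
    upd_var = running_var.assign(momentum * running_var + (1. - momentum) * var)
    with tf.control_dependencies([upd_mean, upd_var]):
      return tf.nn.batch_normalization(x, mean, var, beta, gamma, epsilon)

  def _eval():
    return tf.nn.batch_normalization(x, running_mean, running_var, beta, gamma, epsilon)

  # The train-flag handling (this is issue #18):
  return tf.cond(train_flag, _train, _eval)
```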
Another question is how to make use of the TF fused op, which would be important for efficiency. Specifically, tf.compat.v1.nn.fused_batch_norm with data_format="NCHW".

Related are also the batch norm defaults (#83), although they are not too relevant for the question of how to implement this.
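For reference, a hedged sketch of calling the fused op directly in TF: with data_format="NCHW" the input must be 4-D [N, C, H, W], so a [B, T, F] sequence tensor would need to be reshaped/transposed first. The function names below are just illustrative.

```python
# Sketch of using the fused batch-norm op directly (illustrative wrapper names).
import tensorflow as tf


def fused_bn_train_step(x_nchw, gamma, beta):
  # In training mode, the op returns the normalized output together with the
  # batch mean/variance; the caller must then update the running-statistics
  # variables itself (again issue #90).
  y, batch_mean, batch_var = tf.compat.v1.nn.fused_batch_norm(
    x_nchw, scale=gamma, offset=beta,
    epsilon=1e-3, data_format="NCHW", is_training=True)
  return y, batch_mean, batch_var


def fused_bn_eval(x_nchw, gamma, beta, running_mean, running_var):
  # In inference mode, the stored running statistics are passed in explicitly.
  y, _, _ = tf.compat.v1.nn.fused_batch_norm(
    x_nchw, scale=gamma, offset=beta,
    mean=running_mean, variance=running_var,
    epsilon=1e-3, data_format="NCHW", is_training=False)
  return y
```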