Implement weight normalization (weight norm). The implementation can look very similar to the one in PyTorch (`torch.nn.utils.weight_norm`). Weight dropout (#100) and other transformations or reparameterizations of weights would probably be implemented in a similar way. The more generic issue is rwth-i6/returnn_common#59.
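
For reference, a minimal sketch of the reparameterization itself, written in plain Python so that it does not depend on any framework. It assumes the usual definition, where each output row of the weight is `w[i] = g[i] * v[i] / ||v[i]||`, with a direction parameter `v` and a per-row gain `g`. The function name `weight_norm` and the parameter names `v` and `g` are chosen here for illustration only; this is not the returnn_common API, and the real implementation would operate on framework tensors instead of lists.

```python
import math
from typing import List, Sequence


def weight_norm(v: Sequence[Sequence[float]], g: Sequence[float]) -> List[List[float]]:
    """Hypothetical helper: return w with w[i] = g[i] * v[i] / ||v[i]||_2.

    v holds the direction parameters (one row per output unit),
    g holds one gain per row. Both would be trainable in a real layer.
    """
    if len(v) != len(g):
        raise ValueError("need exactly one gain per row of v")
    w = []
    for row, gain in zip(v, g):
        # L2 norm of this row; a real implementation may want an epsilon here.
        norm = math.sqrt(sum(x * x for x in row))
        w.append([gain * x / norm for x in row])
    return w


v = [[3.0, 4.0], [1.0, 0.0]]
g = [10.0, 2.0]
w = weight_norm(v, g)
# After the reparameterization, the L2 norm of each row of w equals its gain.
row_norms = [math.sqrt(sum(x * x for x in row)) for row in w]
```

The point of the sketch is that the layer only ever sees `w`, while `v` and `g` are the actual parameters. Weight dropout or other weight transformations would fit the same pattern: a function applied to the stored parameters before the layer uses them.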