Generalized constraints: post update hooks

Currently our implemented constraints are:

- L2 on weights (`L2` option on a layer)
- Some exotic things on activations (`darc1`, `spatial_smoothing`)

We can already decouple the constraints from the normal loss computation via `decouple_constraints`. https://github.com/rwth-i6/returnn/pull/1206 will change this behavior slightly, so that only the data-independent constraints are decoupled, which currently means only L2.

L2 is equivalent to weight decay when plain SGD is used. With the new decoupled constraints code (#1206), the update is done explicitly:
```python
return var.assign_sub(var * (l2 * 2.), use_locking=self.use_locking, read_value=False)
```
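To make the equivalence concrete, here is a minimal, framework-free sketch in plain Python (not the RETURNN code). It assumes plain SGD, a decay factor scaled by the learning rate, and a decay term computed from the pre-update weights. Whether and how #1206 applies the learning rate is not visible in the snippet above, so treat `lr` here as an assumption. The function names are made up for illustration.

```python
def sgd_step_with_l2_in_loss(w, grad, lr, l2):
    """One SGD step where l2 * sum(w**2) is part of the loss.

    The L2 term contributes 2 * l2 * w to the gradient.
    """
    return [wi - lr * (gi + 2.0 * l2 * wi) for wi, gi in zip(w, grad)]


def sgd_step_with_decoupled_decay(w, grad, lr, l2):
    """One SGD step on the data gradient only, plus a separate decay term.

    Assumption: the decay is computed from the pre-update weights
    and scaled by the learning rate.
    """
    decay = [lr * 2.0 * l2 * wi for wi in w]
    return [wi - lr * gi - di for wi, gi, di in zip(w, grad, decay)]


w = [1.0, -2.0, 0.5]
grad = [0.1, 0.2, -0.3]
coupled = sgd_step_with_l2_in_loss(w, grad, lr=0.1, l2=0.01)
decoupled = sgd_step_with_decoupled_decay(w, grad, lr=0.1, l2=0.01)
```

Under these assumptions both variants give the same weights up to floating-point rounding. With adaptive optimizers such as Adam the two generally differ, which is one reason to decouple the constraint.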

We can generalize such updates, and allow the user to perform some generic post updates on parameters.

For example, in https://github.com/rwth-i6/returnn_common/issues/241 it was suggested to extend L2 with a `decay_center` option. But instead of adding such an L2-specific option, we can allow the user to perform any custom post updates, similar to the code above. Then the user could easily implement such `decay_center` logic, and many other things as well.
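As a rough illustration of the idea, the following plain-Python sketch shows what a generic post-update hook could look like. It is not a proposed RETURNN API: `PostUpdateHook`, `weight_decay`, `clip_abs`, and `sgd_update` are hypothetical names, and parameters are plain lists of floats to keep the example self-contained.

```python
from typing import Callable, List, Sequence

Params = List[float]
# Hypothetical hook type: takes the just-updated parameter, returns the new value.
PostUpdateHook = Callable[[Params], Params]


def weight_decay(rate: float, center: float = 0.0) -> PostUpdateHook:
    """Pull every value towards `center` by the fraction `rate`.

    With center=0 and rate=2*l2 this matches `var -= var * (l2 * 2.)`.
    """
    def hook(w: Params) -> Params:
        return [wi - rate * (wi - center) for wi in w]
    return hook


def clip_abs(max_abs: float) -> PostUpdateHook:
    """A different kind of post update: clip every value to [-max_abs, max_abs]."""
    def hook(w: Params) -> Params:
        return [max(-max_abs, min(max_abs, wi)) for wi in w]
    return hook


def sgd_update(w: Params, grad: Params, lr: float,
               post_update_hooks: Sequence[PostUpdateHook] = ()) -> Params:
    """Plain SGD step, then all hooks in the given order."""
    w = [wi - lr * gi for wi, gi in zip(w, grad)]
    for hook in post_update_hooks:
        w = hook(w)
    return w


# Zero gradient, so only the hook changes the values.
centered = sgd_update([1.0, 3.0], [0.0, 0.0], lr=0.1,
                      post_update_hooks=[weight_decay(0.5, center=1.0)])
# centered is [1.0, 2.0]: values move halfway towards the center 1.0.
```

The point of the sketch is that `decay_center` needs no special support once arbitrary post updates are allowed: it is just one hook among others, and hooks can be chained.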

Also related: https://github.com/rwth-i6/returnn_common/issues/90

What would the API look like on the RETURNN side? It may also be fine to support this only for the `VariableLayer`.

