Eager-style debugging #22

Closed
albertz opened this issue Aug 9, 2021 · 2 comments
Comments

albertz commented Aug 9, 2021

The PyTorch-style API is designed to give the user a simple mental model, allowing eager-like thinking/reasoning about the code and the model definitions. This holds even for recurrent definitions (#16).

For debugging purposes, it would be helpful to also allow eager execution.

This should be optional and off by default (the default would remain graph mode), because eager execution would be considerably less efficient. Enabling it should not change the code behavior at all.

This should be technically possible, because for all definitions / module calls, all values can be computed at the time the Python code runs. Some details on how we do this internally still need to be sorted out, and it is not clear which approach is easiest. E.g.:

  • Really use TF eager mode. This probably requires some changes on the RETURNN side (e.g. replacing tf.placeholder).
  • Implement this purely on returnn-common side.
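To illustrate the difference the two options address, here is a minimal, library-free sketch contrasting graph-style (deferred) evaluation with eager evaluation. All names here are hypothetical illustrations, not returnn-common or TF API:

```python
# Hypothetical sketch: graph mode builds a deferred computation that is only
# evaluated later (analogous to tf.placeholder + session.run), while eager
# mode computes every value at the moment the Python code runs.

class Node:
    """A deferred computation node, as in graph mode (illustrative only)."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def eval(self, feed):
        # Values only become available at an explicit evaluation step,
        # with placeholders resolved via the feed dict.
        args = [i.eval(feed) if isinstance(i, Node) else feed[i]
                for i in self.inputs]
        return self.fn(*args)


# Graph mode: constructing the expression computes nothing yet.
graph_out = Node(lambda a, b: a + b, "x", "y")
print(graph_out.eval({"x": 2, "y": 3}))  # evaluated later -> 5

# Eager mode: every call computes its value immediately, so intermediate
# results can be inspected directly in a debugger.
def eager_add(a, b):
    return a + b

print(eager_add(2, 3))  # -> 5
```

In eager mode, the debugger can stop at any line and show concrete values; in graph mode, only the deferred graph structure exists until evaluation.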
albertz added a commit that referenced this issue Apr 24, 2022

albertz commented Apr 24, 2022

Note: We have a first implementation of this now. See test_nn_debug_eager_mode.py or demo-debug-eager-mode.ipynb for examples.

We really use TF eager mode. The required changes on the RETURNN side are minor so far.

For nn.get_extern_data, we currently create arbitrary random data automatically. This is not configurable yet and probably needs some extension. However, the user can also simply overwrite the data like this:

data = nn.get_extern_data(...)
data.data.placeholder = ...  # reset
...

Alternatively, the user can directly provide data = nn.constant(...) or data = nn.convert_to_tensor(...) with data from some external source, and not use nn.get_extern_data at all when debugging.
Or, for example, example_data.audio.get_sample_batch().
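To show the idea behind the automatic random extern data, here is a small, library-free sketch of generating arbitrary random input for a given shape. This is only illustrative of the concept, not the actual returnn-common implementation; the helper name is hypothetical:

```python
# Hypothetical sketch: build random input data for a given shape, similar in
# spirit to what eager debug mode does for nn.get_extern_data automatically.
import random


def make_random_data(shape, low=-1.0, high=1.0):
    """Recursively build a nested list of random floats with the given shape."""
    if not shape:
        # Scalar case: a single random value in [low, high].
        return random.uniform(low, high)
    return [make_random_data(shape[1:], low, high) for _ in range(shape[0])]


# E.g. a [batch, feature] input of shape (2, 3):
batch = make_random_data((2, 3))
print(len(batch), len(batch[0]))  # -> 2 3
```

A configurable version would presumably let the user control the value range, dtype, and dynamic sequence lengths per data key, which is the extension hinted at above.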

It seems to work. E.g. nn.ConformerEncoder is tested in the sense that some data comes out at the end; it is not verified that this data is actually correct.

Control flow logic (nn.Loop, nn.Cond, etc.) is not implemented yet, but in principle we can add support for that as well.


albertz commented Apr 28, 2022

I guess we can close this for now.

albertz closed this as completed Apr 28, 2022