Prevent model from being loaded multiple times if translate method is called multiple times #801
base: master
Conversation
@ddaspit reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @benjaminking)
silnlp/nmt/hugging_face_config.py
line 1223 at r1 (raw file):
```python
trg_lang = self._config.data["lang_codes"].get(trg_iso, trg_iso)
tokenizer = self._config.get_tokenizer()
if self._cached_translate_params == (ckpt, src_lang, trg_lang) and self._cached_model is not None:
```
It would be good to clear the cached model once we have finished translating all of the books. This isn't strictly necessary right now, but it might help us avoid issues in the future. Here is how I would suggest doing it (see the sketch after this list):
- Add a `clear_cache` method to the config class.
- Update the `Translator`/`NMTTranslator` class to be a context manager using `AbstractContextManager`. Call `clear_cache` in the `__exit__` method.
- In `TranslationTask`, use `with` blocks whenever we construct a translator. This will ensure that the cache is cleared when we are done translating.
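A minimal sketch of the suggested pattern, assuming the names above; the constructor arguments and the `translate_book` call are illustrative placeholders, not from the PR:

```python
from contextlib import AbstractContextManager


class NMTTranslator(AbstractContextManager):
    def __init__(self, model) -> None:
        self._model = model  # a model object exposing a clear_cache method

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Clear the cached model when the with block exits, even on error.
        self._model.clear_cache()


# In TranslationTask, construct the translator inside a with block:
# with NMTTranslator(model) as translator:
#     for book in books:
#         translator.translate_book(book)  # hypothetical translate call
```

`AbstractContextManager` supplies a default `__enter__` that returns `self`, so only `__exit__` needs to be implemented, and returning `None` lets any exception propagate after the cache is cleared.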
@benjaminking reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mshannon-sil)
silnlp/nmt/hugging_face_config.py
line 844 at r1 (raw file):
```python
self._num_devices = num_devices
self._cached_model: Optional[PreTrainedModel] = None
self._cached_translate_params: Optional[Tuple[Union[CheckpointType, str, int], str, str]] = None
```
When we have complicated type signatures like this, I like to try to factor out a class to represent the type. The class can also have logic that validates the type.
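For example, a minimal sketch of such a class; the name `TranslateParams` and the validation logic are illustrative, not from the PR:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TranslateParams:
    """Cache key identifying the model configuration for a translate call."""

    ckpt: Union[str, int]  # could also allow CheckpointType, as in the PR's signature
    src_lang: str
    trg_lang: str

    def __post_init__(self) -> None:
        # Fail fast on an invalid cache key instead of caching a bad entry.
        if not self.src_lang or not self.trg_lang:
            raise ValueError("src_lang and trg_lang must be non-empty")
```

Because the dataclass is frozen, instances compare by value, so the cached-parameters check stays a simple equality test while carrying a named, validated type instead of a bare tuple.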
Previously, ddaspit (Damien Daspit) wrote…
> When we have complicated type signatures like this, I like to try to factor out a class to represent the type.

This sounds good, and I've started working on it. Can you just confirm whether you'd prefer the `clear_cache` method in the `HuggingFaceConfig` class or the `HuggingFaceNMTModel` class from a design standpoint?
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mshannon-sil)
silnlp/nmt/hugging_face_config.py
line 1223 at r1 (raw file):
Previously, mshannon-sil wrote…
> This sounds good, and I've started working on it. Can you just confirm whether you'd prefer the `clear_cache` method in the `HuggingFaceConfig` class or the `HuggingFaceNMTModel` class from a design standpoint? To put it in the config class, I think I'd either need to move the cached model to the config class and then do some refactoring to get/set the cached model in the translate method, which is not in the config class, or keep the parameters in `HuggingFaceNMTModel` but pass in a `HuggingFaceNMTModel` instance as a parameter to `clear_cache` so it has access. It seems simpler on the surface to put `clear_cache` in `HuggingFaceNMTModel`, but if it's preferable for the config class to do everything related to resource management, or for the `clear_cache` method to be more broadly applicable, I can definitely move it there.
You are correct. The `clear_cache` method should be added to the `HuggingFaceNMTModel` class.
Previously, if the translate method was called multiple times, such as when translating a list of books, the model was reloaded each time. The model is now saved as an instance variable as a form of caching, along with the translate parameters. Whenever a call to translate uses the same parameters (ckpt, src_lang, trg_lang) for the translation model, the cached model is reused; if the parameters differ, a new inference model is created and replaces the old cached model.
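A minimal sketch of the caching pattern described above, under these assumptions: `_cached_model`, `_cached_translate_params`, and `clear_cache` follow the review discussion, while `_load_model` and the `translate` signature are illustrative placeholders:

```python
from typing import Optional, Tuple, Union

from transformers import PreTrainedModel


class HuggingFaceNMTModel:
    def __init__(self) -> None:
        self._cached_model: Optional[PreTrainedModel] = None
        self._cached_translate_params: Optional[Tuple[Union[str, int], str, str]] = None

    def _load_model(self, ckpt: Union[str, int]) -> PreTrainedModel:
        raise NotImplementedError  # placeholder for the actual checkpoint loading

    def translate(self, ckpt: Union[str, int], src_lang: str, trg_lang: str) -> None:
        params = (ckpt, src_lang, trg_lang)
        if self._cached_translate_params == params and self._cached_model is not None:
            model = self._cached_model  # same parameters: reuse the loaded model
        else:
            model = self._load_model(ckpt)  # new parameters: load a fresh model
            self._cached_model = model
            self._cached_translate_params = params
        # ... run inference with model ...

    def clear_cache(self) -> None:
        # Release the cached model and its parameters once translation is done.
        self._cached_model = None
        self._cached_translate_params = None
```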
I tested the fix with a previous experiment where the model was being reloaded after every book and confirmed that it now only loads the model at the very beginning of the translate step.