
Conversation

@mshannon-sil (Collaborator) commented Sep 3, 2025

Previously, if the translate method was called multiple times, such as when translating a list of books, the model was reloaded on every call. The model is now saved as an instance variable as a form of caching, along with the translate parameters. Whenever a call to translate uses the same parameters for the translation model (ckpt, src_lang, trg_lang), the cached model is reused. If the parameters differ, a new inference model is created and replaces the old cached model.
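A minimal sketch of the caching logic described above; the attribute names _cached_model and _cached_translate_params match the diff under review, while the method _get_translate_model and the loader _load_inference_model are hypothetical names used for illustration:

    def _get_translate_model(self, ckpt, src_lang: str, trg_lang: str):
        # Reuse the cached model only when every parameter that affects
        # model construction is unchanged.
        if self._cached_translate_params == (ckpt, src_lang, trg_lang) and self._cached_model is not None:
            return self._cached_model
        # The parameters differ (or nothing is cached yet): load a fresh
        # model and replace whatever was cached before.
        model = self._load_inference_model(ckpt, src_lang, trg_lang)  # hypothetical loader
        self._cached_model = model
        self._cached_translate_params = (ckpt, src_lang, trg_lang)
        return model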

I tested the fix with a previous experiment in which the model was being reloaded after every book and confirmed that the model is now loaded only once, at the very beginning of the translate step.



@ddaspit (Collaborator) left a comment


@ddaspit reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @benjaminking)


silnlp/nmt/hugging_face_config.py line 1223 at r1 (raw file):

        trg_lang = self._config.data["lang_codes"].get(trg_iso, trg_iso)
        tokenizer = self._config.get_tokenizer()
        if self._cached_translate_params == (ckpt, src_lang, trg_lang) and self._cached_model is not None:

It would be good to clear the cached model once we have finished translating all of the books. This isn't strictly necessary right now, but it might help us avoid issues in the future. Here is how I would suggest doing it:

  1. Add a clear_cache method to the config class.
  2. Update the Translator/NMTTranslator class to be a context manager using AbstractContextManager. Call clear_cache in the __exit__ method.
  3. In TranslationTask, use with blocks whenever we construct a translator. This will ensure that the cache is cleared when we are done translating.
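A rough sketch of the suggested pattern; apart from clear_cache, AbstractContextManager, and __exit__, the names here (NMTTranslator's constructor, translate_book, _create_translator) are assumptions for illustration:

    from contextlib import AbstractContextManager

    class NMTTranslator(AbstractContextManager):
        def __init__(self, model):
            self._model = model

        def __exit__(self, exc_type, exc_value, traceback):
            # Clear the cached model when translation finishes, even if
            # an exception was raised partway through.
            self._model.clear_cache()
            return None  # do not suppress exceptions

    # In TranslationTask (hypothetical call site):
    # with self._create_translator() as translator:
    #     for book in books:
    #         translator.translate_book(book)
    # ...the cache is cleared here automatically.

AbstractContextManager provides a default __enter__ that returns self, so only __exit__ needs to be defined.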

@benjaminking (Collaborator) left a comment


@benjaminking reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mshannon-sil)


silnlp/nmt/hugging_face_config.py line 844 at r1 (raw file):

        self._num_devices = num_devices
        self._cached_model: Optional[PreTrainedModel] = None
        self._cached_translate_params: Optional[Tuple[Union[CheckpointType, str, int], str, str]] = None

When we have complicated type signatures like this, I like to try to factor out a class to represent the type. The class can also have logic that validates the type.
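For example, the tuple could be factored into a small frozen dataclass; the name TranslateParams and its validation are illustrative, not the repo's actual code:

    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class TranslateParams:
        """Parameters that determine which inference model is loaded."""

        ckpt: Union["CheckpointType", str, int]  # CheckpointType comes from silnlp
        src_lang: str
        trg_lang: str

        def __post_init__(self) -> None:
            # Validation can live on the type itself.
            if not self.src_lang or not self.trg_lang:
                raise ValueError("src_lang and trg_lang must be non-empty")

Because the dataclass is frozen, it gets __eq__ and __hash__ for free, so an equality check against the cached parameters keeps working as it does with the tuple.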

@mshannon-sil (Collaborator, Author) commented

silnlp/nmt/hugging_face_config.py line 1223 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It would be good to clear the cached model once we have finished translating all of the books. This isn't strictly necessary right now, but it might help us avoid issues in the future. Here is how I would suggest doing it:

  1. Add a clear_cache method to the config class.
  2. Update the Translator/NMTTranslator class to be a context manager using AbstractContextManager. Call clear_cache in the __exit__ method.
  3. In TranslationTask, use with blocks whenever we construct a translator. This will ensure that the cache is cleared when we are done translating.

This sounds good, and I've started working on it. Could you confirm whether, from a design standpoint, you'd prefer the clear_cache method in the HuggingFaceConfig class or in the HuggingFaceNMTModel class? To put it in the config class, I'd either need to move the cached model into the config class and refactor the translate method (which is not in the config class) to get and set it there, or keep the cached state in HuggingFaceNMTModel and pass a HuggingFaceNMTModel instance into clear_cache so it has access. Putting clear_cache in HuggingFaceNMTModel seems simpler on the surface, but if you'd prefer that the config class own all resource management, or that clear_cache be more broadly applicable, I can definitely move it there.

@ddaspit (Collaborator) left a comment


Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mshannon-sil)


silnlp/nmt/hugging_face_config.py line 1223 at r1 (raw file):

Previously, mshannon-sil wrote…

This sounds good, and I've started working on it. Could you confirm whether, from a design standpoint, you'd prefer the clear_cache method in the HuggingFaceConfig class or in the HuggingFaceNMTModel class? To put it in the config class, I'd either need to move the cached model into the config class and refactor the translate method (which is not in the config class) to get and set it there, or keep the cached state in HuggingFaceNMTModel and pass a HuggingFaceNMTModel instance into clear_cache so it has access. Putting clear_cache in HuggingFaceNMTModel seems simpler on the surface, but if you'd prefer that the config class own all resource management, or that clear_cache be more broadly applicable, I can definitely move it there.

You are correct. The clear_cache method should be added to the HuggingFaceNMTModel class.
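A minimal sketch of such a clear_cache method; the GPU memory release at the end is an assumption, not something discussed in this review:

    import gc

    import torch

    class HuggingFaceNMTModel:
        # ... existing attributes: _cached_model, _cached_translate_params ...

        def clear_cache(self) -> None:
            """Release the cached inference model and its translate parameters."""
            self._cached_model = None
            self._cached_translate_params = None
            # Optionally nudge Python and CUDA to actually free the memory.
            gc.collect()
            if torch.cuda.is_available():
                torch.cuda.empty_cache()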

Successfully merging this pull request may close these issues:

When translating a sequence of text files using Huggingface, the model is loaded multiple times