
Conversation

@mshannon-sil (Collaborator) commented Sep 3, 2025

Previously, if the translate method was called multiple times, such as when translating a list of books, the model was reloaded on every call. The model is now saved as an instance variable as a form of caching, along with the translate parameters. Whenever a call to translate uses the same parameters for the translation model (ckpt, src_lang, trg_lang), the cached model is reused. If the parameters differ, a new inference model is created and replaces the old cached model.
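A minimal sketch of the caching logic described above; the attribute names _cached_model and _cached_translate_params match the diff under review, while the method _get_translate_model and the loader _load_inference_model are hypothetical names used for illustration:

    def _get_translate_model(self, ckpt, src_lang: str, trg_lang: str):
        # Reuse the cached model only when every parameter that affects
        # model construction is unchanged.
        if self._cached_translate_params == (ckpt, src_lang, trg_lang) and self._cached_model is not None:
            return self._cached_model
        # The parameters differ (or nothing is cached yet): load a fresh
        # model and replace whatever was cached before.
        model = self._load_inference_model(ckpt, src_lang, trg_lang)  # hypothetical loader
        self._cached_model = model
        self._cached_translate_params = (ckpt, src_lang, trg_lang)
        return model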

I tested the fix with a previous experiment in which the model was being reloaded after every book and confirmed that the model is now loaded only once, at the very beginning of the translate step.



@ddaspit (Collaborator) left a comment


@ddaspit reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @benjaminking)


silnlp/nmt/hugging_face_config.py line 1223 at r1 (raw file):

        trg_lang = self._config.data["lang_codes"].get(trg_iso, trg_iso)
        tokenizer = self._config.get_tokenizer()
        if self._cached_translate_params == (ckpt, src_lang, trg_lang) and self._cached_model is not None:

It would be good to clear the cached model once we have finished translating all of the books. This isn't strictly necessary right now, but it might help us avoid issues in the future. Here is how I would suggest doing it:

  1. Add a clear_cache method to the config class.
  2. Update the Translator/NMTTranslator class to be a context manager using AbstractContextManager. Call clear_cache in the __exit__ method.
  3. In TranslationTask, use with blocks whenever we construct a translator. This will ensure that the cache is cleared when we are done translating.
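A rough sketch of the suggested pattern; apart from clear_cache, AbstractContextManager, and __exit__, the names here (NMTTranslator's constructor, translate_book, _create_translator) are assumptions for illustration:

    from contextlib import AbstractContextManager

    class NMTTranslator(AbstractContextManager):
        def __init__(self, model):
            self._model = model

        def __exit__(self, exc_type, exc_value, traceback):
            # Clear the cached model when translation finishes, even if
            # an exception was raised partway through.
            self._model.clear_cache()
            return None  # do not suppress exceptions

    # In TranslationTask (hypothetical call site):
    # with self._create_translator() as translator:
    #     for book in books:
    #         translator.translate_book(book)
    # ...the cache is cleared here automatically.

AbstractContextManager provides a default __enter__ that returns self, so only __exit__ needs to be defined.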

@benjaminking (Collaborator) left a comment


@benjaminking reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mshannon-sil)


silnlp/nmt/hugging_face_config.py line 844 at r1 (raw file):

        self._num_devices = num_devices
        self._cached_model: Optional[PreTrainedModel] = None
        self._cached_translate_params: Optional[Tuple[Union[CheckpointType, str, int], str, str]] = None

When we have complicated type signatures like this, I like to try to factor out a class to represent the type. The class can also have logic that validates the type.
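For example, the tuple could be factored into a small frozen dataclass; the name TranslateParams and its validation are illustrative, not the repo's actual code:

    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class TranslateParams:
        """Parameters that determine which inference model is loaded."""

        ckpt: Union["CheckpointType", str, int]  # CheckpointType comes from silnlp
        src_lang: str
        trg_lang: str

        def __post_init__(self) -> None:
            # Validation can live on the type itself.
            if not self.src_lang or not self.trg_lang:
                raise ValueError("src_lang and trg_lang must be non-empty")

Because the dataclass is frozen, it gets __eq__ and __hash__ for free, so an equality check against the cached parameters keeps working as it does with the tuple.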

@mshannon-sil (Collaborator, Author) commented

silnlp/nmt/hugging_face_config.py line 1223 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It would be good to clear the cached model once we have finished translating all of the books. This isn't strictly necessary right now, but it might help us avoid issues in the future. Here is how I would suggest doing it:

  1. Add a clear_cache method to the config class.
  2. Update the Translator/NMTTranslator class to be a context manager using AbstractContextManager. Call clear_cache in the __exit__ method.
  3. In TranslationTask, use with blocks whenever we construct a translator. This will ensure that the cache is cleared when we are done translating.

This sounds good, and I've started working on it. Could you confirm whether, from a design standpoint, you'd prefer the clear_cache method in the HuggingFaceConfig class or in the HuggingFaceNMTModel class? To put it in the config class, I'd either need to move the cached model into the config class and refactor the translate method (which is not in the config class) to get and set it there, or keep the cached state in HuggingFaceNMTModel and pass a HuggingFaceNMTModel instance into clear_cache so it has access. Putting clear_cache in HuggingFaceNMTModel seems simpler on the surface, but if you'd prefer that the config class own all resource management, or that clear_cache be more broadly applicable, I can definitely move it there.

@ddaspit (Collaborator) left a comment


Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mshannon-sil)


silnlp/nmt/hugging_face_config.py line 1223 at r1 (raw file):

Previously, mshannon-sil wrote…

This sounds good, and I've started working on it. Could you confirm whether, from a design standpoint, you'd prefer the clear_cache method in the HuggingFaceConfig class or in the HuggingFaceNMTModel class? To put it in the config class, I'd either need to move the cached model into the config class and refactor the translate method (which is not in the config class) to get and set it there, or keep the cached state in HuggingFaceNMTModel and pass a HuggingFaceNMTModel instance into clear_cache so it has access. Putting clear_cache in HuggingFaceNMTModel seems simpler on the surface, but if you'd prefer that the config class own all resource management, or that clear_cache be more broadly applicable, I can definitely move it there.

You are correct. The clear_cache method should be added to the HuggingFaceNMTModel class.
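A minimal sketch of such a clear_cache method; the GPU memory release at the end is an assumption, not something discussed in this review:

    import gc

    import torch

    class HuggingFaceNMTModel:
        # ... existing attributes: _cached_model, _cached_translate_params ...

        def clear_cache(self) -> None:
            """Release the cached inference model and its translate parameters."""
            self._cached_model = None
            self._cached_translate_params = None
            # Optionally nudge Python and CUDA to actually free the memory.
            gc.collect()
            if torch.cuda.is_available():
                torch.cuda.empty_cache()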

Successfully merging this pull request may close these issues:

When translating a sequence of text files using Huggingface, the model is loaded multiple times