-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
uniformize processor Mllama #33876
uniformize processor Mllama #33876
Conversation
Hey! 🤗 Thanks for your contribution to the Before merging this pull request, slow tests CI should be triggered. To enable this:
(For maintainers) The documentation for slow tests CI on PRs is here. |
self.bos_token = processor.bos_token | ||
self.bos_token_id = processor.tokenizer.bos_token_id | ||
self.tmpdirname = tempfile.mkdtemp() | ||
processor.save_pretrained(self.tmpdirname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the use of self.processor
to this to have a "clean" processor at the beginning of each test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, could be replaced by a fixture with a function
scope then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, could be replaced by a fixture with a function scope then?
Actually my bad, it looks like the setUp
and tearDown
functions are already fixtures with a function scope for unittest
. So it's not necessary to use tempfile.
I don't know then why saving processors to a tempfile is used in the setUp function of so many test_processor files, and why tempfile in general is used in so many unittest setUp function in Transformers.
Unless there is a good reason to use tempfiles, maybe we should think about removing its use in setUp functions in all tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.tmpdirname
is used in ProcessorTesterMixin so I can't remove it here in fact, but maybe we could consider removing it from ProcessorTesterMixin in another PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion on it - it's just convenient to use and makes sure the save/load from files works, can't create race conditions either. Since so many of the transformers utils are designed to work in pair with the hub, I suppose it makes sense to use tempfile
in that regard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, I have no strong opinion on it either, so maybe no need to change it :)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! passing to @qubvel as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try:
from typing import Unpack
except ImportError:
from typing_extensions import Unpack
should be replaced by the import of Unpack
from processing utils
self.bos_token = processor.bos_token | ||
self.bos_token_id = processor.tokenizer.bos_token_id | ||
self.tmpdirname = tempfile.mkdtemp() | ||
processor.save_pretrained(self.tmpdirname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, could be replaced by a fixture with a function
scope then?
class MllamaProcessorTest(unittest.TestCase): | ||
class MllamaProcessorTest(ProcessorTesterMixin, unittest.TestCase): | ||
processor_class = MllamaProcessor | ||
|
||
def setUp(self): | ||
self.checkpoint = "hf-internal-testing/mllama-11b" # TODO: change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not, can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with the comments above, other than that, looks great to me! Thanks for updating it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks for fixing
f23d64f
to
a657967
Compare
* uniformize processor Mllama * nit syntax * nit
What does this PR do?
Adds uniformized processors following #31911 for Mllama.
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@qubvel @molbap