
Add default tokenizer for gpt_neox (the same as gpt_neo) #1

Open · wants to merge 1 commit into base: neox20b
Conversation

aalok-sathe

tokenization_auto.py was missing a mapping for gpt_neox, so AutoTokenizer initialization for GPT-NeoX failed at runtime:

File ..., in load_tokenizer(model_name_or_path='./gpt-neox-20b', **kwargs={'cache_dir': '.cache/'})
     16 def load_tokenizer(model_name_or_path: str = None, **kwargs) -> AutoTokenizer:
---> 17     return AutoTokenizer.from_pretrained(model_name_or_path, **kwargs)
        model_name_or_path = './gpt-neox-20b'
        kwargs = {'cache_dir': '.cache/'}

File lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py:525, in AutoTokenizer.from_pretrained(cls=<class 'transformers.models.auto.tokenization_auto.AutoTokenizer'>, pretrained_model_name_or_path='./gpt-neox-20b', *inputs=(), **kwargs={'_from_auto': True, 'cache_dir': '.cache/'})
    522         tokenizer_class = tokenizer_class_from_name(tokenizer_class_candidate)
    524     if tokenizer_class is None:
--> 525         raise ValueError(
    526             f"Tokenizer class {tokenizer_class_candidate} does not exist or is not currently imported."
    527         )
    528     return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
    530 # Otherwise we have to be creative.
    531 # if model is an encoder decoder, the encoder tokenizer class is used by default

ValueError: Tokenizer class GPTNeoXTokenizer does not exist or is not currently imported.
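The failure path can be illustrated with a minimal sketch of how AutoTokenizer resolves a tokenizer class from a model-type-to-class-name mapping. The dict and function below are simplified stand-ins for the actual transformers internals, not the library's real code; only the shape of the lookup and the one-line nature of the proposed fix are what matters here:

```python
# Simplified stand-in for the mapping in tokenization_auto.py:
# model type -> (slow tokenizer class name, fast tokenizer class name).
TOKENIZER_MAPPING_NAMES = {
    "gpt2": ("GPT2Tokenizer", "GPT2TokenizerFast"),
    "gpt_neo": ("GPT2Tokenizer", "GPT2TokenizerFast"),
    # Before this PR there was no "gpt_neox" entry, so resolution failed.
}

def resolve_tokenizer_class(model_type: str, use_fast: bool = True) -> str:
    """Mimic the AutoTokenizer lookup that raised the ValueError above."""
    if model_type not in TOKENIZER_MAPPING_NAMES:
        raise ValueError(
            f"Tokenizer class for {model_type} does not exist "
            "or is not currently imported."
        )
    slow, fast = TOKENIZER_MAPPING_NAMES[model_type]
    return fast if use_fast else slow

# The fix this PR proposes: map gpt_neox to the same pair as gpt_neo.
TOKENIZER_MAPPING_NAMES["gpt_neox"] = ("GPT2Tokenizer", "GPT2TokenizerFast")
```

With the entry added, the lookup succeeds; without it, the same ValueError as in the traceback above is raised.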

@zphang (Owner)

zphang commented Apr 21, 2022

Thanks! Actually NeoX doesn't use the GPT2Tokenizer. I'll fix the current PR based on this though.

zphang pushed a commit that referenced this pull request Jun 30, 2022
* chore: initial commit

Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets.

* chore: porting the rest of the modules to tensorflow

did not change the documentation yet, yet to try the playground on the model

* Fix initilizations (#1)

* fix: code structure in few cases.

* fix: code structure to align tf models.

* fix: layer naming, bn layer still remains.

* chore: change default epsilon and momentum in bn.

* chore: styling nits.

* fix: cross-loading bn params.

* fix: regnet tf model, integration passing.

* add: tests for TF regnet.

* fix: code quality related issues.

* chore: added rest of the files.

* minor additions..

* fix: repo consistency.

* fix: regnet tf tests.

* chore: reorganize dummy_tf_objects for regnet.

* chore: remove checkpoint var.

* chore: remove unnecessary files.

* chore: run make style.

* Update docs/source/en/model_doc/regnet.mdx

Co-authored-by: Sylvain Gugger <[email protected]>

* chore: PR feedback I.

* fix: pt test. thanks to @ydshieh.

* New adaptive pooler (#3)

* feat: new adaptive pooler

Co-authored-by: @Rocketknight1

* chore: remove image_size argument.

Co-authored-by: matt <[email protected]>

Co-authored-by: matt <[email protected]>

* Empty-Commit

* chore: remove image_size comment.

* chore: remove playground_tf.py

* chore: minor changes related to spacing.

* chore: make style.

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: amyeroberts <[email protected]>

* chore: refactored __init__.

* chore: copied from -> taken from./g

* adaptive pool -> global avg pool, channel check.

* chore: move channel check to stem.

* pr comments - minor refactor and add regnets to doc tests.

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: NielsRogge <[email protected]>

* minor fix in the xlayer.

* Empty-Commit

* chore: removed from_pt=True.

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: matt <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: NielsRogge <[email protected]>