Conversation

@mh-northlander
fix #335.

From transformers>=4.34, PreTrainedTokenizer.__init__ requires self.vocab to already be set.
Move the super(...).__init__ call to the end of JanomeSubwordsTokenizer.__init__ (and use the unk_token argument instead of self.unk_token before init), following the changes made to BertTokenizer at that time.

Also move the call to self.add_tokens, since it requires super(...).__init__ to have completed.
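A minimal, self-contained sketch of the pattern this PR applies (the class and method names below stand in for the real transformers API and are illustrative, not the actual JanomeSubwordsTokenizer code): the base __init__ consumes self.vocab immediately, so the subclass must populate vocab first, use the local unk_token argument before init, call super().__init__ last, and only then call self.add_tokens.

```python
class FakePreTrainedTokenizer:
    """Stand-in for PreTrainedTokenizer >= 4.34 (illustrative only)."""

    def __init__(self, unk_token="[UNK]", **kwargs):
        self.unk_token = unk_token
        # Mimics transformers>=4.34: the base __init__ reads the
        # subclass-provided vocab right away. If the subclass called
        # super().__init__ before setting self.vocab, this raises
        # AttributeError -- the bug fixed by this PR.
        self.unk_token_id = self.vocab[unk_token]

    def add_tokens(self, tokens):
        # Requires the base __init__ to have completed.
        for t in tokens:
            self.vocab.setdefault(t, len(self.vocab))


class FixedSubwordsTokenizer(FakePreTrainedTokenizer):
    """Illustrates the corrected init order from this PR."""

    def __init__(self, vocab, unk_token="[UNK]", extra_tokens=()):
        # 1. Set self.vocab BEFORE the base __init__ runs.
        self.vocab = dict(vocab)
        # 2. Before init, use the local `unk_token` argument, not
        #    self.unk_token (which only exists after super().__init__).
        if unk_token not in self.vocab:
            raise ValueError(f"{unk_token!r} not in vocab")
        # 3. Call super().__init__ at the END of __init__ ...
        super().__init__(unk_token=unk_token)
        # 4. ... and only then call self.add_tokens.
        self.add_tokens(list(extra_tokens))


tok = FixedSubwordsTokenizer({"[UNK]": 0, "hello": 1}, extra_tokens=["world"])
```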

Development

Successfully merging this pull request may close these issues.

AttributeError: 'JanomeSubwordsTokenizer' object has no attribute 'vocab'