RRisto/sentencepiece_experiments

SentencePiece tokenizer tests

Testing the SentencePiece tokenizer on the Estonian language.

Experiment

  • Run 1.0_train_tokenizers_risto.ipynb to train SentencePiece tokenizers with different parameters.
  • Run 2.0_train_sklearn_models_risto.ipynb to test the tokenizers in sklearn text classification.
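
The notebooks themselves are not reproduced here. As a rough illustration of the two steps, the sketch below trains one SentencePiece model per parameter combination and then plugs a trained model into an sklearn text classifier. The parameter values, the `sp_<type>_<size>` file prefix, the TF-IDF plus logistic regression pipeline, and all function names are assumptions made for this example, not the settings used in the notebooks.

```python
from itertools import product


def parameter_grid(vocab_sizes, model_types):
    """Return one settings dict per (vocab_size, model_type) combination."""
    return [
        {"vocab_size": v, "model_type": m, "model_prefix": f"sp_{m}_{v}"}
        for v, m in product(vocab_sizes, model_types)
    ]


def train_tokenizers(corpus_path, grid):
    """Train one SentencePiece model per grid entry; return the .model paths."""
    # Imported lazily so parameter_grid works without sentencepiece installed.
    import sentencepiece as spm

    model_files = []
    for params in grid:
        # corpus_path is a plain-text file with one sentence per line.
        spm.SentencePieceTrainer.train(input=corpus_path, **params)
        model_files.append(params["model_prefix"] + ".model")
    return model_files


def build_classifier(model_file):
    """TF-IDF over SentencePiece pieces, followed by logistic regression."""
    import sentencepiece as spm
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    sp = spm.SentencePieceProcessor(model_file=model_file)
    vectorizer = TfidfVectorizer(
        tokenizer=lambda text: sp.encode(text, out_type=str),
        lowercase=False,      # keep the original casing of the pieces
        token_pattern=None,   # unused when a custom tokenizer is supplied
    )
    return make_pipeline(vectorizer, LogisticRegression(max_iter=1000))
```

With a hypothetical `corpus.txt` of Estonian sentences, `train_tokenizers("corpus.txt", parameter_grid([4000, 8000], ["unigram", "bpe"]))` would write four models, and each returned path can be passed to `build_classifier` and fitted on labelled texts to compare tokenizer settings.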
