Releases: PyThaiNLP/pythainlp-corpus
OSCAR word freq icu v1.0
OSCAR word freq v0.1, built from ICU word tokenization
Authors: Korakot Chaovavanich @korakot
from https://web.facebook.com/groups/colab.thailand/permalink/1524070061101680/?_rdc=1&_rdr
TNC unigram 201712 and bi/tri-gram 201705
This is a mirror of the TNC word frequency data from the Thai National Corpus (TNC).
LST20 CLS v0.2
LST20 v0.2.3
LST20 CLS v0.1
lst20-cls-v0.1 Update db.json
LST20 v0.2.2
- Rename taggers to pos_lst20_unigram and pos_lst20_perceptron, following the convention of other POS taggers in PyThaiNLP
- Minify Unigram JSON file
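As a rough illustration of the minification step, the sketch below re-serializes JSON without insignificant whitespace using only Python's standard library. The sample data and the word-to-tag layout are assumptions made for illustration; the actual structure of the Unigram JSON file and the tool used for this release are not stated here.

```python
import json


def minify_json(text: str) -> str:
    """Re-serialize JSON text without insignificant whitespace."""
    data = json.loads(text)
    # separators=(",", ":") drops the spaces json.dumps adds by default;
    # ensure_ascii=False keeps Thai characters as-is instead of \uXXXX escapes.
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


# Hypothetical sample in a word -> tag shape; the real file's layout may differ.
pretty = '{\n    "แมว": "NN",\n    "กิน": "VV"\n}'
compact = minify_json(pretty)
print(compact)
```

The minified text parses to the same data as the original, so consumers of the file should not need any changes.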
LST20 v0.2
lst20-v0.2 Update db.json
LST20 v0.1
lst20-v0.1 Update LICENSE
wiki_lm_lstm v0.32
- authors : Charin Polpanumas
- description : ULMFit for LSTM
- license : cc-by-sa-4.0
- url : https://github.com/cstorm125/thai2fit/
thai2fit_wv v0.1
- Authors : Charin Polpanumas
- Description : thai2vec word embeddings
- License : cc-by-sa-4.0
- Url : https://github.com/cstorm125/thai2fit/