PyThaiNLP v3.1.0-beta0
Pre-release
This is the beta version for PyThaiNLP v3.1.
You can install it with: pip install --pre pythainlp==3.1.0b0
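The release is tagged v3.1.0-beta0, while the pip pin is written 3.1.0b0. These are the same version under PEP 440 normalization. The sketch below illustrates that mapping with a hypothetical helper, normalize_prerelease, which is not part of PyThaiNLP or pip and only handles simple "release plus pre-release label" tags:

```python
import re

# Hypothetical helper for illustration only; it covers the simple
# "<release>[sep]<label>[sep][number]" shape, not all of PEP 440.
_PRE_RELEASE = re.compile(
    r"^v?(\d+(?:\.\d+)*)[-._]?(alpha|a|beta|b|rc|c|preview|pre)[-._]?(\d+)?$"
)

# PEP 440 canonical spellings for pre-release labels.
_CANONICAL = {
    "alpha": "a", "a": "a",
    "beta": "b", "b": "b",
    "rc": "rc", "c": "rc", "preview": "rc", "pre": "rc",
}


def normalize_prerelease(tag: str) -> str:
    """Turn a tag such as 'v3.1.0-beta0' into a pip-style pin such as '3.1.0b0'."""
    match = _PRE_RELEASE.match(tag.strip().lower())
    if match is None:
        raise ValueError(f"not a simple pre-release tag: {tag!r}")
    release, label, number = match.groups()
    # A missing pre-release number is treated as 0.
    return f"{release}{_CANONICAL[label]}{number or 0}"


print(normalize_prerelease("v3.1.0-beta0"))
```

Because the version carries a pre-release segment, pip skips it by default unless --pre is given or the exact version is pinned.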
Documentation: https://pythainlp.github.io/dev-docs/
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See 3.1 Milestone.
What is new?
Deprecation and other API changes
#687 Remove deprecated functions:
- pythainlp.word_vector: doesnt_match, get_model, most_similar_cosmul, sentence_vectorizer, similarity. Use the WordVector class instead.
- pythainlp.util.delete_tone. Use pythainlp.util.remove_tonemark instead.
- pythainlp.util.time_time. Use pythainlp.util.time_to_thaiword instead.
- pythainlp.tokenize.syllable_tokenize. Use pythainlp.tokenize.subword_tokenize instead.
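One way to check a codebase for the removed names listed above is a small text scan. The sketch below is a hypothetical helper (find_removed_names is not part of PyThaiNLP). It assumes plain matching on the bare function names is good enough, and it leaves out the pythainlp.word_vector functions because names such as similarity are too generic to match reliably:

```python
import re

# Removed name -> suggested replacement, copied from the list above.
REMOVED = {
    "delete_tone": "pythainlp.util.remove_tonemark",
    "time_time": "pythainlp.util.time_to_thaiword",
    "syllable_tokenize": "pythainlp.tokenize.subword_tokenize",
}


def find_removed_names(source: str) -> list:
    """Return (line number, removed name, replacement) for each match."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in REMOVED.items():
            # \b stops longer identifiers that merely contain the name
            # (for example my_delete_tone) from being reported.
            if re.search(rf"\b{re.escape(old)}\b", line):
                hits.append((lineno, old, new))
    return hits


sample = "from pythainlp.util import delete_tone\nwords = syllable_tokenize(text)\n"
for lineno, old, new in find_removed_names(sample):
    print(f"line {lineno}: {old} was removed, use {new}")
```

A plain text scan also reports matches inside comments and strings, so each hit still needs a manual look before editing.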
Dependency Parsing
- PyThaiNLP now supports dependency parsing 🎉 Add pythainlp.parse.dependency_parsing #706
Named Entity Tagging
- #665 Add Thai-NNER (pythainlp.tag.NNER)
- #658 Add LST20NER ONNX model. This is the LST20NER model, fine-tuned from WangchanBERTa, converted to ONNX.
Transliteration
- #659 Add ISO 11940 transliteration
- #660 Add Thai W2P v0.2
- #686 Add wunsen
- #694 Wunsen Mandarin and Japanese update
PyThaiNLP Corpus downloader
- #656 Add support for zip/tar.gz archives when downloading corpora
Text normalization
- #673 Add a normalising rule for Lakkhangyao ๅ
Translate
- #674 Add GPU option
Text summarization
- #679 Add mt5 cpe kmutt thai sentence sum
Util
- #682 Add live-dead syllable classification
- #684 Add live dead syllable classify
- #690 Add tone detector
Soundex
- #699 Add Thai-English Cross-Language Transliterated Word Retrieval using Soundex Technique
Other
- #689 Map NG tag to PART
- #691 Remove TinyDB as a dependency
- #692 Fix notifications that newer versions of corpora are available
- Add warning about LST20 license
What's Changed
- Add more words from Royal Society by @wannaphong in #653
- Add support zip/tar.gz to download corpus by @wannaphong in #656
- Update from dev by @wannaphong in #657
- Add ISO 11940 transliteration by @wannaphong in #659
- Add Thai W2P v0.2 and PyThaiNLP v3.0.6dev0 by @wannaphong in #660
- Add LST20NER onnx model by @wannaphong in #658
- Add Thai-NNER by @wannaphong in #665
- Update dev base from 3.0 base by @wannaphong in #668
- PyThaiNLP 3.0.7 by @wannaphong in #670
- Update dev branche from pythainlp-3.0 branche by @wannaphong in #672
- Normalise Lakkhangyao by @chameleonTK in #673
- add gpu option by @vikimark in #674
- Bump tensorflow from 2.5.3 to 2.6.4 by @dependabot in #677
- Bump tensorflow from 2.6.4 to 2.7.2 by @dependabot in #678
- Add mt5 cpe kmutt thai sentence sum by @wannaphong in #679
- Add live-dead syllable classification by @wannaphong in #682
- Fixed CI Bug by @wannaphong in #683
- Add live dead syllable classify by @wannaphong in #684
- Add wunsen by @wannaphong in #686
- Add ThaiSum sentence segmentor by @chameleonTK in #688
- map NG tag to PART by @chameleonTK in #689
- Add tone detector by @wannaphong in #690
- Remove deprecated function by @wannaphong in #687
- Remove TinyDB as a dependency by @BLKSerene in #691
- Fix notifications that newer versions of corpora are available by @BLKSerene in #692
- Start PyThaiNLP v3.1.0-dev0 by @wannaphong in #693
- Wunsen Mandarin and Japanese update by @cakimpei in #694
- Add Thai-English Cross-Language Transliterated Word Retrieval using Soundex Technique by @wannaphong in #699
- Fixed #700 by @wannaphong in #701
- Update add-word_detokenize from dev by @wannaphong in #703
- Add word_detokenize by @wannaphong in #697
- Move model by @wannaphong in #705
- Add pythainlp.parse.dependency_parsing by @wannaphong in #706
New Contributors
- @chameleonTK made their first contribution in #673
- @vikimark made their first contribution in #674
- @BLKSerene made their first contribution in #691
- @cakimpei made their first contribution in #694
Full Changelog: v3.0.9...v3.1.0-beta0
All Contributors
Thanks to all the contributors. (Image made with contributors-img)
We build Thai NLP.
PyThaiNLP