Skip to content

Commit 76fbdb0

Browse files
committed
Update docs
1 parent 6aa7a0c commit 76fbdb0

File tree

3 files changed

+34
-23
lines changed

3 files changed

+34
-23
lines changed

_pages/about.md

+28-2
Original file line numberDiff line numberDiff line change
@@ -4,8 +4,34 @@ title: About PyThaiNLP
44
permalink: /about/
55
---
66

7-
PyThaiNLP Project is a Thai Natural Language Processing project. We build softwares and datasets for Thai language. Our Main Project is PyThaiNLP.
7+
PyThaiNLP Project is an open source community for natural Language Processing project in the Thai language. We build softwares and datasets for Thai language. Our Main Project is PyThaiNLP.
88

9-
PyThaiNLP is a Python package for text processing and linguistic analysis, similar to NLTK with focus on Thai language. PyThaiNLP started at 2017.
9+
## About Project
10+
11+
Our project are open source. We create softwares, models, and datasets for Thai language to public and are open source license.
12+
13+
**See all our project at [pythainlp.org/projects/](https://pythainlp.org/projects/)**
14+
15+
Hugging Face: [https://huggingface.co/pythainlp](https://huggingface.co/pythainlp)
16+
17+
18+
## About PyThaiNLP
19+
20+
PyThaiNLP is a Python package for text processing and linguistic analysis, similar to nltk, with focus on Thai language.
1021

1122
See: [Our Paper](https://aclanthology.org/2023.nlposs-1.4/)
23+
24+
## PyThaiNLP Features
25+
- Convenient character and word classes, like Thai consonants (pythainlp.thai_consonants), vowels (pythainlp.thai_vowels), digits (pythainlp.thai_digits), and stop words (pythainlp.corpus.thai_stopwords) -- comparable to constants like string.letters, string.digits, and string.punctuation
26+
- Thai linguistic unit segmentation/tokenization, including sentence (sent_tokenize), word (word_tokenize), and subword segmentations based on Thai Character Cluster (subword_tokenize)
27+
- Thai part-of-speech tagging (pos_tag)
28+
- Thai spelling suggestion and correction (spell and correct)
29+
- Thai transliteration (transliterate)
30+
- Thai soundex (soundex) with three engines (lk82, udom83, metasound)
31+
- Thai collation (sort by dictionary order) (collate)
32+
- Read out number to Thai words (bahttext, num_to_thaiword)
33+
- Thai datetime formatting (thai_strftime)
34+
- Thai-English keyboard misswitched fix (eng_to_thai, thai_to_eng)
35+
- Command-line interface for basic functions, like tokenization and pos tagging (run thainlp in your shell)
36+
37+
Please see [our tutorials](https://pythainlp.org/tutorials) on how to apply these functions to machine-learning problems.

_pages/projects.md

+3
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,9 @@ permalink: /projects/
77

88
All 25 projects
99

10+
Hugging Face: [https://huggingface.co/pythainlp](https://huggingface.co/pythainlp)
11+
12+
1013
## AttaCut
1114
[![License: MIT](https://img.shields.io/badge/License-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)
1215
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/PyThaiNLP/attacut/graphs/commit-activity)

index.md

+3-21
Original file line numberDiff line numberDiff line change
@@ -4,32 +4,14 @@ layout: default
44

55
Welcome to The Official PyThaiNLP Project Website.
66

7-
The PyThaiNLP Project is a Thai Natural Language Processing project. We build softwares and datasets for Thai language. Our Main Project is PyThaiNLP.
7+
PyThaiNLP Project is an open source community for natural Language Processing project in the Thai language. We build softwares and datasets for Thai language. Our Main Project is PyThaiNLP that is a Python package for text processing and linguistic analysis on Thai language.
8+
9+
See more about the projec: [pythainlp.org/about](https://pythainlp.org/about)
810

911
**See all our project at [pythainlp.org/projects/](https://pythainlp.org/projects/)**
1012

1113
**สำหรับภาษาไทย คุณสามารถเยี่ยมชมเว็บภาษาไทยของ PyThaiNLP ได้ที่ [pythainlp.org/th/](https://pythainlp.org/th/)**
1214

13-
PyThaiNLP is a Python package for text processing and linguistic analysis, similar to nltk, with focus on Thai language.
14-
15-
## PyThaiNLP Features
16-
- Convenient character and word classes, like Thai consonants (pythainlp.thai_consonants), vowels (pythainlp.thai_vowels), digits (pythainlp.thai_digits), and stop words (pythainlp.corpus.thai_stopwords) -- comparable to constants like string.letters, string.digits, and string.punctuation
17-
- Thai linguistic unit segmentation/tokenization, including sentence (sent_tokenize), word (word_tokenize), and subword segmentations based on Thai Character Cluster (subword_tokenize)
18-
- Thai part-of-speech tagging (pos_tag)
19-
- Thai spelling suggestion and correction (spell and correct)
20-
- Thai transliteration (transliterate)
21-
- Thai soundex (soundex) with three engines (lk82, udom83, metasound)
22-
- Thai collation (sort by dictionary order) (collate)
23-
- Read out number to Thai words (bahttext, num_to_thaiword)
24-
- Thai datetime formatting (thai_strftime)
25-
- Thai-English keyboard misswitched fix (eng_to_thai, thai_to_eng)
26-
- Command-line interface for basic functions, like tokenization and pos tagging (run thainlp in your shell)
27-
28-
Please see [our tutorials](https://pythainlp.org/tutorials) on how to apply these functions to machine-learning problems.
29-
30-
## Who uses PyThaiNLP?
31-
32-
You can read at [INTHEWILD.md](https://github.com/PyThaiNLP/pythainlp/blob/dev/INTHEWILD.md).
3315

3416
## Development Lead
3517
- Wannaphong Phatthiyaphaibun - foundation, distribution and maintenance

0 commit comments

Comments
 (0)