[email protected] | [email protected] | Personal Website | GitHub
Monash University
Starting May 2025
MISA JSC - misa.vn
January 2025 - May 2025
- Built top-ranked Vietnamese LLMs (VMLU #1 as of March 25, 2025; government-organized national top 5)
- Built automated data generation and evaluation tools, local LLMs, and domain-expert models (Accounting, Financial Analysis)
- Focus: LLM fine-tuning and alignment, safety alignment, RAG, agent and multi-agent chatbots.
MISA JSC - misa.vn
April 2024 - December 2024
- Developed a legal-document Q&A chatbot (search accuracy >95%, answer accuracy >90% on 1000+ expert-curated real-world questions over 1000+ documents, each 3-300 pages long)
- Focus: Rasa, databases, chunking, RAG, Text2SQL, LLMs.
VinAI Research - vinai.io
March 2022 - April 2024
- Supervisor: Asst. Prof. Thien Huu Nguyen
- Research Topic: Weakly supervised learning, Information Extraction
- SharpSeq: Empowering Continual Event Detection through Sharpness-Aware Sequential-task Learning
Thanh-Thien Le, Viet Dao*, Linh Văn Nguyen*, Thi-Nhung Nguyen, Linh Van Ngo, Thien Huu Nguyen*
2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
- BKEE: Pioneering Event Extraction in the Vietnamese Language
Thi-Nhung Nguyen, Bang Tran, Trong-Nghia Luu, Kiem-Hieu Nguyen and Thien Huu Nguyen
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- A Self-enhancement Multitask for Unsupervised Aspect Category Detection
Thi-Nhung Nguyen, Hoang Ngo, Kiem-Hieu Nguyen, Tuan-Dung Cao
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023
- An Uncertainty-aware encoder for Aspect Detection
Thi-Nhung Nguyen, Kiem-Hieu Nguyen, Young-In Song, Tuan-Dung Cao
Findings of the Association for Computational Linguistics: EMNLP 2021
VinAI Research
March - August 2023
- Supervisor: Dr. Dat Quoc Nguyen
- Research Topics: Training LLMs from scratch, Crawling, Ranking models
- Programming Languages: Python, SQL
- Libraries & Frameworks: PyTorch, Rasa, FastAPI, LangGraph, ...
- AI & ML Technologies: LLMs, RAG, Embedding Models, Text2SQL, LLM Fine-tuning and Alignment, Reinforcement Learning
- Databases & Search: MongoDB, SQLite, Elasticsearch, Qdrant, Hybrid Search
- Cloud Technologies: AWS, Azure
- DevOps & CI/CD: Docker, Jenkins
- Developer Tools: Git, GitHub, GitLab, Jira, VS Code
- Leadership & Management: Agile, Scrum, Team Mentorship, Project Management, Stakeholder Communication