🦛 CHONK your texts with Chonkie ✨ — The no-nonsense RAG chunking library
Updated Jun 2, 2025 - Python
DadmaTools is a Persian NLP toolkit developed by Dadmatech Co.
🦛 CHONK your texts with Chonkie ✨ Type-friendly, light-weight, fast and super-simple chunking library
A recursive text chunker that attempts to break the text on meaningful boundaries.
The primary purpose of this tool is to make it easier to input long YouTube transcripts into ChatGPT by splitting them into smaller chunks. Additionally, prompt-engineering techniques have been incorporated to improve the quality of the output.
A collection of useful small Python helpers
A sentence chunker PHP class + visualizer for Berkeley Parser parse trees
A Processing Toolbox for Persian Texts
Contains implementations of models such as BiLSTM-CRF and Hierarchical BiLSTM for POS tagging.
A lightweight and modular Retrieval-Augmented Generation (RAG) agent built with LangGraph and OpenAI. It supports document indexing with FAISS and structured tool use. Ideal for prototyping question-answering systems over custom documents.
A GUI for batch conversion with Chunker CLI
Lazily split an array into chunks, just like slices of pizza 🍕
This repository contains Natural Language Processing programs in the Python programming language.
A simple project for building custom training and model data for the Apache OpenNLP library. The main task is recognizing Ukrainian texts and building helpful questions and theses.
Script
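Several entries above describe recursive chunking, that is, breaking text on meaningful boundaries first and falling back to finer ones only when a piece is still too long. The following is a minimal sketch of that general idea, not the implementation of any listed project. The function name `recursive_chunk`, the `max_chars` limit, and the separator hierarchy (paragraph, line, sentence, word) are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical separator hierarchy, coarsest first:
# paragraphs, lines, sentences, words.
DEFAULT_SEPARATORS = ("\n\n", "\n", ". ", " ")


def recursive_chunk(
    text: str,
    max_chars: int = 200,
    separators: Sequence[str] = DEFAULT_SEPARATORS,
) -> List[str]:
    """Split text into chunks of at most max_chars characters,
    preferring the coarsest boundary that still fits."""
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No boundary left to try: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        # Greedily merge neighbouring pieces while they fit in one chunk.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(recursive_chunk(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = (
        "First paragraph here.\n\n"
        "Second paragraph is a bit longer. It has two sentences."
    )
    for chunk in recursive_chunk(sample, max_chars=40):
        print(repr(chunk))
```

In this sketch the separator at a chunk boundary is dropped, so a sentence split on `". "` loses its trailing period. A production chunker would typically keep the delimiter and measure length in tokens rather than characters.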