NLPTools

Comprehensive NLP toolkit with high-performance string distance and similarity algorithms

NLPTools provides a complete suite of natural language processing tools, including high-performance WebAssembly implementations and JavaScript-based algorithms for text similarity and distance calculations.

Packages

This is a monorepo that contains the following packages:

  • @nlptools/nlptools - Main package that exports all algorithms and utilities from the entire toolkit
  • @nlptools/distance - Complete distance algorithms package including both WebAssembly and JavaScript implementations
  • @nlptools/splitter - Text splitting utilities for document chunking and processing
  • @nlptools/tokenizer - Tokenization utilities for fast text encoding and decoding with HuggingFace models
  • @nlptools/distance-wasm - High-performance WebAssembly library with optimized Rust implementations

Quick Start

Installation

# Install the main package (includes all algorithms)
pnpm install @nlptools/nlptools

# Or install specific packages
pnpm install @nlptools/distance        # Complete distance algorithms
pnpm install @nlptools/splitter        # Text splitting utilities
pnpm install @nlptools/tokenizer       # Tokenization utilities
pnpm install @nlptools/distance-wasm   # WASM-optimized algorithms only

# Clone the repository for development
git clone https://github.com/DemoMacro/nlptools.git
cd nlptools
pnpm install

Basic Usage

// Using the main package (recommended)
import * as nlptools from "@nlptools/nlptools";

// Calculate Levenshtein distance
const distance = nlptools.levenshtein("kitten", "sitting");
console.log(`Distance: ${distance}`); // Output: 3

// Calculate Jaro similarity (normalized to the range 0-1)
const similarity = nlptools.jaro("hello", "hallo");
console.log(`Similarity: ${similarity}`); // Output: 0.8666666666666667

// Use the universal compare function
const result = nlptools.compare("apple", "apply", "levenshtein");
console.log(`Result: ${result}`); // Output: 0.2
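
To make the numbers above easier to interpret, here is a minimal TypeScript sketch of the two underlying metrics. It is not the library's implementation: the function names `levenshteinSketch`, `normalizedDistanceSketch`, and `jaroSketch` are hypothetical, and the normalization (edit distance divided by the length of the longer string) is only an assumption that happens to match the `0.2` shown for `compare`.

```typescript
// Illustrative sketch only; these helpers are hypothetical and are not part of @nlptools/nlptools.
// Strings are compared by UTF-16 code unit, which is enough for ASCII examples.

// Levenshtein distance: minimum number of single-character insertions,
// deletions, and substitutions needed to turn `a` into `b`.
function levenshteinSketch(a: string, b: string): number {
  // prev[j] = distance between the first i-1 characters of `a` and the first j of `b`
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i]; // turning i characters into "" costs i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete from a
        curr[j - 1] + 1, // insert into a
        prev[j - 1] + cost, // substitute, or keep when the characters match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// ASSUMPTION: normalize by the longer string's length (0 = identical).
function normalizedDistanceSketch(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 0 : levenshteinSketch(a, b) / longest;
}

// Jaro similarity: based on characters that match within a sliding window
// and on how many of those matches appear in a different order.
function jaroSketch(a: string, b: string): number {
  if (a.length === 0 && b.length === 0) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  const window = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched: boolean[] = new Array(a.length).fill(false);
  const bMatched: boolean[] = new Array(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - window);
    const hi = Math.min(b.length - 1, i + window);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  // Walk the matched characters of both strings in order and count mismatches.
  let outOfOrder = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const t = outOfOrder / 2; // each transposition is counted from both sides
  return (matches / a.length + matches / b.length + (matches - t) / matches) / 3;
}

console.log(levenshteinSketch("kitten", "sitting")); // 3
console.log(normalizedDistanceSketch("apple", "apply")); // 0.2
console.log(jaroSketch("hello", "hallo")); // approximately 0.867
```

For large inputs, prefer the packaged implementations; the sketch favors readability over speed.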

Contributing

We welcome contributions! Here's how to get started:

Quick Setup

  1. Fork the repository on GitHub

  2. Clone your fork:

    git clone https://github.com/YOUR_USERNAME/nlptools.git
    cd nlptools

  3. Add upstream remote:

    git remote add upstream https://github.com/DemoMacro/nlptools.git

  4. Install dependencies:

    pnpm install

  5. Development mode:

    pnpm dev

Development Workflow

  1. Code: Follow our project standards
  2. Test: pnpm build
  3. Commit: Use conventional commits (feat:, fix:, etc.)
  4. Push: Push to your fork
  5. Submit: Open a pull request against the upstream repository

License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❤️ by Demo Macro
