A Document Similarity & Text Comparison Tool
TextSure is a web-based document similarity checker that compares two documents and determines how similar they are using Natural Language Processing (NLP) techniques. It supports sentence-level similarity detection and provides a clean, user-friendly interface.
The project is built using Flask (Python) for the backend and HTML/CSS/JavaScript for the frontend, and is deployed on Render.
- Upload and compare two documents
- Overall similarity percentage using TF-IDF & cosine similarity
- Sentence-level similarity detection
- Simple and responsive UI
- Deployed on the cloud (Render)
- Secure file handling (temporary storage & cleanup)
- Python
- Flask
- Flask-CORS
- scikit-learn (TF-IDF, cosine similarity)
- pdfplumber
- python-docx
- BeautifulSoup
- HTML5
- CSS3
- Vanilla JavaScript (Fetch API)
- Render (Free Tier)
- Gunicorn
- User uploads two documents
- Text is extracted from each file
- Text is normalized and cleaned
- TF-IDF vectors are generated
- Cosine similarity is calculated
- If similarity is high, similar sentences are highlighted
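
The steps above can be sketched in a few lines with scikit-learn. This is a minimal illustration, not the project's actual code: the function names (`normalize`, `split_sentences`, `overall_similarity`, `similar_sentences`), the regex-based sentence splitter, and the `0.8` threshold are all assumptions made for the example.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace (hypothetical cleanup step)."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def split_sentences(text: str) -> list:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def overall_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity of the two documents' TF-IDF vectors, as a percentage.

    Note: TfidfVectorizer raises ValueError if neither document contains
    a usable token (for example, two empty strings).
    """
    matrix = TfidfVectorizer().fit_transform([normalize(doc_a), normalize(doc_b)])
    score = cosine_similarity(matrix[0:1], matrix[1:2])[0, 0]
    return float(score) * 100


def similar_sentences(doc_a: str, doc_b: str, threshold: float = 0.8) -> list:
    """Pair each sentence of doc_a with its closest sentence in doc_b.

    Only pairs whose cosine similarity reaches `threshold` (an assumed
    cut-off, not a value taken from the project) are returned.
    """
    sents_a, sents_b = split_sentences(doc_a), split_sentences(doc_b)
    if not sents_a or not sents_b:
        return []
    norm_a = [normalize(s) for s in sents_a]
    norm_b = [normalize(s) for s in sents_b]
    # Fit one vocabulary over all sentences so both sides share dimensions.
    vectorizer = TfidfVectorizer().fit(norm_a + norm_b)
    scores = cosine_similarity(vectorizer.transform(norm_a), vectorizer.transform(norm_b))
    pairs = []
    for i, row in enumerate(scores):
        j = int(row.argmax())  # index of the best match in doc_b
        if row[j] >= threshold:
            pairs.append((sents_a[i], sents_b[j], float(row[j])))
    return pairs
```

Calling `overall_similarity(text_a, text_b)` gives the document-level percentage, and `similar_sentences(text_a, text_b)` returns the sentence pairs that a UI could then highlight.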
| File Type | Status |
|---|---|
| .txt | Supported |
| .docx | Supported |
| .pdf | Supported (small files) |
| Images (.png, .jpg) | Disabled on cloud |
⚠️ Image OCR is disabled on the deployed version due to cloud infrastructure limitations.
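
One way to handle the supported file types is to dispatch on the file extension. The sketch below assumes a hypothetical `extract_text` helper, which is not taken from the project's source. It uses pdfplumber and python-docx as listed in the tech stack and imports them lazily, so plain-text files work even when those packages are missing.

```python
from pathlib import Path


def extract_text(path: str) -> str:
    """Return the plain text of a .txt, .docx, or .pdf file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()

    if suffix == ".txt":
        # Ignore undecodable bytes rather than failing on odd encodings.
        return Path(path).read_text(encoding="utf-8", errors="ignore")

    if suffix == ".docx":
        import docx  # provided by the python-docx package

        return "\n".join(p.text for p in docx.Document(path).paragraphs)

    if suffix == ".pdf":
        import pdfplumber

        with pdfplumber.open(path) as pdf:
            # extract_text() can return None for pages without a text layer.
            return "\n".join(page.extract_text() or "" for page in pdf.pages)

    # Images and anything else are rejected, matching the disabled OCR path.
    raise ValueError(f"Unsupported file type: {suffix or 'none'}")
```

Raising `ValueError` for unknown extensions lets the caller turn the failure into a clear error response instead of processing an unsupported upload.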
🔗 Live URL: https://textsure.onrender.com
Note: On Render’s free tier, large PDF files may fail due to CPU and timeout limits. This does not affect the core logic of the application.
git clone https://github.com/Vrinda2403/TextSure.git
cd TextSure
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
python app.py