This project integrates ElasticSearch with the k-NN plugin for image similarity search, covering over 1 million images from the Open Images Dataset. The VGG16 model extracts a feature vector from each image, and images are retrieved by feature similarity. Tags add fuzzy text search, and dimensionality reduction (PCA) shrinks the feature vectors to improve search performance.
Here is the directory structure of the project:
Image-Search-Engine/
│
├── flask-app/
│   ├── Dockerfile.flask              # Dockerfile for Flask app configuration
│   ├── Feature_ExtractorManager.py   # Script for feature extraction from images using VGG16
│   ├── ElasticManager.py             # Script for loading image features into ElasticSearch
│   ├── requirements.txt              # List of dependencies for Flask app
│   └── app.py                        # Flask app
│
├── elasticsearch/
│   └── Dockerfile.elasticsearch      # Dockerfile for ElasticSearch with k-NN plugin
│
└── docker-compose.yml                # Docker Compose file to orchestrate containers
- flask-app/: Contains the Flask app code for handling image feature extraction and interaction with ElasticSearch.
- elasticsearch/: Contains the configuration for ElasticSearch, including the k-NN plugin setup.
- docker-compose.yml: Defines and manages the services for both the Flask app and ElasticSearch.
- Flask: A lightweight Python web framework used to create the backend for this project. It handles image processing, feature extraction, and managing communication with the ElasticSearch service.
- JavaScript, CSS, HTML: Used for building the front-end interface. JavaScript handles the dynamic interaction with the backend, while HTML and CSS are used for structuring and styling the web page.
- ElasticSearch: A distributed search engine used for storing image feature data and performing k-NN (k-Nearest Neighbor) searches. ElasticSearch is enhanced with the k-NN plugin to support similarity searches based on the VGG16 image features.
You can easily set up both ElasticSearch and the Flask app using Docker Compose. Follow the steps below to get started.
git clone https://github.com/yusufM03/Image-Search-Engine.git
cd Image-Search-Engine
Docker Compose will automatically build the necessary images for both ElasticSearch and the Flask app from their respective Dockerfiles.
Run the following command to build and start the services:
docker-compose up --build
This will start both the ElasticSearch container (with the k-NN plugin) and the Flask app container. ElasticSearch will be accessible on http://localhost:9200, and the Flask app will run on http://localhost:5000.
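ElasticSearch can take a while to accept connections after its container starts. As a minimal sketch (not part of the repository; `wait_for_service` is a hypothetical helper), the following standard-library script polls a URL until it responds, which can be handy before running the later steps:

```python
import http.client
import time
import urllib.error
import urllib.request

# Only local services are checked, so bypass any proxy set in the environment.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_service(url, timeout=60.0, interval=2.0):
    """Poll `url` until it answers over HTTP or `timeout` seconds elapse.

    Hypothetical helper: returns True once the service responds, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with _OPENER.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_service("http://localhost:9200")` and `wait_for_service("http://localhost:5000")` check the two ports mentioned above.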
To stop the services, use the following command:
docker-compose down
- ElasticSearch Dockerfile: Custom settings for ElasticSearch and the k-NN plugin are specified in Dockerfile.elasticsearch.
- Flask App Dockerfile: The Flask app, including its dependencies, is configured in Dockerfile.flask.
- Use the VGG16 model to extract image features, which represent high-level patterns and semantics that allow for similarity-based retrieval.
Open a new terminal and run:
cd flask-app
python Feature_ExtractorManager.py
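For orientation, here is a rough sketch of what VGG16 feature extraction can look like with Keras. It is not the contents of Feature_ExtractorManager.py: the function names, the choice of global average pooling (which yields 512-dimensional vectors), and the L2 normalisation are all assumptions made for illustration. TensorFlow and NumPy are imported lazily, so they are only needed when features are actually computed.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set of accepted formats


def list_image_paths(image_dir):
    """Return image files under `image_dir`, sorted for a reproducible order."""
    return sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def build_extractor():
    """Load VGG16 without its classifier head (assumption: avg pooling, 512-d output)."""
    # Lazy import: TensorFlow is heavy and only required for this step.
    from tensorflow.keras.applications.vgg16 import VGG16
    return VGG16(weights="imagenet", include_top=False, pooling="avg")


def extract_features(model, image_path):
    """Return an L2-normalised feature vector (list of floats) for one image."""
    import numpy as np
    from tensorflow.keras.applications.vgg16 import preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    img = load_img(str(image_path), target_size=(224, 224))  # VGG16 input size
    batch = preprocess_input(np.expand_dims(img_to_array(img), axis=0))
    vector = model.predict(batch, verbose=0)[0]
    norm = np.linalg.norm(vector)
    return (vector / norm if norm > 0 else vector).tolist()
```

A typical loop would call `build_extractor()` once and then `extract_features(model, path)` for each path from `list_image_paths(...)`.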
- Define the index mapping to store image feature vectors.
- Load the image features extracted from VGG16 into ElasticSearch by running:
cd flask-app
python ElasticManager.py
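The two steps above can be sketched as follows. This is an illustration, not the code in ElasticManager.py: it assumes the Open Distro / OpenSearch style k-NN plugin (the `knn_vector` field type and the `index.knn` setting), and the index name, field names, and helper functions are hypothetical. The dimension must match the extractor's output (for example 512 for VGG16 with global average pooling, or 4096 for its fc2 layer).

```python
import json

INDEX_NAME = "images"            # hypothetical index name
VECTOR_FIELD = "feature_vector"  # hypothetical field name


def build_index_body(dimension):
    """Index settings and mapping for storing one k-NN vector per image."""
    return {
        "settings": {"index": {"knn": True}},  # enable k-NN on this index
        "mappings": {
            "properties": {
                VECTOR_FIELD: {"type": "knn_vector", "dimension": dimension},
                "path": {"type": "keyword"},  # assumed: image path stored as metadata
            }
        },
    }


def build_bulk_payload(records, index_name=INDEX_NAME):
    """Build a newline-delimited body for the _bulk API from (path, vector) pairs."""
    lines = []
    for doc_id, (path, vector) in enumerate(records):
        # Each document needs an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps({"path": path, VECTOR_FIELD: vector}))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline
```

The index body would be sent with `PUT /images`, and the payload with `POST /_bulk` using the `Content-Type: application/x-ndjson` header.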
- Approximate k-NN: Use ElasticSearch's k-NN plugin to efficiently perform approximate k-NN (k-Nearest Neighbors) search for retrieving similar images based on extracted features.
- Dimensionality Reduction using PCA: Reduce the dimensionality of feature vectors using Principal Component Analysis (PCA) for improved search performance.
- Adding Tags for Fuzzy Search: Add tags to image data to support fuzzy search, narrowing down search results based on specific filters.
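As a rough illustration of the first and last items, the request bodies for an approximate k-NN search and a fuzzy tag search might be built like this. The `knn` query shape assumes the Open Distro / OpenSearch style k-NN plugin, and the field names `feature_vector` and `tags` are hypothetical; the project's own queries may differ.

```python
def build_knn_query(vector, k=10, field="feature_vector"):
    """Request body for an approximate k-NN search on a knn_vector field."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(vector), "k": k}}},
    }


def build_fuzzy_tag_query(text, size=10, field="tags"):
    """Request body for a typo-tolerant match on a text field holding tags."""
    return {
        "size": size,
        "query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}},
    }
```

Either body can be posted to the index's `_search` endpoint. With `"fuzziness": "AUTO"`, a misspelled tag such as "dgo" can still match "dog".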
This project is licensed under the MIT License - see the LICENSE file for details.