This repository contains an end-to-end text summarization project that uses NLP models to summarize long-form text. The codebase is modular and packaged for deployment with Docker.
- Features
- Folder Structure
- Installation
- Usage
- Configuration
- Docker Support
- AWS CICD Deployment with GitHub Actions
- License
- Customizable text summarization: Supports various text lengths and summarization parameters.
- End-to-end pipeline: From preprocessing to generating summaries.
- Dockerized Deployment: Easy deployment using Docker.
- Configurable Parameters: Flexible settings via a YAML configuration file.
- Modular Codebase: Easily extendable for further development.
.
├── .github/workflows # GitHub workflows for CI/CD
├── config # Configuration files
├── research # Research and exploration notebooks
├── src/textSummarizer # Core modules for text summarization
├── .gitignore # Git ignore rules
├── Dockerfile # Dockerfile for containerization
├── LICENSE # License file
├── README.md # Project documentation
├── app.py # Application entry point
├── main.py # Main pipeline script
├── params.yaml # Parameter configuration file
├── requirements.txt # Python dependencies
├── setup.py # Setup script for packaging
└── template.py # Template script for new modules
- Python 3.8+
- pip
- Docker (optional)
- Clone the repository:
  git clone https://github.com/FAbdullah17/End-to-End-Text-Summarizer.git
  cd End-to-End-Text-Summarizer
- Install dependencies:
  pip install -r requirements.txt
- (Optional) Install the package:
  python setup.py install
Run the pipeline using the following command:
python main.py
Start the application:
python app.py
Access the app at http://127.0.0.1:5000 in your web browser.
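When the application is started from a script or a container, it can help to wait until it actually answers before opening the browser. The following is a minimal sketch using only the Python standard library; `wait_for_app` is a hypothetical helper, not part of this repository, and it assumes only that the app listens on the address given above.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://127.0.0.1:5000", timeout=30.0, interval=1.0):
    """Poll `url` until the server answers or `timeout` seconds elapse.

    Returns True as soon as any HTTP response arrives, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server responded, even though this path returned an error status.
            return True
        except OSError:
            # Connection refused, reset, or timed out: the app is not ready yet.
            time.sleep(interval)
    return False
```

Calling `wait_for_app()` after `python app.py` has been launched in another process returns `True` once the server accepts requests.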
Update the params.yaml file to configure the model and other settings:
text_length: 1000
summary_length: 150
model_name: bert-based-summarizer
Build the Docker image and run the container:
docker build -t text-summarizer .
docker run -p 5000:5000 text-summarizer
#with specific access
1. EC2 access: EC2 is a virtual machine
2. ECR: Elastic Container Registry, used to store your Docker image in AWS
#Description: About the deployment
1. Build a Docker image of the source code
2. Push the Docker image to ECR
3. Launch your EC2 instance
4. Pull the image from ECR onto the EC2 instance
5. Launch the Docker image on the EC2 instance
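The deployment steps can be sketched as the shell commands they roughly correspond to. The helper below only builds the command strings and does not run anything. `deployment_commands` is a hypothetical name, the account ID `123456789012` is a placeholder, and the `latest` tag and the `5000:5000` port mapping are assumptions carried over from the local Docker example. Step 3 (launching the EC2 instance) is done in the AWS console and has no command here.

```python
import shlex


def deployment_commands(ecr_uri, repository, region,
                        tag="latest", local_image="text-summarizer"):
    """Return the commands for steps 1, 2, 4 and 5 as a dict of strings."""
    remote = f"{ecr_uri}/{repository}:{tag}"
    return {
        # Step 1: build the image from the Dockerfile in the current directory.
        "build": shlex.join(["docker", "build", "-t", local_image, "."]),
        # Authenticate Docker against the ECR registry (needed before push and pull).
        "login": (
            f"aws ecr get-login-password --region {shlex.quote(region)} | "
            f"docker login --username AWS --password-stdin {shlex.quote(ecr_uri)}"
        ),
        # Step 2: tag the local image with the registry path and push it.
        "tag": shlex.join(["docker", "tag", local_image, remote]),
        "push": shlex.join(["docker", "push", remote]),
        # Steps 4 and 5: on the EC2 instance, pull the image and start it.
        "pull": shlex.join(["docker", "pull", remote]),
        "run": shlex.join(["docker", "run", "-d", "-p", "5000:5000", remote]),
    }


# Example with placeholder values; substitute your own registry URI and repository.
commands = deployment_commands(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com", "text-summarizer", "us-east-1"
)
for step, command in commands.items():
    print(f"{step}: {command}")
```

In the CI/CD setup these commands would normally live in the GitHub Actions workflow rather than be typed by hand; the sketch is only meant to make the order of operations concrete.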
#Policy:
1. AmazonEC2ContainerRegistryFullAccess
2. AmazonEC2FullAccess
- Save the URI: 566373416292.dkr.ecr.us-east-1.amazonaws.com/text-s
#optional
sudo apt-get update -y
sudo apt-get upgrade
#required
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
Settings > Actions > Runners > New self-hosted runner > choose the OS > then run the commands one by one
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION = us-east-1
AWS_ECR_LOGIN_URI = 566373416292.dkr.ecr.ap-south-1.amazonaws.com (demo value)
ECR_REPOSITORY_NAME = simple-app
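A common source of failed deployments is a missing value or a registry URI that points at a different region than `AWS_REGION`. The sketch below checks both before a deployment is attempted. `check_secrets` is a hypothetical helper, not part of this repository, and the values in the example are placeholders; the only fact it relies on is that ECR registry hosts have the form `<account>.dkr.ecr.<region>.amazonaws.com`.

```python
import os

REQUIRED_SECRETS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_ECR_LOGIN_URI",
    "ECR_REPOSITORY_NAME",
)


def check_secrets(env):
    """Return a list of problems found in `env`; an empty list means none were found."""
    problems = [
        f"{name} is not set"
        for name in REQUIRED_SECRETS
        if not env.get(name, "").strip()
    ]
    region = env.get("AWS_REGION", "").strip()
    uri = env.get("AWS_ECR_LOGIN_URI", "").strip()
    # The region is embedded in the registry host name, so the two must agree.
    if region and uri and f".dkr.ecr.{region}." not in uri:
        problems.append(f"AWS_ECR_LOGIN_URI does not point at region {region}")
    return problems


# Example with placeholder values; in a workflow you would pass os.environ instead.
example = {
    "AWS_ACCESS_KEY_ID": "placeholder-key-id",
    "AWS_SECRET_ACCESS_KEY": "placeholder-secret",
    "AWS_REGION": "us-east-1",
    "AWS_ECR_LOGIN_URI": "123456789012.dkr.ecr.ap-south-1.amazonaws.com",
    "ECR_REPOSITORY_NAME": "simple-app",
}
for problem in check_secrets(example):
    print(problem)
```

With the placeholder values above the check reports the region mismatch, which is the kind of inconsistency worth catching before the workflow runs.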
This project is licensed under the MIT License. See the LICENSE file for details.