This project demonstrates how to create a next word prediction model using an LSTM (Long Short-Term Memory) neural network. The model is trained on a given corpus of text data and predicts the next word in a sequence. The project uses TensorFlow and Keras for building the model, and Streamlit for creating a simple web interface to interact with the model.
- Clone the repository:

  ```
  git clone https://github.com/Sahilkumar19/NEXT-WORD-PREDICTOR.git
  cd NEXT-WORD-PREDICTOR
  ```

- Create a virtual environment and activate it:

  ```
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```
- Install the required dependencies:

  ```
  pip install -r requirements.txt
  ```
- Use `hamlet.txt` to train the model; I downloaded it from NLTK itself.
- Run the training script:

  ```
  python experiments.py
  ```

  This script will preprocess the data, train the LSTM model, and save the trained model (`next_word_lstm.h5`) and tokenizer (`tokenizer.pickle`) to disk.
- Ensure the model and tokenizer files (`next_word_lstm.h5` and `tokenizer.pickle`) are in the root directory.
Run the Streamlit app:
streamlit run app.py
This will start a web server and open a web interface where you can input a sequence of words and get the predicted next word.
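Under the hood, the app maps the model's softmax output back to a word by looking up the most probable vocabulary index. A minimal sketch of that final lookup step (the function name and the toy `index_word` mapping below are illustrative, not the project's actual code):

```python
# Minimal sketch of the final prediction step: the trained model outputs a
# probability distribution over the vocabulary, and the app maps the most
# probable index back to a word via the tokenizer's index_word mapping.
# The function name and the toy index_word dict are illustrative.

def next_word(probs, index_word):
    """Return the vocabulary word with the highest predicted probability."""
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return index_word.get(best_index, "<unknown>")

# Toy vocabulary standing in for tokenizer.index_word
# (index 0 is reserved for padding, so it has no word).
index_word = {1: "to", 2: "be", 3: "or", 4: "not"}
probs = [0.05, 0.10, 0.70, 0.10, 0.05]  # index 2 ("be") is most probable

print(next_word(probs, index_word))  # → be
```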
The model training process involves the following steps:
- Data Preprocessing: The text data is tokenized and converted into sequences of word indices. These sequences are then padded to ensure uniform length.
- Model Definition: An LSTM model is defined using Keras, consisting of embedding, LSTM, dropout, and dense layers.
- Model Training: The model is trained on the preprocessed data using categorical cross-entropy loss and the Adam optimizer.
- Model Saving: The trained model and tokenizer are saved to disk for later use in prediction.
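The steps above can be sketched end to end on a toy corpus. This is a hedged illustration, not the project's actual script: the corpus, layer sizes, and epoch count are placeholders, and the word indexing is done by hand here where the project uses Keras' `Tokenizer` and `pad_sequences`:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense

# 1. Preprocessing: index words and build padded n-gram sequences by hand
# (the project itself uses Keras' Tokenizer and pad_sequences for this).
corpus = ["to be or not to be", "that is the question"]
word_index = {}
for line in corpus:
    for word in line.split():
        word_index.setdefault(word, len(word_index) + 1)  # 0 is reserved for padding
vocab_size = len(word_index) + 1

sequences = []
for line in corpus:
    tokens = [word_index[w] for w in line.split()]
    for i in range(1, len(tokens)):
        sequences.append(tokens[: i + 1])  # each prefix predicts its last word

max_len = max(len(s) for s in sequences)
padded = np.array([[0] * (max_len - len(s)) + s for s in sequences])  # pre-padding
X = padded[:, :-1]
y = np.eye(vocab_size)[padded[:, -1]]  # one-hot targets for categorical cross-entropy

# 2. Model definition: embedding -> LSTM -> dropout -> dense softmax.
model = Sequential([
    Embedding(vocab_size, 16),
    LSTM(32),
    Dropout(0.2),
    Dense(vocab_size, activation="softmax"),
])

# 3. Training: categorical cross-entropy loss with the Adam optimizer.
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=1, verbose=0)

# 4. Saving (as in the project): model.save("next_word_lstm.h5") plus pickling
# the tokenizer; omitted here to keep the sketch side-effect free.

probs = model.predict(X[:1], verbose=0)  # one probability per vocabulary word
```

A single epoch on two sentences will not produce useful predictions; the sketch only shows the shape of the pipeline that `experiments.py` runs on the full corpus.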
The Streamlit app provides a simple web interface for interacting with the trained model. Users can input a sequence of words, and the app will display the predicted next word based on the input.
Contributions are welcome! Please fork the repository and submit a pull request to contribute to this project.
This project is licensed under the MIT License. See the LICENSE file for details.