Pravah (प्रवाह) - Sanskrit word meaning "flow" or "current"
A beginner-friendly machine learning project for learning data science fundamentals through flood severity prediction.
Pravah is a comprehensive data science learning project that combines data exploration, machine learning, and web application development. This project focuses on building a binary classification model to predict flood severity using geographical and meteorological features.
- Predicts flood severity (Severe vs Non-Severe) based on environmental factors
- Interactive web application built with Streamlit for real-time predictions
- Complete ML pipeline from data exploration to model deployment
- Beginner-friendly with detailed explanations and learning opportunities
- Interactive Data Exploration - Visualize and understand flood patterns
- Machine Learning Models - Random Forest and Logistic Regression
- Real-time Predictions - Input features and get instant flood risk assessment
- Rich Visualizations - Correlation heatmaps, feature importance, ROC curves
- Professional UI - Clean, responsive Streamlit interface
pravah_flood_detection/
├── app.py                              # Streamlit web application (COMPLETE)
├── flood_data_exploration.ipynb        # Jupyter notebook for data analysis
├── flood_dataset_classification.csv    # Flood dataset
├── requirements.txt                    # Python dependencies
├── README.md                           # Project documentation
├── LICENSE                             # MIT License
├── .gitignore                          # Git ignore rules
└── venv/                               # Virtual environment
- Clone and Setup

      git clone https://github.com/Rajath-Raj/pravah_flood_detection.git
      cd pravah_flood_detection
      python -m venv venv
      .\venv\Scripts\activate      # Windows
      source venv/bin/activate     # macOS/Linux

- Install Dependencies

      pip install -r requirements.txt

- Launch the App

      streamlit run app.py

Open your browser to: http://localhost:8501
- View dataset overview and key statistics
- Quick data preview and project status
- Understanding the flood classification problem
- Dataset information (shape, data types, memory usage)
- Statistical summaries and missing value analysis
- Target variable distribution with interactive charts
- Correlation Heatmap: See how features relate to each other
- Feature Distributions: Understand data patterns
- Scatter Plots: Explore relationships between variables
- Adjustable Parameters: Test size, number of trees, random state
- Real-time Training: Watch your model learn with progress indicators
- Performance Metrics: Accuracy, confusion matrix, classification report
- Feature Importance: See which factors matter most for flood prediction
- Interactive Form: Input latitude, longitude, elevation, slope, distance
- Instant Results: Get flood risk assessment with confidence scores
- Visual Feedback: See prediction probabilities with interactive charts
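
The training and prediction steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the data is synthetic, the value ranges and the toy labelling rule are assumptions, and only the three adjustable parameters (test size, number of trees, random state) and the five form inputs are taken from the lists above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the real dataset's columns and ranges may differ.
rng = np.random.default_rng(42)
n_samples = 500
feature_names = ["latitude", "longitude", "elevation", "slope", "distance"]
X = np.column_stack([
    rng.uniform(-60.0, 60.0, n_samples),    # latitude (arbitrary range)
    rng.uniform(-180.0, 180.0, n_samples),  # longitude (arbitrary range)
    rng.uniform(0.0, 500.0, n_samples),     # elevation (arbitrary range)
    rng.uniform(0.0, 30.0, n_samples),      # slope (arbitrary range)
    rng.uniform(0.0, 50.0, n_samples),      # distance (arbitrary range)
])
# Toy labelling rule (an assumption): low elevation and short distance -> Severe (1).
y = ((X[:, 2] < 150.0) & (X[:, 4] < 25.0)).astype(int)

# The three adjustable parameters named in the list above.
test_size = 0.2
n_trees = 100
random_state = 42

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=test_size, random_state=random_state, stratify=y
)

model = RandomForestClassifier(n_estimators=n_trees, random_state=random_state)
model.fit(X_train, y_train)

# Performance metrics and feature importance on the held-out split.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
importances = dict(zip(feature_names, model.feature_importances_))

# Score one hypothetical form input and report the class probabilities.
sample = np.array([[12.9, 77.6, 40.0, 2.0, 5.0]])
proba = model.predict_proba(sample)[0]  # order follows model.classes_, here [0, 1]
label = "Severe" if proba[1] >= 0.5 else "Non-Severe"

print(f"accuracy={accuracy:.3f}")
print(f"confusion matrix:\n{cm}")
print(f"prediction={label}, P(Non-Severe)={proba[0]:.2f}, P(Severe)={proba[1]:.2f}")
```

The probabilities returned by `predict_proba` are what a "confidence score" would typically be built from in an app like this one.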
- ✅ Full Streamlit application with 5 interactive pages
- ✅ Professional UI with custom styling and responsive design
- ✅ Real-time model training and prediction capabilities
- ✅ Interactive visualizations with Plotly and Matplotlib
- ✅ Binary classification model (Severe vs Non-Severe floods)
- ✅ Feature engineering (created target variable from rainfall threshold)
- ✅ Model comparison (Random Forest vs Logistic Regression)
- ✅ Performance evaluation with multiple metrics
- ✅ Complete exploratory data analysis (EDA)
- ✅ Data cleaning and preprocessing
- ✅ Statistical analysis and visualization
- ✅ Feature correlation and importance analysis
- ✅ Virtual environment setup and dependency management
- ✅ Git version control with GitHub integration
- ✅ Clean project structure and documentation
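
The feature engineering item above (a target variable created from a rainfall threshold) might look something like the sketch below. The column name `rainfall_mm`, the output column `is_severe`, and the 200 mm cutoff are all hypothetical, since this README does not state the actual names or threshold.

```python
import pandas as pd

# Hypothetical column name and cutoff; neither is given in this README.
RAINFALL_COLUMN = "rainfall_mm"
SEVERE_THRESHOLD_MM = 200.0


def add_severity_target(df, column=RAINFALL_COLUMN, threshold=SEVERE_THRESHOLD_MM):
    """Return a copy of df with a binary is_severe column (1 = Severe, 0 = Non-Severe)."""
    out = df.copy()
    out["is_severe"] = (out[column] > threshold).astype(int)
    return out


# Made-up rows, purely for illustration.
sample = pd.DataFrame(
    {
        "rainfall_mm": [35.0, 180.0, 220.0, 410.0],
        "elevation": [310.0, 120.0, 45.0, 12.0],
    }
)
labelled = add_severity_target(sample)
print(labelled)
```

When a label is derived from a single column like this, that column would normally be left out of the model inputs, since it encodes the label directly.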
- Data Science Fundamentals: Loading, cleaning, and analyzing real-world data
- Machine Learning: Binary classification, model training, evaluation
- Data Visualization: Creating meaningful charts and interactive plots
- Web Development: Building data science applications with Streamlit
- Python Proficiency: Working with pandas, numpy, scikit-learn, plotly
- Project Management: Structuring ML projects, version control, documentation
| Library | Purpose | Version |
|---|---|---|
| Streamlit | Web app framework | ≥1.28.0 |
| Pandas | Data manipulation | ≥1.3.0 |
| Scikit-learn | Machine learning | ≥1.0.0 |
| Plotly | Interactive visualizations | ≥5.15.0 |
| Seaborn/Matplotlib | Statistical plotting | ≥0.11.0/≥3.4.0 |
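
As a quick sanity check, the minimum versions in the table can be compared against what is installed in the active environment. The sketch below uses only the standard library; the helper names `version_tuple` and `check_requirements` are made up for this example, and the simple numeric parsing is an assumption that ignores pre-release suffixes.

```python
from importlib import metadata

# Minimum versions copied from the table above (distribution names as on PyPI).
MINIMUM_VERSIONS = {
    "streamlit": "1.28.0",
    "pandas": "1.3.0",
    "scikit-learn": "1.0.0",
    "plotly": "5.15.0",
    "seaborn": "0.11.0",
    "matplotlib": "3.4.0",
}


def version_tuple(version):
    """Convert '1.28.0' to (1, 28, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '1.3' compares equal to '1.3.0'
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a {package: status} report for the current environment."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[package] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report


for package, status in check_requirements().items():
    print(f"{package}: {status}")
```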
- Responsive Design: Works on desktop and mobile
- Real-time Processing: Instant model training and predictions
- Interactive Visualizations: Plotly charts with zoom, hover, export
- Session Management: Trained models persist across page navigation
- Error Handling: Graceful handling of missing data or invalid inputs
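
The session management point above is usually handled in Streamlit by storing the trained model in `st.session_state`, which survives reruns and page switches within one browser session. The sketch below shows the caching pattern against a plain dictionary so it runs without Streamlit; the key name `trained_model` and the helper `get_or_train_model` are hypothetical, not taken from app.py.

```python
from typing import Any, Callable, MutableMapping

MODEL_KEY = "trained_model"  # hypothetical key name


def get_or_train_model(state: MutableMapping[str, Any], train_fn: Callable[[], Any]) -> Any:
    """Return the model cached in state, calling train_fn only on first use."""
    if MODEL_KEY not in state:
        state[MODEL_KEY] = train_fn()
    return state[MODEL_KEY]


# Plain dict standing in for st.session_state so the sketch runs on its own.
fake_session = {}
calls = []


def train_dummy_model():
    """Pretend training step that records each time it is invoked."""
    calls.append("trained")
    return {"name": "dummy-model"}


first = get_or_train_model(fake_session, train_dummy_model)
second = get_or_train_model(fake_session, train_dummy_model)
print(f"training runs: {len(calls)}")
```

Inside a Streamlit app, the same helper could be called with `st.session_state` in place of `fake_session`, since it supports the same key-based access.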
- Data loading and exploration with pandas
- Statistical analysis and visualization
- Feature engineering and preprocessing
- Machine learning model training and evaluation
- Python programming for data science
- Working with Jupyter notebooks
- Building web applications with Streamlit
- Version control with Git and GitHub
- Understanding real-world data science problems
- Designing ML solutions for environmental challenges
- Interpreting model results and making decisions
Our flood prediction models achieve:
- Accuracy: 94% on test data
- Feature Importance: Rainfall, elevation, and slope are key predictors
- Real-time Predictions: Sub-second response time
- User Experience: Intuitive interface for non-technical users
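
For readers new to the metrics involved, the sketch below shows how accuracy relates to the confusion matrix for a Severe (1) vs Non-Severe (0) classifier. The labels are made up for illustration and are not the project's results; the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


def recall(counts):
    """Share of actual Severe cases that the model caught."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


# Illustrative labels only: 1 = Severe, 0 = Non-Severe.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(f"accuracy={accuracy(counts):.2f}, recall={recall(counts):.2f}")
```

Accuracy alone can hide missed Severe cases when they are rare, which is why the confusion matrix and per-class metrics are worth reading alongside it.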
- Model Persistence: Save/load trained models
- Advanced Algorithms: XGBoost, Neural Networks
- Model Comparison Dashboard: Side-by-side algorithm comparison
- Data Upload Feature: Use your own datasets
- API Development: REST API for predictions
- Docker Deployment: Containerized application
- Cloud Deployment: Deploy to Heroku/Streamlit Cloud
- Advanced Visualizations: Geospatial flood risk maps
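
As a rough idea of what the planned model persistence could involve, the sketch below saves and reloads a Python object with the standard library's `pickle` module. The helper names and the file name `flood_model.pkl` are hypothetical, and a plain dictionary stands in for a trained model so the example runs without scikit-learn.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize a model object to disk."""
    with Path(path).open("wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a previously saved model object. Only load files you trust."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Stand-in for a trained estimator.
stand_in_model = {"algorithm": "random_forest", "n_estimators": 100}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "flood_model.pkl"
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)

print(restored == stand_in_model)
```

Pickled files should be loaded with the same library versions that wrote them, and never from untrusted sources, because unpickling can execute arbitrary code.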
This project is licensed under the MIT License - see the LICENSE file for details.
Rajath Raj
- GitHub: @Rajath-Raj
- Project Link: pravah_flood_detection
- Live Demo: Run `streamlit run app.py` locally
| Feature | Description |
|---|---|
| Home Page | |
| Non-Severe Prediction | Green checkmark showing low flood risk prediction |
| Severe Prediction | Red warning showing high flood risk prediction |
- Full-stack data science project from data to deployment
- Machine learning model with 85%+ accuracy
- Interactive web application for real-time predictions
- Comprehensive data analysis with rich visualizations
- Learning-focused with detailed explanations throughout
Ready to explore flood prediction with machine learning? Clone, run, and start predicting!
Built with ❤️ for learning data science


