Harry Potter project: implementation of a magical sorting hat using logistic regression.
dslr/
├── describe.py # Data analysis
├── histogram.py # Histogram visualization
├── scatter_plot.py # Scatter plot visualization
├── pair_plot.py # Pair plot visualization
├── logreg_train.py # Model training
├── logreg_predict.py # Prediction
├── evaluate.py # Complete evaluation
├── config.yaml # Configuration file
├── Makefile # Build automation
└── dslr/ # Main package
    ├── core/ # Business logic
    │   ├── model.py # Logistic regression model
    │   ├── preprocessing.py # Data preprocessing
    │   └── statistics.py # Statistical functions
    ├── optimizers/ # Optimization algorithms
    │   └── gradient_descents.py
    └── visualization/ # Visualization tools
        └── plots.py
# Data analysis
python describe.py datasets/dataset_train.csv
# Visualization
python histogram.py datasets/dataset_train.csv
python scatter_plot.py datasets/dataset_train.csv
python pair_plot.py datasets/dataset_train.csv
# Training
python logreg_train.py datasets/dataset_train.csv
# Prediction
python logreg_predict.py datasets/dataset_test.csv
# Complete evaluation
python evaluate.py

- Features used: 17 (after feature engineering)
- Supported optimizers: batch gradient descent, stochastic gradient descent (SGD), mini-batch gradient descent
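
The three optimizers differ mainly in how many samples feed each weight update. The following is a minimal pure-Python sketch of that idea, assuming a binary logistic regression with a sigmoid output and a cross-entropy gradient. The names `sigmoid`, `gradient`, and `train`, and the signature of `train`, are illustrative and are not taken from `gradient_descents.py`.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gradient(weights, rows, labels):
    """Mean cross-entropy gradient over the given rows (weights[0] is the bias)."""
    grad = [0.0] * len(weights)
    for x, y in zip(rows, labels):
        z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
        error = sigmoid(z) - y
        grad[0] += error
        for j, xi in enumerate(x, start=1):
            grad[j] += error * xi
    return [g / len(rows) for g in grad]


def train(rows, labels, optimizer="batch", learning_rate=0.1,
          epochs=100, batch_size=32, seed=0):
    """Hypothetical trainer: the optimizer name only selects the update size."""
    rng = random.Random(seed)
    n = len(rows)
    # batch: all samples per update, sgd: one sample, mini-batch: batch_size samples
    size = {"batch": n, "sgd": 1, "mini-batch": min(batch_size, n)}[optimizer]
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):  # epochs kept small here for illustration
        order = list(range(n))
        rng.shuffle(order)
        for start in range(0, n, size):
            idx = order[start:start + size]
            grad = gradient(weights, [rows[i] for i in idx],
                            [labels[i] for i in idx])
            weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights
```

Under these assumptions, `train(rows, labels, optimizer="sgd")` performs one update per sample, while `optimizer="batch"` performs a single update per epoch.
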
The model automatically uses:
- All original numerical features
- Best Hand encoded (Left=0, Right=1)
- Age calculated from Birthday
- Birth_Month and Birth_Season
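
The derived features could be computed roughly as in the sketch below. Only the `Left=0, Right=1` encoding comes from this README. The ISO date format for `Birthday`, the reference date used for `Age`, the numeric season codes, and the function name `engineer_features` are all assumptions made for illustration, not the behaviour of `preprocessing.py`.

```python
from datetime import date

# Assumption: northern-hemisphere meteorological seasons, encoded 0..3.
SEASON_BY_MONTH = {12: 0, 1: 0, 2: 0,   # winter
                   3: 1, 4: 1, 5: 1,    # spring
                   6: 2, 7: 2, 8: 2,    # summer
                   9: 3, 10: 3, 11: 3}  # autumn


def engineer_features(row, reference=date(2020, 1, 1)):
    """Derive the extra features from one CSV row given as a dict.

    Assumptions: Birthday is an ISO date string (YYYY-MM-DD) and Age is
    counted in whole years at an arbitrary reference date.
    """
    born = date.fromisoformat(row["Birthday"])
    # Subtract one year if the birthday has not yet occurred in the reference year.
    had_birthday = (reference.month, reference.day) >= (born.month, born.day)
    age = reference.year - born.year - (0 if had_birthday else 1)
    return {
        "Best Hand": {"Left": 0, "Right": 1}[row["Best Hand"]],
        "Age": age,
        "Birth_Month": born.month,
        "Birth_Season": SEASON_BY_MONTH[born.month],
    }
```

A row such as `{"Best Hand": "Left", "Birthday": "2000-03-30"}` would map to `Best Hand` 0, `Birth_Month` 3, and the spring season code under this mapping.
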
Modify config.yaml to adjust:
- Learning rate (default: 0.1)
- Number of epochs (default: 10000)
- Optimizer (sgd/batch/mini-batch/compare)
- Batch size (default: 32)
- Selected features (default: all)
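
As a rough illustration of how these settings might be validated once `config.yaml` has been parsed into a dictionary, here is a small sketch. The key names (`learning_rate`, `epochs`, `optimizer`, `batch_size`, `features`) and the class `TrainingConfig` are hypothetical, and the default optimizer is a guess, since this README does not state one.

```python
from dataclasses import dataclass, fields
from typing import List, Union

VALID_OPTIMIZERS = ("sgd", "batch", "mini-batch", "compare")


@dataclass
class TrainingConfig:
    """Hypothetical in-memory view of config.yaml with the documented defaults."""
    learning_rate: float = 0.1
    epochs: int = 10000
    optimizer: str = "batch"  # assumption: the default optimizer is not documented
    batch_size: int = 32
    features: Union[str, List[str]] = "all"  # "all" or an explicit list of names

    def __post_init__(self):
        # Fail early on values that would make training meaningless.
        if self.optimizer not in VALID_OPTIMIZERS:
            raise ValueError(f"optimizer must be one of {VALID_OPTIMIZERS}")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.epochs <= 0 or self.batch_size <= 0:
            raise ValueError("epochs and batch_size must be positive")

    @classmethod
    def from_dict(cls, values):
        """Build a config from a parsed mapping, rejecting unknown keys."""
        unknown = set(values) - {f.name for f in fields(cls)}
        if unknown:
            raise KeyError(f"unknown config keys: {sorted(unknown)}")
        return cls(**values)
```

With this sketch, `TrainingConfig.from_dict({"optimizer": "sgd"})` keeps every other setting at its documented default, and a misspelled key raises an error instead of being silently ignored.
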