Welcome to the Mathematics for Data Science repository! This repository offers an in-depth guide to the essential and advanced mathematical concepts in Linear Algebra, Probability, and Statistics, which are foundational to Data Science and Machine Learning.
This roadmap is designed to help learners and practitioners navigate through the key concepts and operations relevant to ML algorithms and data structures.
This repository is meant as a thorough resource on the mathematical foundations of Data Science and Machine Learning. It caters to both beginners entering the field and experienced professionals who want to strengthen their understanding of Linear Algebra, Probability, and Statistics. The roadmap covers the core mathematical principles and points to resources for further learning.
- What are Scalars?
- Introduction to Vectors
- Row Vector and Column Vector
- Vector Operations
- Distance from Origin
- Euclidean Distance between 2 Vectors
- Scalar and Vector Addition/Subtraction (Shifting)
- Scalar and Vector Multiplication/Division (Scaling)
- Vector and Vector Addition/Subtraction
- Advanced Vector Operations
- Dot Product of 2 Vectors
- Angle between 2 Vectors
- Unit Vectors
- Projection of a Vector
- Basis Vectors
- Vector Properties
- Equation of a Line in n-Dimensions
- Vector Norms
- Linear Independence
- Vector Spaces
- Foundations of Matrices
- What are Matrices?
- Types of Matrices
- Orthogonal Matrices
- Symmetric Matrices
- Diagonal Matrices
- Matrix Operations
- Matrix Equality
- Scalar Operations on Matrices
- Matrix Addition and Subtraction
- Matrix Multiplication
- Transpose of a Matrix
- Determinants and Inverses
- Determinant
- Minor and Cofactor
- Adjoint of a Matrix
- Inverse of a Matrix
- Advanced Matrix Concepts
- Rank of a Matrix
- Column Space and Null Space
- Change of Basis
- Solving a System of Linear Equations
- Linear Transformations
- 3D Linear Transformations
- Matrix Multiplication as Composition
- Linear Transformation of Non-square Matrices
- Dot and Cross Products
- Understanding Dot Product
- Exploring Cross Product
- Introduction to Tensors
- Importance of Tensors in Deep Learning
- Tensor Operations
- Data Representation using Tensors
- Basics of Eigenvalues and Eigenvectors
- Eigenfaces
- Principal Component Analysis (PCA)
- Decomposition Methods
- LU Decomposition
- QR Decomposition
- Eigen Decomposition
- Singular Value Decomposition (SVD)
- Non-Negative Matrix Factorization
- Further Exploration in Linear Algebra
- Moore-Penrose Pseudoinverse
- Quadratic Forms
- Positive Definite Matrices
- Hadamard Product
- Basic Terms: Random Experiment, Trial, Outcome, Sample Space, Event
- Types of Events
- Empirical Probability vs Theoretical Probability
- What is a Random Variable?
- Probability Distribution of a Random Variable
- Mean of a Random Variable
- Variance of a Random Variable
- Venn Diagrams
- Joint Probability
- Marginal Probability
- Conditional Probability
- Independent Events
- Mutually Exclusive Events
- Bayes' Theorem
- What is Statistics? / Types of Statistics
- Population vs Sample
- Types of Data
- Measures of Central Tendency
- Mean
- Median
- Mode
- Weighted Mean
- Trimmed Mean
- Measures of Dispersion
- Range
- Variance
- Standard Deviation
- Coefficient of Variation
- Quantiles and Percentiles
- Five-Number Summary and Box Plot
- Skewness
- Kurtosis
- Plotting Graphs
- Univariate Analysis
- Bivariate Analysis
- Multivariate Analysis
- Covariance
- Covariance Matrix
- Pearson Correlation Coefficient
- Spearman Correlation Coefficient
- Correlation and Causation
- Random Variables
- What are Probability Distributions?
- Why are Probability Distributions Important?
- Probability Distribution Functions and Their Types
- Probability Mass Function (PMF)
- CDF of PMF
- Probability Density Function (PDF)
- CDF of PDF
- Density Estimation
- Parametric Density Estimation
- Non-Parametric Density Estimation
- Kernel Density Estimation (KDE)
- How to use PDF/PMF and CDF in Analysis
- 2D Density Plots
- Normal Distribution
- Properties of Normal Distribution
- CDF of Normal Distribution
- Standard Normal Variate
- Uniform Distribution
- Bernoulli Distribution
- Binomial Distribution
- Multinomial Distribution
- Log-Normal Distribution
- Pareto Distribution
- Chi-square Distribution
- Student's t-Distribution
- Poisson Distribution
- Beta Distribution
- Gamma Distribution
- Transformations
- Point Estimates
- Confidence Intervals
- Confidence Interval (Sigma Known)
- Confidence Interval (Sigma Unknown)
- Interpreting Confidence Interval
- Margin of Error and factors affecting it
- Sampling Distribution
- What is the Central Limit Theorem (CLT)?
- Standard Error
- What is Hypothesis Testing?
- Null and Alternative Hypotheses
- Steps involved in a Hypothesis Test
- Performing a Z-test
- Rejection Region Approach
- Type I vs Type II Errors
- One-Sided vs Two-Sided Tests
- Statistical Power
- P-value
- How to interpret P-values
- Z-test
- t-test
- Single-Sample t-test
- Independent Two-Sample t-test
- Paired Two-Sample t-test
- Chi-square Test
- Chi-square Goodness of Fit Test
- Chi-square Test of Independence
- ANOVA
- One-Way ANOVA
- Two-Way ANOVA
- F-test
- Levene's Test
- Shapiro-Wilk Test
- Kolmogorov-Smirnov (K-S) Test
- Fisher's Test
- Chebyshev's Inequality
- QQ Plot
- Sampling
- Resampling Techniques
- Bootstrapping
- Standardization
- Normalization
- Statistical Moments
- Bayesian Statistics
- A/B Testing
- Law of Large Numbers
Contributions to this repository are welcome! If you find any errors, have suggestions for improvement, or want to add new topics, feel free to fork this repository, make your changes, and submit a pull request. Let's collaborate to make this resource even better.