thinklikeacto/mathematics-for-data-science

Mathematics for Data Science

Welcome to the Mathematics for Data Science repository! This repository offers an in-depth guide to the essential and advanced mathematical concepts in Linear Algebra, Probability, and Statistics, which are foundational to Data Science and Machine Learning.

This roadmap is designed to help learners and practitioners navigate through the key concepts and operations relevant to ML algorithms and data structures.

Introduction

This repository is a structured resource on the mathematical foundations of Data Science and Machine Learning. It is intended both for beginners entering the field and for experienced practitioners who want to deepen their understanding of Linear Algebra, Probability, and Statistics. The roadmaps below outline the core topics in each area and point to resources for further learning.

Linear Algebra Roadmap

Scalars

  • What are Scalars?

Vectors

  • Introduction to Vectors
    • Row Vector and Column Vector
  • Vector Operations
    • Distance from Origin
    • Euclidean Distance between 2 Vectors
    • Scalar and Vector Addition/Subtraction (Shifting)
    • Scalar and Vector Multiplication/Division (Scaling)
    • Vector and Vector Addition/Subtraction
  • Advanced Vector Operations
    • Dot Product of 2 Vectors
    • Angle between 2 Vectors
    • Unit Vectors
    • Projection of a Vector
    • Basis Vectors
  • Vector Properties
    • Equation of a Line in n-Dimensions
    • Vector Norms
    • Linear Independence
    • Vector Spaces
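
As a quick orientation for the vector operations listed above, here is a minimal sketch in plain Python. The helper names (dot, norm, distance, angle, project) and the example vectors are illustrative assumptions, not code from this repository.

```python
import math

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean (L2) norm, i.e. the distance from the origin."""
    return math.sqrt(dot(v, v))

def distance(u, v):
    """Euclidean distance between two vectors."""
    return norm([a - b for a, b in zip(u, v)])

def angle(u, v):
    """Angle between two non-zero vectors, in radians."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def project(u, v):
    """Projection of u onto a non-zero vector v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * b for b in v]

# Example vectors chosen only for illustration.
u = [3.0, 4.0]
v = [1.0, 0.0]

print(norm(u))        # 5.0
print(dot(u, v))      # 3.0
print(distance(u, v))
print(math.degrees(angle(u, v)))
print(project(u, v))  # [3.0, 0.0]
```

Dividing a vector by its norm gives the corresponding unit vector, which ties the norm, unit vector, and projection topics together.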

Matrices

  • Foundations of Matrices
    • What are Matrices?
    • Types of Matrices
      • Orthogonal Matrices
      • Symmetric Matrices
      • Diagonal Matrices
  • Matrix Operations
    • Matrix Equality
    • Scalar Operations on Matrices
    • Matrix Addition and Subtraction
    • Matrix Multiplication
    • Transpose of a Matrix
  • Determinants and Inverses
    • Determinant
    • Minor and Cofactor
    • Adjoint of a Matrix
    • Inverse of a Matrix
  • Advanced Matrix Concepts
    • Rank of a Matrix
    • Column Space and Null Space
    • Change of Basis
    • Solving a System of Linear Equations
    • Linear Transformations
    • 3D Linear Transformations
    • Matrix Multiplication as Composition
    • Linear Transformation of Non-square Matrices
  • Dot and Cross Products
    • Understanding Dot Product
    • Exploring Cross Product
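
To make a few of the matrix topics above concrete, the following sketch uses nested Python lists as matrices. The function names and the example matrices are assumptions chosen for illustration; in practice a numerical library would normally be used instead.

```python
def matmul(a, b):
    """Matrix product of a (m x n) and b (n x p), both given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(a):
    """Swap rows and columns."""
    return [list(row) for row in zip(*a)]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    """Inverse of a 2 x 2 matrix via the adjoint divided by the determinant."""
    d = det2(m)
    if d == 0:
        raise ValueError("matrix is singular and has no inverse")
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

# Matrix multiplication as composition: rotate by 90 degrees, then scale by 2.
rotate90 = [[0, -1], [1, 0]]
scale2 = [[2, 0], [0, 2]]
composed = matmul(scale2, rotate90)    # the rightmost matrix is applied first
print(composed)                        # [[0, -2], [2, 0]]
print(matmul(composed, [[1], [0]]))    # [[0], [2]]

# A matrix times its inverse gives (approximately) the identity.
m = [[4, 7], [2, 6]]
print(det2(m))                         # 10
print(matmul(m, inverse2(m)))
```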

Tensors

  • Introduction to Tensors
    • Importance of Tensors in Deep Learning
    • Tensor Operations
    • Data Representation using Tensors

Eigenvalues and Eigenvectors

  • Basics of Eigenvalues and Eigenvectors
    • Eigenfaces
    • Principal Component Analysis (PCA)
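
One way to see how eigenvectors relate to PCA is power iteration, which approximates the dominant eigenvalue and eigenvector of a matrix. The sketch below is an illustration only: the 2 x 2 symmetric matrix stands in for a covariance matrix, and the function names and fixed starting vector are assumptions. The method requires that the starting vector is not orthogonal to the dominant eigenvector.

```python
import math

def mat_vec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def power_iteration(a, iterations=100):
    """Approximate the dominant eigenpair of a square matrix."""
    v = [1.0] + [0.0] * (len(a) - 1)   # assumed starting vector
    for _ in range(iterations):
        w = mat_vec(a, v)
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]    # renormalise to unit length
    # Rayleigh quotient v^T A v for a unit vector v.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(a, v)))
    return eigenvalue, v

# Hypothetical covariance matrix; its eigenvalues are 3 and 1.
cov = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(cov)
print(eigenvalue)    # close to 3
print(eigenvector)   # close to [0.7071, 0.7071]
```

In PCA, the first principal component is the eigenvector of the data's covariance matrix with the largest eigenvalue, which is the direction this procedure converges to.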

Matrix Factorization Techniques

  • Decomposition Methods
    • LU Decomposition
    • QR Decomposition
    • Eigen Decomposition
    • Singular Value Decomposition (SVD)
    • Non-Negative Matrix Factorization

Advanced Topics

  • Further Exploration in Linear Algebra
    • Moore-Penrose Pseudoinverse
    • Quadratic Forms
    • Positive Definite Matrices
    • Hadamard Product

Probability Roadmap

What is Probability?

  • Basic Terms like Random Experiment, Trial, Outcome, Sample Space, Event
  • Types of Events
  • Empirical Probability Vs Theoretical Probability

Random Variable

  • What is a Random Variable
    • Probability Distribution of a Random Variable
    • Mean of a Random Variable
    • Variance of a Random Variable
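
The mean and variance of a discrete random variable follow directly from its probability distribution. The sketch below uses a fair six-sided die as an assumed example, with exact fractions to avoid rounding; the function names are illustrative.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative assumption).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(pmf):
    """E[X] = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum of (x - E[X])^2 * P(X = x)."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

print(sum(pmf.values()))   # 1, as required of any probability distribution
print(expectation(pmf))    # 7/2
print(variance(pmf))       # 35/12
```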

Contingency Tables in Probability

  • Venn Diagrams
  • Joint Probability
  • Marginal Probability
  • Conditional Probability

Bayes' Theorem

  • Independent Events
  • Mutually Exclusive Events
  • Bayes' Theorem
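
Bayes' theorem states that P(A | B) = P(B | A) P(A) / P(B). The sketch below applies it to a diagnostic-test scenario; the prior, sensitivity, and false positive rate are made-up numbers used only to illustrate the calculation.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem.

    prior               = P(condition)
    likelihood          = P(positive | condition)
    false_positive_rate = P(positive | no condition)
    """
    # Law of total probability gives P(positive).
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))   # 0.161
```

Even with a fairly accurate test, the posterior stays low here because the condition is rare, which is the usual motivation for working through the theorem rather than relying on intuition.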

Statistics Roadmap

Descriptive Statistics

  • What is Statistics / Types of Statistics
  • Population Vs Sample
  • Types of Data
  • Measures of Central Tendency
    • Mean
    • Median
    • Mode
    • Weighted Mean
    • Trimmed Mean
  • Measures of Dispersion
    • Range
    • Variance
    • Standard Deviation
    • Coefficient of Variation
  • Quantiles and Percentiles
  • Five-Number Summary and Box Plot
  • Skewness
  • Kurtosis
  • Plotting Graphs
    • Univariate Analysis
    • Bivariate Analysis
    • Multivariate Analysis
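
Most of the descriptive measures listed above are available in Python's standard statistics module. The following sketch computes them for a small made-up data set; trimmed_mean is a hypothetical helper written for this example. Note that quartile conventions differ between tools; statistics.quantiles uses the "exclusive" method by default.

```python
import statistics

# Small illustrative data set (not taken from the repository).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    ordered = sorted(values)
    return statistics.mean(ordered[k:len(ordered) - k])

# Measures of dispersion.
data_range = max(data) - min(data)
pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
pop_std = statistics.pstdev(data)
coeff_variation = pop_std / mean

# Quartiles and the five-number summary.
q1, q2, q3 = statistics.quantiles(data, n=4)
five_number_summary = (min(data), q1, q2, q3, max(data))

print(mean, median, mode)            # 5 4.5 4
print(data_range, pop_variance)      # 7 4
print(trimmed_mean(data, 1))
print(five_number_summary)
```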

Correlation

  • Covariance
  • Covariance Matrix
  • Pearson Correlation Coefficient
  • Spearman Correlation Coefficient
  • Correlation and Causation
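
The difference between Pearson and Spearman correlation is easiest to see on a relationship that is monotonic but not linear. The sketch below is a minimal illustration with made-up data; the ranks helper assumes there are no tied values.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def ranks(xs):
    """Rank of each value, 1 = smallest. Assumes no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(covariance(x, y))   # 76.0
print(pearson(x, y))      # below 1: the relationship is not linear
print(spearman(x, y))     # 1.0: the relationship is perfectly monotonic
```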

Probability Distributions

  • Random Variables
  • What are Probability Distributions
  • Why are Probability Distributions important
  • Probability Distribution Functions and Their Types
    • Probability Mass Function (PMF)
    • CDF of PMF
    • Probability Density Function (PDF)
    • CDF of PDF
    • Density Estimation
    • Parametric Density Estimation
    • Non-Parametric Density Estimation
    • Kernel Density Estimation (KDE)
  • How to use PDF/PMF and CDF in Analysis
  • 2D Density Plots

Types of Probability Distributions

  • Normal Distribution
    • Properties of Normal Distribution
    • CDF of Normal Distribution
    • Standard Normal Variate
  • Uniform Distribution
  • Bernoulli Distribution
  • Binomial Distribution
  • Multinomial Distribution
  • Log-Normal Distribution
  • Pareto Distribution
  • Chi-square Distribution
  • Student's T Distribution
  • Poisson Distribution
  • Beta Distribution
  • Gamma Distribution
  • Transformations

Confidence Intervals

  • Point Estimates
  • Confidence Intervals
    • Confidence Interval (Sigma Known)
    • Confidence Interval (Sigma Unknown)
  • Interpreting Confidence Interval
  • Margin of Error and factors affecting it
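
For the sigma-known case, a confidence interval for the mean is the sample mean plus or minus a critical z value times the standard error. The sketch below assumes a normally distributed sampling distribution and uses made-up inputs; the function name is illustrative. The sigma-unknown case replaces the z value with a t critical value, which the standard library does not provide.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_critical * sigma / math.sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical sample: mean 50, known sigma 10, sample size 100.
low, high = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(round(low, 2), round(high, 2))   # 48.04 51.96
```

The margin of error shrinks as n grows and widens as sigma or the confidence level increases.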

Central Limit Theorem

  • Sampling Distribution
  • What is CLT
  • Standard Error
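
The Central Limit Theorem can be checked with a short simulation: means of repeated samples from a non-normal population cluster around the population mean, with spread close to sigma / sqrt(n). The sketch below draws from a uniform distribution on [0, 1); the sample size, number of samples, and seed are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)        # fixed seed so the run is repeatable

n = 30                # size of each sample
num_samples = 2000    # number of samples drawn

# Sampling distribution of the mean for a uniform(0, 1) population.
sample_means = [
    statistics.mean([random.random() for _ in range(n)])
    for _ in range(num_samples)
]

population_mean = 0.5
population_sd = math.sqrt(1 / 12)   # standard deviation of uniform(0, 1)
theoretical_se = population_sd / math.sqrt(n)
empirical_se = statistics.stdev(sample_means)

print(statistics.mean(sample_means))   # close to 0.5
print(theoretical_se, empirical_se)    # the two values should be close
```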

Hypothesis Tests

  • What is Hypothesis Testing?
  • Null and Alternative Hypotheses
  • Steps involved in a Hypothesis Test
  • Performing Z-test
  • Rejection Region Approach
  • Type 1 Vs Type 2 Errors
  • One-Sided vs Two-Sided Tests
  • Statistical Power
  • P-value
  • How to interpret P-values
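
The steps of a one-sample, two-sided Z-test can be sketched as follows: compute the standard error, form the z statistic, convert it to a p-value, and compare against the significance level. The inputs are made-up numbers and the function name is an assumption for this example.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample Z-test with known population sigma."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability of a statistic at least this extreme under the null hypothesis.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: H0 says the mean is 50; the sample of 100 has mean 52.
z, p = z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
alpha = 0.05
reject_null = p < alpha

print(z)             # 2.0
print(round(p, 4))   # 0.0455
print(reject_null)   # True
```

Equivalently, under the rejection region approach, the null hypothesis is rejected at alpha = 0.05 because |z| = 2.0 exceeds the critical value of about 1.96.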

Types of Hypothesis Tests

  • Z-test
  • T-test
    • Single Sample T-test
    • Independent 2 sample t-test
    • Paired 2 sample t-test
  • Chi-square Test
  • Chi-square Goodness of Fit Test
  • Chi-square Test of Independence
  • ANOVA
    • One-Way ANOVA
    • Two-Way ANOVA
  • F-test
  • Levene Test
  • Shapiro-Wilk Test
  • Kolmogorov-Smirnov (K-S) Test
  • Fisher's Test

Miscellaneous Topics

  • Chebyshev's Inequality
  • QQ Plot
  • Sampling
  • Resampling Techniques
  • Bootstrapping
  • Standardization
  • Normalization
  • Statistical Moments
  • Bayesian Statistics
  • A/B Testing
  • Law of Large Numbers
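
Standardization and normalization from the list above are often confused, so a minimal sketch may help. Standardization rescales data to mean 0 and standard deviation 1 (z-scores); min-max normalization rescales it to the range [0, 1]. The function names and data are illustrative, and both functions assume the data is not constant.

```python
import statistics

def standardize(xs):
    """Z-scores: subtract the mean, divide by the population standard deviation."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs]

def min_max_normalize(xs):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

# Small illustrative data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

z_scores = standardize(data)
scaled = min_max_normalize(data)

print(z_scores)
print(scaled)
```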

Contributing

Contributions to this repository are welcome! If you find any errors, have suggestions for improvement, or want to add new topics, feel free to fork this repository, make your changes, and submit a pull request. Let's collaborate to make this resource even better.
