-
Notifications
You must be signed in to change notification settings - Fork 0
Machine Learning & Data Science Supervised Learning Bootstrapping
Bootstrapping is a statistical resampling technique used to produce a number of representative datasets from a single dataset using sampling with replacement. It is particularly useful when the underlying distribution of the dataset is unknown, or when analytical methods of estimating distribution parameters is not possible.
Bootstrapping can provide confidence intervals for a statistic without relying on strong parametric assumptions (e.g., normality of data).
The variability of the bootstrap samples can be used to estimate the standard error of a statistic.
In machine learning, bootstrap sampling is used in bagging algorithms like Random Forest to train models on different subsets of the data.
Bootstrapping allows for hypothesis testing by creating a null distribution from the resampled data.