How to Perform Bootstrapping in R
Bootstrapping is a method for estimating the standard error of any statistic and generating a confidence interval for that statistic.
The basic bootstrapping procedure is as follows:
Take k repeated samples, with replacement, from a given dataset.
Calculate the statistic of interest for each sample.
This yields k different estimates of the statistic, which you can then use to calculate the statistic’s standard error and create a confidence interval.
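The procedure above can be sketched by hand in base R before turning to any package. This is only an illustration under assumed choices: the statistic (the mean of mtcars$mpg), the seed, and k = 2000 are arbitrary and are not taken from the examples later in this post.

```r
# Manual bootstrap of a sample mean (illustrative sketch; no packages needed)
set.seed(1)                # arbitrary seed so the run is repeatable
x <- mtcars$mpg            # example data: miles per gallon
k <- 2000                  # number of bootstrap samples (arbitrary choice)

# Resample with replacement k times and compute the statistic on each sample
boot_means <- replicate(k, mean(sample(x, size = length(x), replace = TRUE)))

# The spread of the k estimates approximates the standard error
boot_se <- sd(boot_means)

# A simple percentile interval: the middle 95% of the bootstrap estimates
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_se
boot_ci
```

The boot package automates the same resampling loop and offers additional interval types beyond this simple percentile interval.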
We can perform bootstrapping in R using the following functions from the boot package:
1. Generate bootstrap samples.
boot(data, statistic, R, …)
where:
data: A vector, matrix, or data frame
statistic: A function that produces the statistic(s) to be bootstrapped
R: Number of bootstrap replicates
2. Create a confidence interval using the bootstrap method.
boot.ci(bootobject, conf, type)
where:
bootobject: An object returned by the boot() function
conf: The confidence level of the interval to be computed. The default value is 0.95.
type: The type of confidence interval to compute. Options include “norm”, “basic”, “stud”, “perc”, “bca”, and “all”. The default is “all”.
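As a quick orientation, here is a minimal sketch that uses both functions on a plain numeric vector. The helper name mean_stat, the seed, R = 1000, and the 90% level are assumptions chosen for illustration. The key point is that boot() calls the statistic function with the data and a vector of resampled indices.

```r
library(boot)

# boot() passes the original data plus a vector of resampled indices;
# the function must compute the statistic on data[indices]
mean_stat <- function(data, indices) {
  mean(data[indices])
}

set.seed(1)  # arbitrary seed so the run is repeatable
b <- boot(data = mtcars$mpg, statistic = mean_stat, R = 1000)

# Percentile interval at the 90% confidence level
ci <- boot.ci(b, conf = 0.90, type = "perc")
ci
```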
The examples below demonstrate how to use these functions in practice.
Bootstrapping a Single Statistic
The code below demonstrates how to compute the standard error for the R-squared of a simple linear regression model:
set.seed(123)
library(boot)
Now we can define a function to calculate R-squared:
rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]  # allows boot to select the sample
  fit <- lm(formula, data = d)
  return(summary(fit)$r.squared)
}
Let’s perform bootstrapping with 3000 replications:
reps <- boot(data=mtcars, statistic=rsq_function, R=3000, formula=mpg~disp)
We can now view the results of the bootstrap:
reps
ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~ disp)

Bootstrap Statistics :
     original       bias    std. error
t1* 0.7183433  0.003027851  0.06410851
We can see from the results:
This regression model’s estimated R-squared is 0.7183433.
This estimate has a standard error of 0.06410851.
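The same quantities can also be pulled out of the boot object directly rather than read off the printed summary. The sketch below rebuilds the object so that it runs on its own. It relies on the t0 component (the statistic on the original data) and the t component (the matrix of bootstrap replicates). The variable names orig_rsq, boot_se and boot_bias are illustrative.

```r
library(boot)

# Same statistic function as in the example above
rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  summary(fit)$r.squared
}

set.seed(123)
reps <- boot(data = mtcars, statistic = rsq_function, R = 3000, formula = mpg ~ disp)

orig_rsq  <- reps$t0                  # R-squared on the original data
boot_se   <- sd(reps$t[, 1])          # standard error: SD of the replicates
boot_bias <- mean(reps$t[, 1]) - reps$t0  # bias: mean of replicates minus original

orig_rsq
boot_se
boot_bias
```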
We can also quickly see the distribution of the bootstrapped samples:
plot(reps)
We can also use the following code to compute the 95% confidence interval for the model’s estimated R-squared:
Adjusted bootstrap percentile (BCa) interval calculation
boot.ci(reps, type="bca")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 3000 bootstrap replicates

CALL :
boot.ci(boot.out = reps, type = "bca")

Intervals :
Level       BCa
95%   ( 0.5474,  0.8160 )
We can see from the output that the 95% bootstrapped confidence interval for the true R-squared value is (0.5474, 0.8160).
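If the interval endpoints are needed in later code, they can be read from the object that boot.ci() returns. The sketch below is self-contained and requests two interval types for comparison. R = 2000 and the variable names bca_bounds and perc_bounds are illustrative assumptions, so the numbers will not match the output above exactly.

```r
library(boot)

# Same statistic function as in the example above
rsq_function <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  summary(fit)$r.squared
}

set.seed(123)
reps <- boot(data = mtcars, statistic = rsq_function, R = 2000, formula = mpg ~ disp)

# Request the percentile and BCa intervals together
ci <- boot.ci(reps, conf = 0.95, type = c("perc", "bca"))

# Each component is a one-row matrix; columns 4 and 5 hold the lower and upper bounds
bca_bounds  <- ci$bca[4:5]
perc_bounds <- ci$percent[4:5]

bca_bounds
perc_bounds
```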