Bootstrap Confidence Interval R
Bootstrapping is a technique for estimating the standard error of any statistic and for constructing a confidence interval around it.
The basic bootstrapping procedure is as follows:
From a given dataset, draw k samples with replacement and calculate the statistic you're interested in for each sample.
This yields k estimates of the statistic, which can then be used to estimate the statistic's standard error and build a confidence interval.
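The procedure above can be sketched by hand in base R before turning to a package. This is only an illustration: the choice of the median of mtcars$mpg as the statistic, the value of k, and the simple percentile interval are assumptions made for the sketch, not part of the original walkthrough.

```r
# Manual bootstrap of the median of mtcars$mpg (illustrative statistic).
set.seed(123)
x <- mtcars$mpg
k <- 2000  # number of bootstrap samples (hypothetical value)

# Draw k samples of the same size as x, with replacement,
# and compute the statistic on each one.
boot_stats <- replicate(k, median(sample(x, size = length(x), replace = TRUE)))

# Standard error: the standard deviation of the k bootstrap estimates.
boot_se <- sd(boot_stats)

# Simple percentile 95% confidence interval from the k estimates.
boot_ci <- quantile(boot_stats, probs = c(0.025, 0.975))

boot_se
boot_ci
```

The boot package automates the same resampling loop and offers several interval types beyond the plain percentile interval used here.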
The following functions from the boot package can be used to conduct bootstrapping in R.
Syntax for generating bootstrapped samples:
boot(data, statistic, R, …)
where:
data: A vector, matrix, or data frame
statistic: A function that generates the bootstrapped statistic(s).
R: Number of bootstrap replicates
Syntax for a bootstrapped confidence interval:
boot.ci(bootobject, conf, type)
where:
bootobject: An object returned by the boot() function
conf: The confidence level of the interval. The default value is 0.95.
type: The type of confidence interval to calculate. "norm", "basic", "stud", "perc", "bca", and "all" are the available options; "all" is the default.
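Before the full examples, here is a minimal sketch of how the two calls fit together. The helper name mean_mpg, the choice of R = 1000, and the 90% level are assumptions made for illustration. The key point is that the statistic function receives the data plus a vector of resampled row indices.

```r
library(boot)

# Statistic function: boot() passes the data and a vector of row indices
# that defines the current bootstrap sample.
mean_mpg <- function(data, indices) {
  mean(data$mpg[indices])
}

set.seed(123)
bt <- boot(data = mtcars, statistic = mean_mpg, R = 1000)

# Request only the percentile and basic intervals, at the 90% level.
ci <- boot.ci(bt, conf = 0.90, type = c("perc", "basic"))
ci
```

Passing a vector to type limits the output to the requested interval types instead of computing all of them.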
The examples below demonstrate how to utilize these functions in practice.
Approach 1: Bootstrapping a Single Statistic
The following code demonstrates how to calculate the standard error of a simple linear regression model’s R-squared.
set.seed(123)
library(boot)
First, define a function that fits the model on a resampled dataset and returns its R-squared.
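R2function <- function(formula, data, indices) {
  d <- data[indices, ]             # rows selected for this bootstrap sample
  fit <- lm(formula, data = d)     # refit the model on the resampled data
  return(summary(fit)$r.squared)   # R-squared of the refitted model
}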
Let’s perform the bootstrapping with 5000 replications
rep <- boot(data=mtcars, statistic=R2function, R=5000, formula=mpg~disp)
Okay, now we can view the results of bootstrapping
rep
ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = mtcars, statistic = R2function, R = 5000, formula = mpg ~ disp)

Bootstrap Statistics :
     original     bias    std. error
t1* 0.7183433 0.0033729  0.06259112
From the results we can see:
This regression model’s calculated R-squared is 0.7183433.
This estimate’s standard error is 0.06259112.
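The figures in that printout can also be recomputed directly from the object that boot() returns, which may help show where they come from. The sketch below is an illustration under a few assumptions: the helper name r2_stat and the smaller R = 1000 are chosen here for brevity, so the bias and standard error will differ slightly from the output above.

```r
library(boot)

# Same idea as the R-squared function in the text, under a hypothetical name.
r2_stat <- function(formula, data, indices) {
  fit <- lm(formula, data = data[indices, ])
  summary(fit)$r.squared
}

set.seed(123)
bt <- boot(data = mtcars, statistic = r2_stat, R = 1000, formula = mpg ~ disp)

# t0 holds the statistic on the original data;
# t is a matrix with one row per bootstrap replicate.
original  <- bt$t0
bias      <- mean(bt$t[, 1]) - bt$t0  # mean of replicates minus original
std_error <- sd(bt$t[, 1])            # spread of the replicates

c(original = original, bias = bias, std.error = std_error)
```

The "original" value does not depend on the resampling, so it matches the R-squared of the model fitted to the full dataset.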
We can also immediately see how the bootstrapped samples are distributed.
plot(rep)
This draws a histogram of the bootstrapped R-squared values, together with a normal Q-Q plot.
To obtain the 95 percent confidence interval for the estimated R-squared of the model, we may use the following code.
Calculate the BCa interval (adjusted bootstrap percentile).
boot.ci(rep, type="bca")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 5000 bootstrap replicates

CALL :
boot.ci(boot.out = rep, type = "bca")

Intervals :
Level       BCa
95%   ( 0.5585,  0.8163 )
Calculations and Intervals on Original Scale
We can observe from the result that the 95 percent bootstrapped confidence interval for the true R-squared value is (0.5585, 0.8163).
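If the limits are needed for further computation rather than just for printing, they can be pulled out of the object returned by boot.ci(). The following is a sketch under stated assumptions: the names r2_stat and limits are hypothetical, and R = 2000 is used only to keep the run short, so the limits will not match the output above exactly.

```r
library(boot)

# R-squared statistic, as in the text, under a hypothetical name.
r2_stat <- function(formula, data, indices) {
  fit <- lm(formula, data = data[indices, ])
  summary(fit)$r.squared
}

set.seed(123)
bt <- boot(data = mtcars, statistic = r2_stat, R = 2000, formula = mpg ~ disp)
ci <- boot.ci(bt, type = "bca")

# ci$bca is a one-row matrix: the confidence level, two order-statistic
# positions, and finally the lower and upper limits.
limits <- ci$bca[1, 4:5]
limits
```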
Approach 2: Bootstrap Multiple Statistics
set.seed(123)
library(boot)
Define the function
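func <- function(data, i){
  df <- data[i, ]              # rows selected for this bootstrap sample
  c(cor(df[, 2], df[, 3]),     # correlation of columns 2 and 3 (cyl and disp in mtcars)
    median(df[, 2]),           # median of column 2 (cyl)
    mean(df[, 1])              # mean of column 1 (mpg)
  )
}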
Let’s perform the bootstrapping with 5000 replications
b <- boot(mtcars, func, R = 5000)
Now we can view the results of bootstrapping.
print(b)
Call:
boot(data = mtcars, statistic = func, R = 5000)

Bootstrap Statistics :
      original      bias    std. error
t1*  0.9020329 0.002333169  0.02068911
t2*  6.0000000 0.420800000  0.90082721
t3* 20.0906250 0.001976250  1.06489684
boot.ci(b, type="bca")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 5000 bootstrap replicates

CALL :
boot.ci(boot.out = b, type = "bca")

Intervals :
Level       BCa
95%   ( 0.8522,  0.9363 )
Calculations and Intervals on Original Scale
boot.ci(b)
CALL :
boot.ci(boot.out = b)

Intervals :
Level      Normal              Basic             Studentized
95%   ( 0.8596,  0.9404 )   ( 0.8624,  0.9432 )   ( 0.8619,  0.9406 )

Level     Percentile            BCa
95%   ( 0.8609,  0.9416 )   ( 0.8540,  0.9364 )
Calculations and Intervals on Original Scale
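Note that the intervals above all refer to the first statistic, the correlation. To get an interval for one of the other statistics, boot.ci() accepts an index argument. The sketch below assumes a function named three_stats (hypothetical, equivalent to func but using column names), R = 1000, and percentile intervals. As far as the boot documentation describes it, when index is left at its default and the statistic returns several values, the second value is treated as a variance estimate for the Studentized interval, so that particular interval should be read with caution in this example.

```r
library(boot)

# Same three statistics as in the text, selected by column name.
three_stats <- function(data, i) {
  df <- data[i, ]
  c(cor(df$cyl, df$disp), median(df$cyl), mean(df$mpg))
}

set.seed(123)
b <- boot(mtcars, three_stats, R = 1000)

# index selects which element of the statistic vector the interval is for.
ci_cor  <- boot.ci(b, type = "perc", index = 1)  # correlation of cyl and disp
ci_mean <- boot.ci(b, type = "perc", index = 3)  # mean of mpg

ci_cor
ci_mean
```

Giving a single index and an explicit type avoids the Studentized calculation altogether.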