Bootstrap Confidence Interval R

by finnstats

Bootstrapping is a technique for estimating the standard error of any statistic and generating a confidence interval for it.

The fundamental bootstrapping procedure is as follows:

From a given dataset, draw k repeated samples with replacement and calculate the statistic you’re interested in for each sample.

This yields k alternative estimates for a given statistic, which can then be used to determine the statistic’s standard error and build a confidence interval.
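The procedure above can be sketched by hand in base R before turning to any package. This is only an illustrative sketch: the choice of the mean of mtcars$mpg as the statistic, k = 2000, and the variable names are assumptions made for the example.

```r
# Manual bootstrap of the mean of mtcars$mpg (illustrative choice of statistic)
set.seed(123)
x <- mtcars$mpg
k <- 2000  # number of bootstrap samples (arbitrary for this sketch)

# Draw k samples of size length(x) with replacement; compute the mean of each
boot_means <- replicate(k, mean(sample(x, size = length(x), replace = TRUE)))

# Standard error: standard deviation of the k bootstrap estimates
se_hat <- sd(boot_means)

# Simple 95% percentile interval: 2.5% and 97.5% quantiles of the estimates
ci_perc <- quantile(boot_means, probs = c(0.025, 0.975))

se_hat
ci_perc
```

The percentile interval used here is the simplest of several interval types; the boot package offers others, such as BCa.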



The following functions from the boot package can be used to conduct bootstrapping in R.

Syntax for generating bootstrapped samples:

boot(data, statistic, R, …)

where:

data: A vector, matrix, or data frame

statistic: A function that generates the bootstrapped statistic(s).

R: Number of bootstrap replicates
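As a minimal sketch of how these arguments fit together: with the default settings, the statistic function receives the data as its first argument and a vector of resampled indices as its second. The helper name median_stat and the choice of R = 1000 are assumptions for illustration.

```r
library(boot)
set.seed(123)

# Hypothetical helper: median of the resampled values.
# boot() calls it with the data and a vector of resampled indices.
median_stat <- function(data, indices) {
  median(data[indices])
}

med_boot <- boot(data = mtcars$mpg, statistic = median_stat, R = 1000)

med_boot$t0      # statistic computed on the original data
dim(med_boot$t)  # matrix of replicates: R rows, one column per statistic
```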

Syntax for a bootstrapped confidence interval:

boot.ci(bootobject, conf, type)

where:

bootobject: An object returned by the boot() function

conf: The confidence level of the interval. The default value is 0.95.

type: The type of confidence interval to calculate. "norm", "basic", "stud", "perc", "bca", and "all" are the available options; "all" is the default.
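To make the conf and type arguments concrete, here is a small sketch that requests 90 percent percentile and BCa intervals for the mean of mtcars$mpg. The helper mean_stat, the object names, and R = 2000 are assumptions chosen for the example.

```r
library(boot)
set.seed(123)

# Hypothetical helper: mean of the resampled values
mean_stat <- function(data, indices) mean(data[indices])

mean_boot <- boot(data = mtcars$mpg, statistic = mean_stat, R = 2000)

# 90% intervals, requesting only the percentile and BCa types
ci90 <- boot.ci(mean_boot, conf = 0.90, type = c("perc", "bca"))
ci90
```

Passing a character vector to type limits the output to the requested interval types.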

The examples below demonstrate how to utilize these functions in practice.


Approach 1: Bootstrapping a Single Statistic

The following code demonstrates how to calculate the standard error of a simple linear regression model’s R-squared.

set.seed(123)
library(boot)

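Define a function that calculates R-squared for a resampled dataset.

R2function <- function(formula, data, indices) {
  d <- data[indices, ]              # rows selected for this bootstrap sample
  fit <- lm(formula, data = d)      # refit the model on the resampled rows
  return(summary(fit)$r.squared)    # R-squared of the refitted model
}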

Let’s perform the bootstrapping with 5000 replications

rep <- boot(data=mtcars, statistic=R2function, R=5000, formula=mpg~disp)

Okay, now we can view the results of bootstrapping

rep
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = mtcars, statistic = R2function, R = 5000, formula = mpg ~
    disp)
Bootstrap Statistics :
     original    bias    std. error
t1* 0.7183433 0.0033729  0.06259112

From the results we can see:

This regression model’s calculated R-squared is 0.7183433.

This estimate’s standard error is 0.06259112.
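The bias and standard error columns can be reproduced from the boot object itself, which may help clarify what they mean. The sketch below assumes a smaller run (R = 1000) and uses the hypothetical names r2_stat and r2_boot, so its numbers will not match the output above.

```r
library(boot)
set.seed(123)

# Hypothetical helper: R-squared of the model refitted on resampled rows
r2_stat <- function(formula, data, indices) {
  fit <- lm(formula, data = data[indices, ])
  summary(fit)$r.squared
}

r2_boot <- boot(data = mtcars, statistic = r2_stat, R = 1000,
                formula = mpg ~ disp)

original  <- r2_boot$t0                    # R-squared on the full data
bias      <- mean(r2_boot$t) - r2_boot$t0  # mean of replicates minus original
std_error <- sd(r2_boot$t)                 # standard deviation of replicates

c(original = original, bias = bias, std.error = std_error)
```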

We can also immediately see how the bootstrapped samples are distributed.

plot(rep)

This draws a histogram of the bootstrapped R-squared values alongside a normal Q-Q plot.

To obtain the 95 percent confidence interval for the estimated R-squared of the model, we may use the following code.

Calculate the BCa interval (adjusted bootstrap percentile).

boot.ci(rep, type="bca")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 5000 bootstrap replicates

CALL :
boot.ci(boot.out = rep, type = "bca")

Intervals :
Level       BCa
95%   ( 0.5585,  0.8163 )
Calculations and Intervals on Original Scale

We can see from the output that the 95 percent bootstrapped confidence interval for the true R-squared value is (0.5585, 0.8163).
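If the interval limits are needed in later code rather than just printed, they can be pulled out of the object returned by boot.ci(). The sketch below assumes the hypothetical names r2_stat, r2_boot, and r2_ci, and uses R = 2000, so its limits will differ from those printed above.

```r
library(boot)
set.seed(123)

# Hypothetical helper: R-squared of the model refitted on resampled rows
r2_stat <- function(formula, data, indices) {
  fit <- lm(formula, data = data[indices, ])
  summary(fit)$r.squared
}

r2_boot <- boot(data = mtcars, statistic = r2_stat, R = 2000,
                formula = mpg ~ disp)
r2_ci <- boot.ci(r2_boot, type = "bca")

# The $bca component is a one-row matrix; its last two entries
# are the lower and upper limits of the interval
limits <- r2_ci$bca[4:5]
limits
```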


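Approach 2: Bootstrapping Multiple Statistics

The following code demonstrates how to bootstrap several statistics at once: the correlation between the second and third columns of mtcars (cyl and disp), the median of the second column (cyl), and the mean of the first column (mpg).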

set.seed(123)
library(boot)

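Define a function that returns the statistics as a vector.

func <- function(data, i) {
  df <- data[i, ]            # rows selected for this bootstrap sample
  c(cor(df[, 2], df[, 3]),   # correlation between columns 2 and 3
    median(df[, 2]),         # median of column 2
    mean(df[, 1]))           # mean of column 1
}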

Let’s perform the bootstrapping with 5000 replications

b <- boot(mtcars, func, R = 5000)

Now we can view the results of bootstrapping

print(b)
Call:
boot(data = mtcars, statistic = func, R = 5000)
Bootstrap Statistics :
      original      bias    std. error
t1*  0.9020329 0.002333169  0.02068911
t2*  6.0000000 0.420800000  0.90082721
t3* 20.0906250 0.001976250  1.06489684

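Calculate the BCa interval. By default, boot.ci() reports the interval for the first statistic, here the correlation.

boot.ci(b, type="bca")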

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 5000 bootstrap replicates
CALL :
boot.ci(boot.out = b, type = "bca")
Intervals :
Level       BCa         
95%   ( 0.8522,  0.9363 ) 
Calculations and Intervals on Original Scale

Calling boot.ci() without a type argument returns all available interval types:

boot.ci(b)

CALL :
boot.ci(boot.out = b)
Intervals :
Level      Normal              Basic             Studentized
95%   ( 0.8596,  0.9404 )   ( 0.8624,  0.9432 )   ( 0.8619,  0.9406 )

Level     Percentile            BCa
95%   ( 0.8609,  0.9416 )   ( 0.8540,  0.9364 )
Calculations and Intervals on Original Scale
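The intervals above all describe the first statistic. boot.ci() takes an index argument that selects which element of a vector-valued statistic the interval is for. According to the boot documentation, a two-element index is read as the statistic followed by its variance estimate, which matters mainly for studentized intervals, so passing a single index is the safer choice when the second statistic is not a variance. The sketch below assumes the hypothetical names multi_stat and b_multi, and uses percentile intervals with R = 2000 only to keep the example simple.

```r
library(boot)
set.seed(123)

# Hypothetical helper returning three statistics, selected by column name
multi_stat <- function(data, i) {
  df <- data[i, ]
  c(cor(df$cyl, df$disp),  # statistic 1: correlation of cyl and disp
    median(df$cyl),        # statistic 2: median of cyl
    mean(df$mpg))          # statistic 3: mean of mpg
}

b_multi <- boot(mtcars, multi_stat, R = 2000)

# One call per statistic, selecting it with a single index
ci_cor  <- boot.ci(b_multi, type = "perc", index = 1)
ci_med  <- boot.ci(b_multi, type = "perc", index = 2)
ci_mean <- boot.ci(b_multi, type = "perc", index = 3)

ci_mean$percent[4:5]  # lower and upper limits for the mean of mpg
```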
