How to Calculate a Bootstrap Standard Error in R

Bootstrapping is a resampling technique that can be used to estimate the standard error of a statistic such as the mean.

The following is the basic procedure for calculating a bootstrapped standard error.

From a given dataset, take k repeated samples with replacement and calculate the standard error for each sample: s/√n, where s is the standard deviation of the sample and n is its size.

As a result, there are k distinct standard error estimates. Take the mean of the k standard errors to get the bootstrapped standard error.
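The procedure above can be sketched as an explicit loop so that each step is visible. This is only an illustration under stated assumptions: the data vector is the one used in the examples of this article, and the names k, n, se_k and boot_se are chosen here for the sketch.

```r
# Minimal sketch of the procedure described above (illustrative names).
set.seed(123)                      # make the resampling reproducible

x <- c(112, 64, 84, 78, 67, 221, 125, 219, 45, 79)
k <- 1000                          # number of resamples (an arbitrary choice)
n <- length(x)

se_k <- numeric(k)                 # one standard error per resample
for (j in seq_len(k)) {
  resample <- sample(x, size = n, replace = TRUE)  # draw with replacement
  se_k[j] <- sd(resample) / sqrt(n)                # s / sqrt(n) for this resample
}

boot_se <- mean(se_k)              # mean of the k standard errors
boot_se
```

A larger k makes the result more stable from run to run, at the cost of more computation.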

The following examples show how to calculate a bootstrapped standard error in R using two distinct methods.

Approach 1: Boot Package

The boot() function from the boot library is one technique to calculate a bootstrap standard error in R.

In R, the following code demonstrates how to compute a bootstrap standard error for a given dataset.

Let’s make the example reproducible.

set.seed(123)

Now load the boot library

library(boot)

We can define the dataset

x <- c(112, 64, 84, 78, 67, 221, 125, 219, 45, 79)

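Let’s create a function that calculates the mean of the resampled observations. The second argument, i, is the vector of indices that boot() supplies for each resample.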

meanF <- function(x,i){mean(x[i])}

Now we can calculate the standard error using 5,000 bootstrapped samples.

boot(x, meanF, 5000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = x, statistic = meanF, R = 5000)
Bootstrap Statistics :
    original   bias    std. error
t1*    109.4 -0.13972    18.41172

The “original” value of 109.4 is the mean of the dataset. The value 18.41 in the “std. error” column is the bootstrap standard error of the mean, which boot() computes as the standard deviation of the 5,000 resampled means.
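The printed summary can also be reproduced from the object that boot() returns. The sketch below stores the result in a variable (the names b and boot_se are chosen here for illustration) and reads the t0 and t components, which hold the original statistic and the matrix of bootstrap replicates.

```r
# Sketch: pull the bootstrap results out of the returned object.
library(boot)
set.seed(123)

x <- c(112, 64, 84, 78, 67, 221, 125, 219, 45, 79)
meanF <- function(x, i) mean(x[i])

b <- boot(x, meanF, R = 5000)      # keep the result instead of only printing it

b$t0                               # statistic on the original data (the mean)
boot_se <- sd(b$t[, 1])            # standard deviation of the bootstrap means
boot_se
```

Keeping the replicates in b$t is handy when the same resamples are needed for something else, such as a histogram of the bootstrap means.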

In this example, we used 5000 bootstrapped samples to estimate the standard error of the mean, but we could have used 1,000, 10,000, or any other number of bootstrapped samples.
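To see how the number of bootstrapped samples affects the estimate, the same call can be repeated with several values of R. This is a rough sketch; the replicate counts and the names reps and se_by_reps are arbitrary choices made for the illustration.

```r
# Sketch: bootstrap standard error for different numbers of resamples.
library(boot)
set.seed(123)

x <- c(112, 64, 84, 78, 67, 221, 125, 219, 45, 79)
meanF <- function(x, i) mean(x[i])

reps <- c(100, 1000, 10000)        # numbers of bootstrapped samples to try

# For each count, run boot() and take the standard deviation of the replicates
se_by_reps <- sapply(reps, function(r) sd(boot(x, meanF, R = r)$t[, 1]))

data.frame(replicates = reps, std_error = se_by_reps)
```

The estimates should vary less between runs as the number of resamples grows.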

Approach 2: Own Formula

We can also construct our own code to calculate a bootstrapped standard error.

The code below demonstrates how to do so:

Create a reproducible example.

set.seed(123)

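The boot library is not required for this approach, since only base R functions are used. It is loaded here only for consistency with the previous example.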

library(boot)

Now we can use the same dataset

x <- c(112, 64, 84, 78, 67, 221, 125, 219, 45, 79)
mean(replicate(500, sd(sample(x, replace = TRUE)) / sqrt(length(x))))
[1] 18.11736

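The bootstrapped standard error is approximately 18.12, which is close to the 18.41 obtained in the previous example. The two values are not expected to match exactly: this approach averages the s/√n values of 500 resamples, whereas boot() reports the standard deviation of 5,000 resampled means.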
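The two ways of summarising the resamples can be compared side by side on the same set of draws. The sketch below uses only base R; the names k, resamples, se_sd_of_means and se_mean_of_se are illustrative, and 5,000 resamples is an arbitrary choice.

```r
# Sketch: compare the two summaries on the same resamples (base R only).
set.seed(123)

x <- c(112, 64, 84, 78, 67, 221, 125, 219, 45, 79)
n <- length(x)
k <- 5000

# Each column of the n x k matrix is one resample drawn with replacement
resamples <- replicate(k, sample(x, replace = TRUE))

# Summary used by boot(): standard deviation of the resampled means
se_sd_of_means <- sd(colMeans(resamples))

# Summary used in Approach 2: mean of the per-resample s / sqrt(n)
se_mean_of_se <- mean(apply(resamples, 2, sd) / sqrt(n))

c(sd_of_means = se_sd_of_means, mean_of_se = se_mean_of_se)
```

Both numbers estimate the standard error of the mean, so they should land in the same neighbourhood without being identical.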
