Systematic Random Sampling in R with Example

Researchers frequently gather a sample from a population and use the findings to draw conclusions about the entire population. Systematic random sampling is one way to select such a sample.

Systematic Random Sampling

Systematic sampling is a widely used sampling approach that involves a simple two-step procedure.

1. Arrange the members of a population in some order.

2. Starting from a random point, select every nth member for the sample.
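The two steps above can be sketched on a toy population before moving to a full data frame. This is only an illustration: the population of 20 units, the interval `k = 5`, and the variable names are assumptions made for the example.

```r
# Step 1: a toy population of 20 units that is already in order
population <- 1:20

# Step 2: choose a random starting point, then take every k-th unit
k <- 5                                   # sampling interval (assumed for this example)
start <- sample.int(k, 1)                # random start between 1 and k
picked <- population[seq(start, length(population), by = k)]

picked                                   # 4 units, each 5 positions apart
```

Whatever the starting point, the result has 20 / 5 = 4 units spaced exactly 5 apart.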

This article will show you how to use R to perform systematic sampling.

Approach: Systematic Sampling in R

Assume a CEO wishes to gather a sample of 100 employees from a company with a total workforce of 800.

He opts for systematic sampling: he arranges the employees alphabetically by last name, picks a random starting point, and selects every eighth employee (800 / 100 = 8) for the sample.

The following code demonstrates how to generate a fictitious data frame in R.

Let’s make this a reproducible example.

set.seed(123)

We can now make a basic function that generates random last names.

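rnames <- function(n = 5000) {
  # paste together 5 random letter vectors to form n five-letter names
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}

Now we can create a data frame of 800 employees, each with a random last name and a normally distributed score.

data <- data.frame(lastname = rnames(800), score = rnorm(800, mean = 68, sd = 2.4))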

Let’s view the first six rows of the data frame.

head(data)
  lastname    score
1    OKGCZ 68.54907
2    SXKBR 68.19697
3    NKUCC 69.64463
4    CDAAI 65.27310
5    JGSHK 68.92277
6    RXHZT 69.18259
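The scenario calls for arranging employees alphabetically by last name, while the generated data frame is used in the order it was created. Since the names here are random this makes little practical difference, but with real data the ordering step would normally come first. A minimal sketch, using a small hypothetical data frame `staff` in place of the full `data` object:

```r
# Hypothetical four-row stand-in for the employee data frame
staff <- data.frame(
  lastname = c("SMITH", "ADAMS", "LOPEZ", "CHEN"),
  score    = c(70.1, 65.3, 68.4, 72.0),
  stringsAsFactors = FALSE
)

# Order the rows alphabetically by last name
staff_sorted <- staff[order(staff$lastname), ]

# Renumber the rows so positions 1..N match the sorted order
rownames(staff_sorted) <- NULL

staff_sorted
```

The same pattern, `data[order(data$lastname), ]`, could be applied to the full data frame before drawing the sample.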

The following code uses systematic sampling to obtain a sample of 100 employees.

First, define a function that returns the row indices of a systematic sample.

getsys <- function(N, n) {
  k <- ceiling(N / n)          # sampling interval
  r <- sample(1:k, 1)          # random starting point between 1 and k
  seq(r, r + k * (n - 1), k)   # n indices spaced k apart
}

Then use the function to draw the sample.

systsample <- data[getsys(nrow(data), 100), ]
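The `getsys()` function above works cleanly here because 800 is an exact multiple of 100. When N is not a multiple of n, the rounded-up interval can push the last indices past N (for example, N = 805 and n = 100 gives k = 9, so the final index is at least 892), which produces rows of `NA`. One possible workaround is a fractional-interval variant, sketched below. The name `getsys_frac` is hypothetical and not part of the original example.

```r
# Sketch of a fractional-interval systematic sample (hypothetical helper)
getsys_frac <- function(N, n) {
  stopifnot(n >= 1, n <= N)
  k <- N / n                           # interval, possibly non-integer
  r <- runif(1, min = 0, max = k)      # random real-valued start in (0, k)
  idx <- floor(r + k * (0:(n - 1))) + 1
  pmin(idx, N)                         # guard against floating-point overshoot
}

set.seed(1)
idx <- getsys_frac(805, 100)
range(idx)        # stays within 1..805
table(diff(idx))  # gaps are 8 or 9
```

With this variant the gaps between selected rows alternate between the two integers on either side of N / n, so the sample size is always exactly n.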

Now we can view the first six rows of the sample.

head(systsample)
  lastname    score
7     VRQCQ 66.57763
15    EJXCD 70.10791
23    GMVDS 66.85846
31    GMXRE 64.73415
39    ILJZI 72.58693
47    BQNTY 63.32294

Next, check the dimensions of the sample.

dim(systsample)
[1] 100   2

Conclusion

Note that the first member of the sample came from row 7 of the original data frame, and each subsequent member is 8 rows after the previous one.

Using dim(), we can confirm that the systematic sample is a data frame with 100 rows and 2 columns.
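The even spacing can also be checked programmatically rather than by eye. The sketch below rebuilds the row numbers from the start (7) and interval (8) seen in the output above. With the article's own objects, the row numbers could instead be taken from `as.integer(rownames(systsample))`, since the subset keeps the original row names.

```r
# Row numbers of the sample shown above: start at 7, step 8, 100 rows
rows <- seq(7, by = 8, length.out = 100)

all(diff(rows) == 8)   # TRUE when every gap equals the interval
range(rows)            # first and last selected rows
```

The last selected row is 7 + 8 * 99 = 799, which is within the 800 rows of the original data frame.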
