Systematic Random Sampling in R with Example
Researchers frequently gather samples from a population and use the findings to draw conclusions about the entire population.
Systematic Random Sampling
Systematic sampling is a widely used sampling approach that involves a simple two-step procedure.
1. Arrange the members of the population in some order.
2. Starting from a random point, select every kth member for the sample, where k is the population size divided by the desired sample size.
This article will show you how to use R to perform systematic sampling.
Approach: Systematic Sampling in R
Assume a CEO wishes to gather a sample of 100 employees from a company with a total workforce of 800.
He opts for systematic sampling: he arranges the employees alphabetically by last name, picks a random starting point among the first eight employees, and then selects every eighth employee (800 / 100 = 8) for the sample.
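The arithmetic behind "every eighth employee" can be sketched in a few lines of R. The variable names below (N, n, k, start, idx) and the seed are illustrative choices made for this sketch, not part of any package.

```r
N <- 800  # population size (total workforce)
n <- 100  # desired sample size

k <- N / n  # sampling interval: 800 / 100 = 8

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
start <- sample(seq_len(k), 1)  # random starting point between 1 and k

# Positions of the selected employees: start, start + k, start + 2k, ...
idx <- seq(from = start, by = k, length.out = n)

length(idx)    # 100 positions
all(idx <= N)  # TRUE: every position falls inside the population
```

Because the starting point is at most 8 and the last position is start + 8 * 99, the largest possible index is 800, so the sample never runs past the end of the list.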
The following code demonstrates how to generate a fictitious data frame in R.
Let’s set a seed so the example is reproducible.
set.seed(123)
We can now make a basic function that generates random last names.
rnames <- function(n = 5000) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}
Now we can create a data frame.
data <- data.frame(lastname = rnames(800), score = rnorm(800, mean = 68, sd = 2.4))
Let’s view the first six rows of the data frame.
head(data)
  lastname    score
1    OKGCZ 68.54907
2    SXKBR 68.19697
3    NKUCC 69.64463
4    CDAAI 65.27310
5    JGSHK 68.92277
6    RXHZT 69.18259
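The scenario describes employees arranged alphabetically by last name, whereas the simulated data frame above is in random order. Because the names here are generated at random, the ordering has no bearing on the scores, but to mirror the scenario one could sort the rows first. A minimal sketch, where data_sorted is a hypothetical name introduced only for this illustration:

```r
set.seed(123)

# Same helper and data frame as in the article
rnames <- function(n = 5000) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}
data <- data.frame(lastname = rnames(800),
                   score = rnorm(800, mean = 68, sd = 2.4))

# Order the rows alphabetically by last name (data_sorted is a hypothetical name)
data_sorted <- data[order(data$lastname), ]

# Reset the row names so they run from 1 to 800 in the new alphabetical order
rownames(data_sorted) <- NULL

head(data_sorted)
```

Sampling from data_sorted instead of data would then follow the alphabetical list described in the scenario.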
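The following code demonstrates how to use systematic sampling to obtain a sample of 100 employees.
First, define a function that returns the row positions of a systematic sample of size n from a population of size N.
getsys <- function(N, n) {
  k <- ceiling(N / n)         # sampling interval
  r <- sample(1:k, 1)         # random starting point between 1 and k
  seq(r, r + k * (n - 1), k)  # every kth position, starting at r
}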
Next, use the function to draw the systematic sample.
systsample <- data[getsys(nrow(data), 100), ]
Now we can view the first six rows of the sample.
head(systsample)
   lastname    score
7     VRQCQ 66.57763
15    EJXCD 70.10791
23    GMVDS 66.85846
31    GMXRE 64.73415
39    ILJZI 72.58693
47    BQNTY 63.32294
Finally, check the dimensions of the sample.
dim(systsample)
[1] 100   2
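Beyond the dimensions, one can also check that the selected rows really are evenly spaced. The sketch below rebuilds the objects from the article so that it runs on its own; rows and gaps are hypothetical names introduced for this check.

```r
set.seed(123)

# Helper, data frame, and sampling function as in the article
rnames <- function(n = 5000) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}
data <- data.frame(lastname = rnames(800),
                   score = rnorm(800, mean = 68, sd = 2.4))
getsys <- function(N, n) {
  k <- ceiling(N / n)
  r <- sample(1:k, 1)
  seq(r, r + k * (n - 1), k)
}
systsample <- data[getsys(nrow(data), 100), ]

# Row names of the subset keep the positions from the original data frame
rows <- as.integer(rownames(systsample))
gaps <- diff(rows)

all(gaps == 8)  # TRUE: consecutive sample members are 8 rows apart
range(rows)     # first and last sampled positions
```

This works because subsetting a data frame by row index preserves the original row names, so they record where each sampled employee sat in the full list.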
Conclusion
The first member of the sample was in row 7 of the original data frame, and each subsequent member is 8 rows after the previous one (rows 15, 23, 31, and so on).
Using dim(), we can confirm that the systematic sample is a data frame with 100 rows and 2 columns.
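One caveat, offered as a sketch rather than a correction: getsys() works cleanly here because 800 is an exact multiple of 100. When N is not a multiple of n, ceiling(N / n) can produce positions beyond the last row, which would yield NA rows when used to index a data frame. The variant below, getsys_floor, is a hypothetical alternative that rounds the interval down instead; the values N = 850 and n = 100 are made up for illustration.

```r
# Original helper, as defined in the article
getsys <- function(N, n) {
  k <- ceiling(N / n)
  r <- sample(1:k, 1)
  seq(r, r + k * (n - 1), k)
}

# Hypothetical variant (assumes n <= N): with k = floor(N / n),
# the largest index is at most k * n, which never exceeds N
getsys_floor <- function(N, n) {
  k <- floor(N / n)
  r <- sample(seq_len(k), 1)
  seq(r, by = k, length.out = n)
}

set.seed(123)
N <- 850
n <- 100

any(getsys(N, n) > N)        # TRUE: with k = 9, the last indices exceed 850
any(getsys_floor(N, n) > N)  # FALSE: every index is a valid row
```

The trade-off in this variant is that rows beyond k * n (here, rows 801 to 850) can never be selected, so it is a convenience rather than a perfectly uniform design.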