Systematic Sampling in R with an Example
Systematic sampling is a type of probability sampling in which members of a population are selected at a fixed, periodic interval, starting from a randomly chosen point.
The fixed periodic interval, also known as the sampling interval, is calculated by dividing the population size by the required sample size.
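As a quick illustration of that calculation, here is a minimal sketch in R. The variable names and the population and sample sizes are made up for this illustration.

```r
# Illustrative sizes (assumed for this sketch)
N <- 1000   # population size
n <- 50     # required sample size

# Sampling interval: population size divided by required sample size
k <- N / n
k
```

When the population size is not an exact multiple of the sample size, the result is fractional and has to be rounded to a whole number before it can be used as a step.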
Researchers frequently gather samples from a population and use the findings to derive conclusions about the entire population.
Systematic sampling is a widely used sampling approach that involves a simple two-step procedure.
1. Arrange the members of the population in some order.
2. Choose a random starting point and select every kth member from there, where k is the sampling interval.
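The two steps above can be sketched on a tiny, made-up population. Everything here (the unit labels, the seed, and the variable names) is hypothetical and chosen only to keep the sketch short.

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable

# A hypothetical population of 20 labelled units, in scrambled order
population <- sample(paste0("unit_", sprintf("%02d", 1:20)))

# Step 1: arrange the members in some order (here, by label)
ordered <- sort(population)

# Step 2: pick a random start within the first interval,
# then take every kth member from there
n <- 4                           # desired sample size
k <- length(ordered) / n         # sampling interval (20 / 4 = 5)
start <- sample(seq_len(k), 1)   # random starting point in 1..k
idx <- seq(from = start, by = k, length.out = n)

ordered[idx]
```

Because the start is drawn from the first k positions and the population size is an exact multiple of the sample size, the last index can never run past the end of the population.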
Systematic Sampling in R as an example
Assume a school manager wants to take a sample of 100 students from a school with a total enrollment of 500.
With systematic sampling, the manager would alphabetize the students by name, choose a starting point at random among the first five students, and then pick every fifth student (500 / 100 = 5) to be included in the sample.
The following code demonstrates how to generate a fictitious data frame in R:
Make this example reproducible.
set.seed(123)
Define a simple function for generating random five-letter names.
Names <- function(n = 2000) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}
Now we can create a data frame.
df <- data.frame(name = Names(500), score = rnorm(500, mean = 25, sd = 5))
Let’s view the first six rows of the data frame
head(df)
   name    score
1 XAEIP 22.83760
2 RDPZK 23.21978
3 QIQER 27.75507
4 HVSTN 26.25443
5 LHDUA 23.26211
6 ZCMLO 30.46350
The following code demonstrates how to use systematic sampling to obtain a sample of 100 students:
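Define a function that returns the row indices of a systematic sample.
sys <- function(N, n) {
  k <- ceiling(N / n)          # sampling interval
  r <- sample(1:k, 1)          # random starting point between 1 and k
  seq(r, r + k * (n - 1), k)   # every kth index, n indices in total
}
Draw the systematic sample.
sys_sample <- df[sys(nrow(df), 100), ]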
Now we can view the first six rows of the data frame
head(sys_sample)
    name    score
3  QIQER 27.75507
8  FGNNE 19.50552
13 BSSUH 28.75092
18 JFSIS 24.28128
23 RAXJU 18.27119
28 THUAR 29.22662
dim(sys_sample)
[1] 100   2
It’s worth noting that the sample’s first member was in row 3 of the original data frame, and each subsequent member is 5 rows after the previous one.
Using dim(), we can see that the systematic sample is a data frame with 100 rows and 2 columns.
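Two details of the example are worth a closer look. The scenario describes alphabetizing the students first, while the data frame above is sampled in its original row order. The index calculation also works cleanly here because 500 is an exact multiple of 100; with other sizes, rounding the interval up can produce indices beyond the last row. The following is a minimal sketch of one way to handle both, not a definitive implementation. The names make_names, roster, and sys_idx are hypothetical, and the roster size of 503 is made up to show the non-divisible case. Rounding the interval down is an assumption of this sketch: it guarantees exactly n in-range indices, at the cost that the last few rows can never be selected.

```r
set.seed(123)

# Hypothetical helper: random five-letter names, same idea as the post's Names()
make_names <- function(n) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}

# A fictitious roster whose size (503) is NOT a multiple of the sample size
roster <- data.frame(name = make_names(503),
                     score = rnorm(503, mean = 25, sd = 5))

# Alphabetize by name before sampling, as the scenario describes
roster_sorted <- roster[order(roster$name), ]

# Hypothetical helper: systematic indices that never run past row N.
# Assumption: the interval is rounded DOWN, so r + k * (n - 1) <= N always holds.
sys_idx <- function(N, n) {
  k <- floor(N / n)                    # sampling interval, rounded down
  r <- sample(seq_len(k), 1)           # random start in 1..k
  seq(r, by = k, length.out = n)       # n indices, k rows apart
}

idx <- sys_idx(nrow(roster_sorted), 100)
sorted_sample <- roster_sorted[idx, ]
dim(sorted_sample)
```

With 503 rows and a sample of 100, the interval is 5 and the largest possible index is 500, so every selected row exists and the sample has exactly 100 rows.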