# Systematic Sampling in R with an Example

Systematic sampling is a probability sampling method in which members of a population are selected at a fixed, periodic interval, starting from a randomly chosen point.

The fixed periodic interval, also known as the sampling interval, is calculated by dividing the population size by the required sample size.

Researchers frequently gather samples from a population and use the findings to derive conclusions about the entire population.

Systematic sampling is a widely used sampling approach that involves a simple two-step procedure.

1. Arrange the members of the population in some order.

2. Starting from a random point, select every nth member for the sample.
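The two steps above can be sketched in a few lines of R. This is only a minimal illustration on a made-up population of 20 members; the object names (`population`, `ordered_pop`, `k`, `start`, `idx`) and the seed are assumptions for this sketch and are not part of the example that follows.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical population of 20 members, deliberately out of order
population <- sprintf("member_%02d", 20:1)

# Step 1: put the population in some order (here, alphabetical)
ordered_pop <- sort(population)

# Step 2: pick a random start within the first k members,
# then take every k-th member from there
k <- 5                                   # sampling interval
start <- sample(seq_len(k), 1)           # random starting point in 1..k
idx <- seq(from = start, to = length(ordered_pop), by = k)

ordered_pop[idx]                         # the selected members
```

Whatever the random start turns out to be, the selected positions are always exactly `k` apart.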

## Systematic Sampling in R: An Example

Assume a school manager wants to take a sample of 100 students from a school with a total enrollment of 500.

With systematic sampling, this means ordering the students alphabetically by name, choosing a starting point at random among the first five students, and then selecting every fifth student (500 / 100 = 5) for the sample.
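The interval of five follows directly from the definition given earlier: population size divided by sample size. As a quick check in R (the variable names `N`, `n`, and `k` are illustrative choices for this sketch):

```r
N <- 500    # total enrollment (population size)
n <- 100    # desired sample size
k <- N / n  # sampling interval
k
# [1] 5
```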

The following code demonstrates how to generate a fictitious data frame in R:

    # Make this example reproducible
    set.seed(123)

    # Define a simple function that generates n random five-letter names
    Names <- function(n = 2000) {
      do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
    }

    # Create the data frame
    df <- data.frame(name = Names(500), score = rnorm(500, mean = 25, sd = 5))

    # View the first six rows of the data frame
    head(df)

       name    score
    1 XAEIP 22.83760
    2 RDPZK 23.21978
    3 QIQER 27.75507
    4 HVSTN 26.25443
    5 LHDUA 23.26211
    6 ZCMLO 30.46350

The following code demonstrates how to use systematic sampling to obtain a sample of 100 students:

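    # Define a function that returns the row indices of a systematic sample
    # N = population size, n = sample size
    sys <- function(N, n) {
      k <- floor(N / n)           # sampling interval, rounded down so no index exceeds N
      r <- sample(1:k, 1)         # random starting point between 1 and k
      seq(r, r + k * (n - 1), k)  # every k-th index, n indices in total
    }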

    # Draw the systematic sample
    sys_sample <- df[sys(nrow(df), 100), ]
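In this example 500 divides evenly by 100, so the interval is a whole number. When it does not, the interval has to be rounded. The sketch below shows one possible way to handle that case: a hypothetical helper, `sys_indices()` (not part of the original code), rounds the interval down so that every selected index stays within the population. It assumes `n` is no larger than `N`, and the seed is arbitrary.

```r
# Hypothetical helper: row indices for a systematic sample of size n
# from a population of size N, with the interval rounded down
sys_indices <- function(N, n) {
  k <- floor(N / n)             # interval, rounded down so that k * n <= N
  r <- sample(seq_len(k), 1)    # random start in 1..k
  seq(from = r, by = k, length.out = n)
}

set.seed(42)  # arbitrary seed for repeatability
idx <- sys_indices(N = 505, n = 100)

length(idx)   # number of selected rows
max(idx)      # largest selected index, never above N
```

A side effect of rounding down is that the last few members of the population (here, rows 501 to 505) can never be selected.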

    # View the first six rows of the sample
    head(sys_sample)

        name    score
    3  QIQER 27.75507
    8  FGNNE 19.50552
    13 BSSUH 28.75092
    18 JFSIS 24.28128
    23 RAXJU 18.27119
    28 THUAR 29.22662

    # Check the dimensions of the sample
    dim(sys_sample)

    [1] 100   2

Note that the first member of the sample came from row 3 of the original data frame, and each subsequent member is 5 rows after the previous one.
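One way to confirm this spacing programmatically is to look at the gaps between the selected row numbers. The sketch below rebuilds a small stand-in data frame so that it runs on its own; `demo_df` and `demo_sample` are hypothetical names, and the random start it draws may differ from the row 3 seen above.

```r
set.seed(123)

# Stand-in population of 500 rows with simulated scores
demo_df <- data.frame(id = 1:500, score = rnorm(500, mean = 25, sd = 5))

k <- 5                              # sampling interval: 500 / 100
start <- sample(seq_len(k), 1)      # random start in 1..5
demo_sample <- demo_df[seq(from = start, by = k, length.out = 100), ]

# Row names keep the original row positions, so their differences
# give the gap between consecutive sampled rows
gaps <- diff(as.integer(rownames(demo_sample)))
unique(gaps)
# [1] 5
```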

Using dim(), we can confirm that the systematic sample is a data frame with 100 rows and 2 columns.