Systematic Sampling in R with example

Systematic sampling is a type of probability sampling in which members of a larger population are selected at a fixed, periodic interval, starting from a randomly chosen point.

The fixed periodic interval, also known as the sampling interval, is calculated by dividing the population size by the required sample size.
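As a quick illustration of that division, the sketch below computes the sampling interval for a made-up case. The values of `N` and `n` are hypothetical and chosen only so that the division comes out even.

```r
N <- 1000   # hypothetical population size
n <- 50     # hypothetical required sample size

k <- N / n  # sampling interval: population size divided by sample size
k
```

Here every 20th member would be selected.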

Researchers frequently gather samples from a population and use the findings to derive conclusions about the entire population.

Systematic sampling is a widely used sampling approach that involves a simple two-step procedure.

1. Sort the members of a population into some sort of order.

2. Starting from a random point, select every kth member for the sample, where k is the sampling interval.
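The two steps can be sketched on a toy population. The names, the interval `k`, and the variable names below are all hypothetical and serve only to illustrate the procedure.

```r
# Hypothetical population of 12 members, in arbitrary order
population <- c("Mia", "Bob", "Zoe", "Ann", "Tom", "Eva",
                "Sam", "Lee", "Kim", "Joe", "Amy", "Dan")
k <- 3  # sampling interval: 12 members / desired sample of 4

# Step 1: put the population in some order (alphabetical here)
ordered_pop <- sort(population)

# Step 2: pick a random start between 1 and k, then take every kth member
start <- sample(seq_len(k), 1)
picked <- ordered_pop[seq(from = start, to = length(ordered_pop), by = k)]
picked
```

Whichever starting point is drawn, the result contains 4 of the 12 members.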

Systematic Sampling in R: an example

Assume a school manager wants to take a sample of 100 students from a school with a total enrollment of 500.

Systematic sampling here would involve alphabetizing the students by name, choosing a starting point at random, and picking every fifth student (500 / 100 = 5) to be included in the sample.

The following code demonstrates how to generate a fictitious data frame in R:

First, make the example reproducible.

`set.seed(123)`

Define a simple function for generating random names.

```
# Build n random 5-letter names by pasting together 5 vectors of random letters
Names <- function(n = 2000) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}
```

Now we can create a data frame:

```
df <- data.frame(name = Names(500),
                 score = rnorm(500, mean = 25, sd = 5))
```

Let’s view the first six rows of the data frame:

`head(df)`
```
   name    score
1 XAEIP 22.83760
2 RDPZK 23.21978
3 QIQER 27.75507
4 HVSTN 26.25443
5 LHDUA 23.26211
6 ZCMLO 30.46350
```

The following code demonstrates how to use systematic sampling to obtain a sample of 100 students:

First, define a function that returns the row indices of a systematic sample.

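```
# N = population size, n = required sample size
sys <- function(N, n) {
  k <- ceiling(N / n)          # sampling interval
  r <- sample(1:k, 1)          # random starting row between 1 and k
  seq(r, r + k * (n - 1), k)   # n row indices spaced k apart (assumes N is a multiple of n)
}
```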
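The function above works cleanly when the population size is a multiple of the sample size, as in this example. When it is not, rounding the interval up with `ceiling()` can produce indices larger than `N`. The following is a hedged sketch of one possible variant, not part of the original example: `sys_safe` is a hypothetical name, and it rounds the interval down so the largest index stays within the population.

```r
# Hypothetical variant of the helper above (an assumption, not the original method).
# Rounding the interval down keeps the largest index, r + k * (n - 1), at or below N.
sys_safe <- function(N, n) {
  stopifnot(n >= 1, n <= N)
  k <- floor(N / n)               # sampling interval, rounded down
  r <- sample(seq_len(k), 1)      # random starting row between 1 and k
  seq(from = r, by = k, length.out = n)
}

idx <- sys_safe(500, 150)   # 500 is not a multiple of 150
length(idx)                 # number of selected rows
max(idx) <= 500             # all indices fall inside the population
```

The trade-off is that rows beyond `k * n` can never be selected, so this is only one of several ways to handle a population size that does not divide evenly.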

Then draw a systematic sample of 100 students:

`sys_sample <- df[sys(nrow(df), 100), ]`

Now we can view the first six rows of the sample:

`head(sys_sample)`
```
    name    score
3  QIQER 27.75507
8  FGNNE 19.50552
13 BSSUH 28.75092
18 JFSIS 24.28128
23 RAXJU 18.27119
28 THUAR 29.22662
```

`dim(sys_sample)`
```
[1] 100   2
```

Note that the first member of the sample is row 3 of the original data frame, and each subsequent member is 5 rows after the previous one.
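The spacing can also be checked programmatically. The sketch below uses a hypothetical stand-in data frame (`toy`, with a start row of 3 fixed by hand) so that it runs on its own; it relies on the fact that subsetting a data frame keeps the original row numbers as row names.

```r
# Hypothetical stand-in for the sample above: every 5th row of a 500-row data frame
toy <- data.frame(id = 1:500)
toy_sample <- toy[seq(3, 500, by = 5), , drop = FALSE]

# Row names keep the original row numbers, so their differences give the spacing
gaps <- diff(as.integer(rownames(toy_sample)))
unique(gaps)
```

A single distinct gap value indicates that the selected rows are evenly spaced.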

Using dim(), we can confirm that the systematic sample is a data frame with 100 rows and 2 columns.