Systematic Sampling in R with example

Systematic sampling is a type of probability sampling in which members of a larger population are selected at a fixed, periodic interval, starting from a randomly chosen point.

The fixed periodic interval, also known as the sampling interval, is calculated by dividing the population size by the required sample size.
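As a quick illustration of that calculation, here is a minimal sketch in R. The population and sample sizes are made-up values chosen only for this example.

```r
# Hypothetical sizes, chosen only for illustration
N <- 1000   # population size
n <- 50     # required sample size

# Sampling interval: population size divided by required sample size
k <- N / n
k
```

When the population size is not an exact multiple of the sample size, the interval is typically rounded to a whole number, for instance with floor().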

Researchers frequently gather samples from a population and use the findings to derive conclusions about the entire population.

Systematic sampling is a widely used sampling approach that involves a simple two-step procedure.

1. Sort the members of a population into some sort of order.

2. Starting from a random point, select every kth member for the sample, where k is the sampling interval.
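The two steps above can be sketched on a toy population. This is only an illustration under stated assumptions: the member labels, the seed, and the interval of 4 are all hypothetical.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Hypothetical population of 20 labelled members, in shuffled order
population <- sample(sprintf("member_%02d", 1:20))

# Step 1: put the population in some order (here, alphabetical)
ordered_pop <- sort(population)

# Step 2: pick a random start within the first k members,
# then take every k-th member from there
k <- 4
start <- sample(seq_len(k), 1)
picked <- ordered_pop[seq(from = start, to = length(ordered_pop), by = k)]
picked
```

Whatever the random start turns out to be, the result here contains 5 members spaced 4 positions apart in the ordered list.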

Systematic Sampling in R: An Example

Assume a school manager wants to take a sample of 100 students from a school with a total enrollment of 500.

With systematic sampling, this means alphabetizing the students by name, choosing a starting point at random among the first five students, and then picking every fifth student (500 / 100 = 5) to be included in the sample.

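The following code demonstrates how to generate a fictitious data frame in R. Because the names are random, the rows are used in the order they are generated rather than being sorted first.

# Make this example reproducible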

set.seed(123)

# Define a simple function for generating random five-letter names

Names <- function(n = 2000) {
  do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
}

# Create the data frame

df <- data.frame(name = Names(500),
                 score = rnorm(500, mean=25, sd=5))

# View the first six rows of the data frame

head(df)
  name    score
1 XAEIP 22.83760
2 RDPZK 23.21978
3 QIQER 27.75507
4 HVSTN 26.25443
5 LHDUA 23.26211
6 ZCMLO 30.46350

The following code demonstrates how to use systematic sampling to obtain a sample of 100 students:

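# Define a function that returns the row positions of a systematic sample
# N = population size, n = sample size

sys <- function(N, n) {
  k <- floor(N / n)           # sampling interval; floor() keeps every position within 1..N
  r <- sample(1:k, 1)         # random starting point between 1 and k
  seq(r, r + k * (n - 1), k)  # every k-th position, n positions in total
}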

# Draw the systematic sample

sys_sample <- df[sys(nrow(df), 100), ]

# View the first six rows of the sample

head(sys_sample)
    name    score
3  QIQER 27.75507
8  FGNNE 19.50552
13 BSSUH 28.75092
18 JFSIS 24.28128
23 RAXJU 18.27119
28 THUAR 29.22662
dim(sys_sample)
[1] 100   2

Note that the first member of the sample is row 3 of the original data frame, and each subsequent member is 5 rows after the previous one.

We can observe that the systematic sample we got is a data frame with 100 rows and 2 columns by using dim().
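One way to confirm the fixed spacing is to look at the differences between the selected row positions. The sketch below is self-contained and rests on assumptions: the data frame `toy` and the helper `sys_idx` are hypothetical and not part of the example above. It also sorts by name before sampling, as the two-step procedure describes.

```r
set.seed(42)  # arbitrary seed for repeatability

# Hypothetical data: 60 students with made-up names and scores
toy <- data.frame(
  name  = paste0("student_", sample(1000:9999, 60)),
  score = rnorm(60, mean = 25, sd = 5)
)

# Sort alphabetically by name before sampling
toy <- toy[order(toy$name), ]

# Hypothetical helper: positions for a systematic sample of size n from N rows
sys_idx <- function(N, n) {
  k <- floor(N / n)               # sampling interval
  start <- sample(seq_len(k), 1)  # random start in 1..k
  seq(from = start, by = k, length.out = n)
}

idx <- sys_idx(nrow(toy), 12)
toy_sample <- toy[idx, ]

nrow(toy_sample)   # number of sampled rows
unique(diff(idx))  # spacing between consecutive positions
```

Because 60 / 12 = 5, the sample has 12 rows and diff() reports a single spacing of 5 between consecutive positions.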
