Cluster Sampling in R With Examples
As discussed in one of our earlier posts, researchers frequently take a sample from a population and use the findings to draw conclusions about the population as a whole.
Cluster sampling is a common sampling method in which a population is divided into clusters, some of those clusters are chosen at random, and every member of the chosen clusters is included in the sample.
This tutorial will show you how to use R to perform cluster sampling.
Approach: Cluster Sampling in R
Let’s say a consumer goods company wishes to survey its customers. It chooses four of its ten goods groups at random and asks every customer in those groups to rate their experience on a scale of one to ten.
The following code creates a dummy data frame in R to work with.
# for a repeatable example, call set.seed() with a fixed value before this line
# create the data frame: 10 goods groups with 50 customers each
df <- data.frame(goods = rep(1:10, each = 50), experience = rnorm(500, mean = 5, sd = 2.2))
# view the first six rows of the data frame
head(df)
  goods experience
1     1  5.2516591
2     1  4.3155235
3     1  5.6258145
4     1  2.0780667
5     1  6.1671238
6     1  0.3534712
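A self-contained version of the setup step might look like the sketch below. The seed value 1 is an arbitrary assumption; the seed behind the output shown above is not stated, so the values printed here will differ from it. The `table()` call is an extra sanity check that every goods group holds the same number of customers.

```r
# Assumption: seed 1 is arbitrary and chosen only so reruns give identical data.
set.seed(1)

# 10 goods groups (the clusters), 50 customers each, with simulated ratings
df <- data.frame(
  goods = rep(1:10, each = 50),
  experience = rnorm(500, mean = 5, sd = 2.2)
)

head(df)         # first six rows
table(df$goods)  # number of customers per goods group
```

Because `rnorm()` draws from an unbounded normal distribution, some simulated ratings can fall outside the one-to-ten scale; that is acceptable for a dummy data set.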
The code below draws a sample of customers by picking four goods groups at random and including every customer from those groups in the sample.
# randomly choose 4 of the 10 goods groups, without replacement
clusters <- sample(unique(df$goods), size = 4, replace = FALSE)
# keep every customer who belongs to one of the 4 chosen goods groups
sample <- df[df$goods %in% clusters, ]
# count how many customers came from each goods group
table(sample$goods)
 2  3  7 10
50 50 50 50
We can observe from the output that:
The sample includes 50 customers from each of goods groups 2, 3, 7, and 10.
As a result, our sample is made up of 200 customers drawn from four distinct goods groups.
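The steps above can be wrapped in a small reusable helper. The sketch below is one possible way to do that, not part of any package: the function name `cluster_sample()`, its arguments, and the seed value are all assumptions made for illustration.

```r
# Hypothetical helper: one-stage cluster sampling on a data frame.
cluster_sample <- function(data, cluster_col, n_clusters) {
  ids <- unique(data[[cluster_col]])
  stopifnot(n_clusters <= length(ids))
  # Sample positions rather than values, so a single numeric id is never
  # interpreted by sample() as the range 1:id.
  chosen <- ids[sample.int(length(ids), n_clusters)]
  # Keep every row that belongs to one of the chosen clusters.
  data[data[[cluster_col]] %in% chosen, , drop = FALSE]
}

set.seed(1)  # arbitrary seed, assumed only for repeatability
df <- data.frame(
  goods = rep(1:10, each = 50),
  experience = rnorm(500, mean = 5, sd = 2.2)
)

picked <- cluster_sample(df, "goods", n_clusters = 4)
table(picked$goods)      # customers per chosen goods group

# Compare the sample mean with the mean of the full data set.
mean(picked$experience)
mean(df$experience)
```

The result is stored in `picked` rather than `sample`, which avoids reusing the name of the base R function `sample()`. The two means will generally differ, because only four of the ten clusters are observed.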