Count Observations by Group in R

When working with data in R, you will often want to count the number of observations in each group.

Fortunately, the count() function from the dplyr library makes this simple.

Using the data frame below, this tutorial shows several examples of how to use this function in practice.


Let’s start by creating a data frame.

df <- data.frame(Q1 = c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 Q2 = c('G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F', 'F'),
                 Q3 = c(4, 13, 7, 8, 15, 15, 17, 9, 21, 22, 25, 31))
df
    Q1 Q2 Q3
1   A  G  4
2   A  G 13
3   A  F  7
4   B  G  8
5   B  F 15
6   B  F 15
7   B  F 17
8   B  G  9
9   C  G 21
10  C  F 22
11  C  F 25
12  C  F 31

Approach 1: Count by One Variable

The code below counts the number of observations in each group of the ‘Q1’ variable. To make the output easier to read, think of ‘Q1’ as a team and each row as a player.


Count the total observations by the ‘Q1’ variable:

library(dplyr)
df %>% count(Q1)
   Q1 n
1  A 3
2  B 5
3  C 4

We can observe from the output that:

There are three players on Team A.

Team B consists of five players.

There are four players on Team C.

This single count() call tells us exactly how many players are on each team.
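For readers curious about what count() is doing, the dplyr documentation describes it as roughly equivalent to grouping the data and then summarising with n(). The sketch below rebuilds the same data frame so that it runs on its own and compares the two forms; the names by_count and by_summarise are introduced here purely for illustration.

```r
library(dplyr)

# Same data frame as above, rebuilt so this snippet is self-contained
df <- data.frame(Q1 = c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 Q2 = c('G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F', 'F'),
                 Q3 = c(4, 13, 7, 8, 15, 15, 17, 9, 21, 22, 25, 31))

# Shorthand: count rows per value of Q1
by_count <- df %>% count(Q1)

# Longer form: group the rows by Q1, then count the rows in each group with n()
by_summarise <- df %>%
  group_by(Q1) %>%
  summarise(n = n())

# TRUE when both approaches agree on every group
all(by_count$n == by_summarise$n)
```

The shorthand is usually preferable for a plain count; the longer form is handy when other summaries (means, sums) are needed in the same step.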

It’s worth noting that we can sort the counts if we want to.


Count the total observations by the ‘Q1’ variable, sorted from largest to smallest:

df %>% count(Q1, sort=TRUE)
   Q1 n
1  B 5
2  C 4
3  A 3
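As a cross-check on what sort = TRUE does, the following sketch counts first and then orders the result explicitly with arrange(desc(n)). It assumes the same data frame as above (rebuilt here so the snippet runs on its own); sorted_counts and arranged_counts are illustrative names only.

```r
library(dplyr)

# Same data frame as above, rebuilt so this snippet is self-contained
df <- data.frame(Q1 = c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 Q2 = c('G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F', 'F'),
                 Q3 = c(4, 13, 7, 8, 15, 15, 17, 9, 21, 22, 25, 31))

# Built-in sorting: largest group first
sorted_counts <- df %>% count(Q1, sort = TRUE)

# Explicit alternative: count, then order by n in descending order
arranged_counts <- df %>%
  count(Q1) %>%
  arrange(desc(n))

# TRUE when both results list the groups in the same order
all(sorted_counts$Q1 == arranged_counts$Q1)
```

The explicit arrange() step is useful when a different ordering is wanted, for example ascending counts via arrange(n).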

Approach 2: Count by Multiple Variables

We can also count by more than one variable.

Count the total observations for each combination of ‘Q1’ and ‘Q3’:

df %>% count(Q1, Q3)
   Q1 Q3 n
1   A  4 1
2   A  7 1
3   A 13 1
4   B  8 1
5   B  9 1
6   B 15 2
7   B 17 1
8   C 21 1
9   C 22 1
10  C 25 1
11  C 31 1
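Counting by two variables is often more informative when both are categorical. As a further illustration, the sketch below counts each combination of ‘Q1’ and ‘Q2’, a pairing the examples above do not use. It rebuilds the same data frame so that it runs on its own, and q1_q2_counts is an illustrative name only.

```r
library(dplyr)

# Same data frame as above, rebuilt so this snippet is self-contained
df <- data.frame(Q1 = c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 Q2 = c('G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F', 'F'),
                 Q3 = c(4, 13, 7, 8, 15, 15, 17, 9, 21, 22, 25, 31))

# One row per (Q1, Q2) combination that occurs in the data
q1_q2_counts <- df %>% count(Q1, Q2)
q1_q2_counts

# The counts across all combinations add up to the number of rows in df
sum(q1_q2_counts$n) == nrow(df)
```

Only combinations that actually occur in the data appear in the result, so the number of output rows can be smaller than the number of possible pairs.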

Approach 3: Weighted Count

Another variable can be used to “weight” the counts. When the wt argument is supplied, count() sums the values of that variable within each group instead of counting rows. The following code, for example, computes the total of ‘Q3’ for each team in ‘Q1’.


df %>% count(Q1, wt=Q3)
   Q1  n
1  A 24
2  B 64
3  C 99
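To confirm that the weighted count is a per-group sum of ‘Q3’, the sketch below compares it with an explicit group_by() and summarise(sum()) on the same data. The data frame is rebuilt so the snippet runs on its own; weighted and explicit_sum are illustrative names only.

```r
library(dplyr)

# Same data frame as above, rebuilt so this snippet is self-contained
df <- data.frame(Q1 = c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 Q2 = c('G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F', 'F'),
                 Q3 = c(4, 13, 7, 8, 15, 15, 17, 9, 21, 22, 25, 31))

# Weighted count: n holds the sum of Q3 within each Q1 group
weighted <- df %>% count(Q1, wt = Q3)

# Explicit alternative: group by Q1 and sum Q3 directly
explicit_sum <- df %>%
  group_by(Q1) %>%
  summarise(n = sum(Q3))

# TRUE when the two results agree for every group
all(weighted$n == explicit_sum$n)
```

For Team A, for instance, the weighted count is 4 + 13 + 7 = 24, which matches the first row of the output above.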
