Group by Count in R

by finnstats

The dplyr package for R provides group_by(), a function that groups the rows of a data frame by one or more columns.

In R, you can count the rows in each group, whether the groups are defined by a single column or by multiple columns, in several ways. Two common options are the dplyr package’s group_by() function and base R’s aggregate() function.

By itself, group_by() does not compute anything; it only records the grouping. Follow it with summarise(), which performs the required calculation for each group.
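A minimal sketch of that behaviour, assuming dplyr is installed. The data frame sales_df and its values are hypothetical, chosen so that State values repeat and the groups contain more than one row.

```r
library(dplyr)

# Hypothetical data: State values repeat, so groups have more than one row
sales_df <- data.frame(State = c("A1", "A1", "A2", "A2", "A2", "A3"),
                       Sales = c(154, 224, 311, 112, 123, 157))

# group_by() alone only records the grouping; the rows are unchanged
grouped <- sales_df %>% group_by(State)
nrow(grouped)      # same number of rows as sales_df
n_groups(grouped)  # number of distinct State groups

# summarise() collapses each group to a single row holding its count
counts <- grouped %>% summarise(count_sales = n())
counts
```

Printing grouped shows the same rows as sales_df plus a note about the groups; only counts contains one row per State.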

Let’s start by making a data frame.

df<-data.frame(Name=c('A','B','C','D','E','F','G','H','I','J','K','L'),
              State=c('A1','A2','A3','A4','A5','A6','A7','A8','A9','A10','A11','A12'),
              Sales=c(154,224,311,112,123,157,985,321,118,156,158,614))
df

   Name State Sales
1     A    A1   154
2     B    A2   224
3     C    A3   311
4     D    A4   112
5     E    A5   123
6     F    A6   157
7     G    A7   985
8     H    A8   321
9     I    A9   118
10    J   A10   156
11    K   A11   158
12    L   A12   614

Group by using aggregate(), syntax:

aggregate(x, by, FUN, ..., simplify = TRUE, drop = TRUE)

x: a data frame (or a vector, as in the example that follows)

by: a list of grouping elements, each as long as the variables in x, that define the subgroups

FUN: a function that computes the summary statistic for each subgroup

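Approach 1: group by using aggregate()

Call aggregate() with the column to summarise, the grouping column passed to by as a list, and length as FUN. Because length returns the number of elements it receives, the result is the number of rows in each group.

Group by count of a single column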

aggregate(df$Sales, by=list(df$State), FUN=length)

The grouped data frame is shown here. Each State value occurs exactly once in df, so every group has a count of 1.

   Group.1 x
1       A1 1
2      A10 1
3      A11 1
4      A12 1
5       A2 1
6       A3 1
7       A4 1
8       A5 1
9       A6 1
10      A7 1
11      A8 1
12      A9 1
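The same aggregate() call extends to counts over multiple columns. The sketch here uses a hypothetical data frame, sales_df, with an added Region column that is not part of the original example; its values are made up so that some groups contain more than one row.

```r
# Hypothetical data with repeated State/Region combinations
sales_df <- data.frame(State  = c("A1", "A1", "A2", "A2", "A2", "A3"),
                       Region = c("East", "East", "East", "West", "West", "West"),
                       Sales  = c(154, 224, 311, 112, 123, 157))

# Count of rows per State; naming the list element names the output column
by_state <- aggregate(sales_df$Sales,
                      by = list(State = sales_df$State),
                      FUN = length)
by_state

# Count of rows per State/Region pair, using the formula interface
by_both <- aggregate(Sales ~ State + Region, data = sales_df, FUN = length)
by_both
```

In by_both, the Sales column holds the row count for each State/Region pair rather than a sales figure, because FUN is length.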

Approach 2: group by using dplyr

group_by() takes the State column as its argument, and summarise() then uses n() to count the rows in each group.

library(dplyr)
df %>% group_by(State) %>% summarise(count_sales = n())

The grouped data frame is:

   State count_sales
   <chr>       <int>
 1 A1              1
 2 A10             1
 3 A11             1
 4 A12             1
 5 A2              1
 6 A3              1
 7 A4              1
 8 A5              1
 9 A6              1
10 A7              1
11 A8              1
12 A9              1
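For counts over multiple columns with dplyr, pass every grouping column to group_by(). The sketch here assumes dplyr 1.0 or later (for the .groups argument) and uses a hypothetical data frame, sales_df, with a made-up Region column. It also shows count(), a dplyr shorthand for group_by() followed by summarise(n()).

```r
library(dplyr)

# Hypothetical data with repeated State/Region combinations
sales_df <- data.frame(State  = c("A1", "A1", "A2", "A2", "A2", "A3"),
                       Region = c("East", "East", "East", "West", "West", "West"),
                       Sales  = c(154, 224, 311, 112, 123, 157))

# Group by two columns, then count the rows in each group
by_both <- sales_df %>%
  group_by(State, Region) %>%
  summarise(count_sales = n(), .groups = "drop")
by_both

# count() gives the same counts in a single call
by_both_short <- sales_df %>% count(State, Region, name = "count_sales")
by_both_short
```

Setting .groups = "drop" returns an ungrouped result, which avoids surprises if further dplyr verbs are chained afterwards.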
