Group by Count in R
The dplyr package for R provides a function called group_by() that groups the rows of a data frame by one or more columns.
In R, you can count the number of occurrences within groups defined by one or more columns in several ways, including the dplyr package's group_by() function and base R's aggregate() function.
The group_by() function by itself does not produce a summary; it only records the grouping. It should be followed by summarise(), which performs the required calculation for each group.
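As a minimal sketch of that point, the snippet below groups a small hypothetical data frame (toy is an illustrative name, not the data used in the rest of this post) and shows that the row count only changes once summarise() is applied. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data, used only for this illustration
toy <- data.frame(State = c("A1", "A1", "A2"), Sales = c(10, 20, 30))

# group_by() alone only records the grouping; the rows are unchanged
grouped <- group_by(toy, State)
nrow(grouped)        # same number of rows as toy
group_vars(grouped)  # the grouping column(s)

# summarise() collapses each group to a single row
counts <- summarise(grouped, count_sales = n())
counts
```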
Let’s start by making a data frame.
df <- data.frame(
  Name = c('A','B','C','D','E','F','G','H','I','J','K','L'),
  State = c('A1','A2','A3','A4','A5','A6','A7','A8','A9','A10','A11','A12'),
  Sales = c(154,224,311,112,123,157,985,321,118,156,158,614)
)
df
   Name State Sales
1     A    A1   154
2     B    A2   224
3     C    A3   311
4     D    A4   112
5     E    A5   123
6     F    A6   157
7     G    A7   985
8     H    A8   321
9     I    A9   118
10    J   A10   156
11    K   A11   158
12    L   A12   614
Group by using aggregate() syntax:
aggregate(x, by, FUN, ..., simplify = TRUE, drop = TRUE)
x: a data frame (or a vector, as in the examples here)
by: a list of grouping elements that define the subgroups
FUN: a function that computes the summary statistic for each subgroup
Approach 1: group by using aggregate()
The aggregate() call below passes the grouping column through the by parameter and uses length as the summary function, so it returns the number of rows in each group.
Group by count of a single column
aggregate(df$Sales, by=list(df$State), FUN=length)
The grouped data frame will be:
   Group.1 x
1       A1 1
2      A10 1
3      A11 1
4      A12 1
5       A2 1
6       A3 1
7       A4 1
8       A5 1
9       A6 1
10      A7 1
11      A8 1
12      A9 1
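Every State value in df is unique, so each count above is 1. As a rough sketch of what aggregate() returns when groups repeat, and of grouping by more than one column, the example below uses a hypothetical data frame (sales, with an invented Region column) rather than df. Naming the elements of the by list replaces the default Group.1 and Group.2 column names.

```r
# Hypothetical data with repeated groups; the Region column is invented for illustration
sales <- data.frame(
  State  = c("A1", "A1", "A2", "A2", "A2"),
  Region = c("East", "West", "East", "East", "West"),
  Sales  = c(154, 224, 311, 112, 123)
)

# Count rows per State
by_state <- aggregate(sales$Sales, by = list(State = sales$State), FUN = length)

# Count rows per State and Region combination
by_both <- aggregate(
  sales$Sales,
  by  = list(State = sales$State, Region = sales$Region),
  FUN = length
)

by_state
by_both
```

In both results the count is stored in a column named x, as in the single-column example.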
Approach 2: group by using dplyr
group_by() takes the State column as its argument, and summarise() then uses the n() function to count the rows in each group.
library(dplyr)
df %>%
  group_by(State) %>%
  summarise(count_sales = n())
The grouped data frame will be:
   State count_sales
   <chr>       <int>
 1 A1              1
 2 A10             1
 3 A11             1
 4 A12             1
 5 A2              1
 6 A3              1
 7 A4              1
 8 A5              1
 9 A6              1
10 A7              1
11 A8              1
12 A9              1
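The same pattern extends to several grouping columns by listing them in group_by(). The sketch below is one possible way to do this, using a hypothetical data frame (sales, with an invented Region column) instead of df so that some groups contain more than one row. It assumes dplyr 1.0.0 or later for the .groups argument, and also shows count(), which dplyr provides as a shorthand for grouping and counting.

```r
library(dplyr)

# Hypothetical data with repeated groups; the Region column is invented for illustration
sales <- data.frame(
  State  = c("A1", "A1", "A2", "A2", "A2"),
  Region = c("East", "West", "East", "East", "West"),
  Sales  = c(154, 224, 311, 112, 123)
)

# Count rows for each State and Region combination
multi <- sales %>%
  group_by(State, Region) %>%
  summarise(count_sales = n(), .groups = "drop")

# count() groups by the given columns and counts rows in one step
short <- sales %>%
  count(State, Region, name = "count_sales")

multi
short
```

Both results contain one row per State and Region combination with the same counts; count() is simply the more compact form when a row count is all that is needed.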