Quantiles by Group calculation in R with examples

In statistics, quantiles are values that divide a ranked dataset into equal-sized groups. This tutorial shows how to calculate quantiles by group in R.
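As a quick illustration of the definition, the sketch below applies base R’s quantile() to a small made-up vector (the values 1 to 9 are purely illustrative and are not part of the example data used later). The quartiles are the cut points that split the ranked values into four groups of roughly equal size.

```r
# Illustrative vector only; not taken from the example data
x <- 1:9

# Quartiles: cut points that split the ranked values into four groups
quartiles <- quantile(x, probs = c(0.25, 0.5, 0.75))
print(quartiles)
# 25% 50% 75% 
#   3   5   7 
```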

In R, we can combine group_by() and summarize() from the dplyr package with quantile() to calculate quantiles within each level of a grouping variable.

library(dplyr)

Identify the quantiles that you’re interested in.

q <- c(0.25, 0.5, 0.80)

Then calculate the quantiles for each group using the syntax below. The example in the next section shows how to use this syntax in practice.

df %>%
  group_by(grouping_variable) %>%
  summarize(quant25 = quantile(numeric_variable, probs = q[1]),
            quant50 = quantile(numeric_variable, probs = q[2]),
            quant80 = quantile(numeric_variable, probs = q[3]))

Quantiles by Group calculation in R

The following code shows how to calculate quantiles of the number of wins in a dataset, grouped by team.

library(dplyr)

Now we can create a data frame.

df <- data.frame(team=c('X', 'X', 'X', 'X', 'X', 'X', 'X', 'X',
                        'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y',
                        'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'),
                 wins=c(12, 14, 24, 15, 8, 5, 13, 13, 13, 15, 12, 13,
                        10, 19, 19, 8, 12, 16, 15, 21, 20, 10, 15, 11))

Let’s see the first six rows of the data frame

head(df)
  team wins
1    X   12
2    X   14
3    X   24
4    X   15
5    X    8
6    X    5

Identify the quantiles that you’re interested in.

q <- c(0.25, 0.5, 0.80)

Let’s calculate the quantiles by the grouping variable.

df %>%
  group_by(team) %>%
  summarize(quant25 = quantile(wins, probs = q[1]),
            quant50 = quantile(wins, probs = q[2]),
            quant80 = quantile(wins, probs = q[3]))
  team  quant25 quant50 quant80
  <chr>   <dbl>   <dbl>   <dbl>
1 C        11.8      15    18.4
2 X        11        13    14.6
3 Y        11.5      13    17.4
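As a cross-check on the dplyr result, the same per-team quantiles can be computed with base R alone. The sketch below is one possible approach, not the only one: it splits wins by team and applies quantile() to each piece. The name by_team is introduced here for illustration.

```r
# Same data as above, repeated so the snippet runs on its own
df <- data.frame(team = rep(c('X', 'Y', 'C'), each = 8),
                 wins = c(12, 14, 24, 15, 8, 5, 13, 13, 13, 15, 12, 13,
                          10, 19, 19, 8, 12, 16, 15, 21, 20, 10, 15, 11))
q <- c(0.25, 0.5, 0.80)

# split() yields one vector of wins per team; sapply() applies quantile()
# to each and binds the results into a matrix (rows = quantiles, columns = teams)
by_team <- sapply(split(df$wins, df$team), quantile, probs = q)
print(by_team)
```

Each column should match the corresponding row of the dplyr output; for example, by_team["25%", "X"] is 11.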

It’s worth noting that we can specify any number of quantiles.

Define the quantiles of interest.

q <- c(0.2, 0.4, 0.6, 0.8)

Now we can calculate quantiles by the grouping variable

df %>%
  group_by(team) %>%
  summarize(quant20 = quantile(wins, probs = q[1]),
            quant40 = quantile(wins, probs = q[2]),
            quant60 = quantile(wins, probs = q[3]),
            quant80 = quantile(wins, probs = q[4]))
  team  quant20 quant40 quant60 quant80
  <chr>   <dbl>   <dbl>   <dbl>   <dbl>
1 C        11.4    14.4    15.2    18.4
2 X         9.6    12.8    13.2    14.6
3 Y        10.8    12.8    13.4    17.4
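Writing one summarize() line per quantile gets repetitive as the list grows. One alternative, sketched below under the assumption that dplyr 1.1.0 or later is installed (reframe() was introduced in that release), returns one row per team and quantile. The column names prob and wins_quantile and the object long are made up for this illustration.

```r
library(dplyr)

# Same data as above, repeated so the snippet runs on its own
df <- data.frame(team = rep(c('X', 'Y', 'C'), each = 8),
                 wins = c(12, 14, 24, 15, 8, 5, 13, 13, 13, 15, 12, 13,
                          10, 19, 19, 8, 12, 16, 15, 21, 20, 10, 15, 11))
q <- c(0.2, 0.4, 0.6, 0.8)

# reframe() lets each group return several rows (here, one per quantile)
long <- df %>%
  group_by(team) %>%
  reframe(prob = q,
          wins_quantile = quantile(wins, probs = q, names = FALSE))

print(long)
```

The result is in long format, with 12 rows here (3 teams times 4 quantiles), so adding or removing a quantile only requires changing q.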

We can also calculate a single quantile per group. For example, here’s how to find the 95th percentile of each team’s wins:


df %>%
  group_by(team) %>%
  summarize(quant95 = quantile(wins, probs = 0.95))
  team  quant95
  <chr>   <dbl>
1 C        20.6
2 X        20.8
3 Y        19
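To see where a value such as team X’s 95th percentile comes from, the sketch below reproduces it by hand. It assumes quantile()’s default method (type = 7), which interpolates linearly between the two nearest ranked values; the variable names are illustrative.

```r
# Team X's wins from the example data
x <- c(12, 14, 24, 15, 8, 5, 13, 13)
p <- 0.95

x_sorted <- sort(x)                 # 5 8 12 13 13 14 15 24
pos <- 1 + (length(x) - 1) * p      # about 7.65: between the 7th and 8th values
lo <- floor(pos)
hi <- ceiling(pos)

# Linear interpolation between the two neighbouring ranked values
manual <- x_sorted[lo] + (pos - lo) * (x_sorted[hi] - x_sorted[lo])

print(manual)                                    # approximately 20.85
print(all.equal(manual, unname(quantile(x, p)))) # compare with quantile()
```

The unrounded value is about 20.85; the tibble printout above shows fewer digits.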

Cool, it’s working well.
