Group-wise Correlation in R

To calculate the correlation between two variables by group in R, use the basic syntax below.

library(dplyr)
df %>%
  group_by(group) %>%
  summarize(cor=cor(var1, var2))

This syntax computes the correlation between var1 and var2 separately for each level of the grouping variable group.
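One practical detail: by default, cor() returns NA for any group that contains a missing value. The sketch below uses a small made-up data frame (toy, with hypothetical columns group, var1 and var2, not the data used later in this post) to show the default result next to the result when use = "complete.obs" is passed to cor(). Whether dropping incomplete pairs is appropriate depends on your data, so treat this as an illustration only.

```r
library(dplyr)

# Hypothetical toy data (not from the example in this post);
# var2 has one missing value in group "x".
toy <- data.frame(
  group = c("x", "x", "x", "x", "y", "y", "y"),
  var1  = c(1, 2, 3, 4, 1, 2, 3),
  var2  = c(2, 4, NA, 8, 3, 2, 1)
)

# Default behaviour: an NA inside a group makes that group's correlation NA.
with_na <- toy %>%
  group_by(group) %>%
  summarize(cor = cor(var1, var2))

# use = "complete.obs" drops incomplete pairs before the correlation is computed.
complete <- toy %>%
  group_by(group) %>%
  summarize(cor = cor(var1, var2, use = "complete.obs"))

print(with_na)
print(complete)
```

In the first result the correlation for group x is NA; in the second it is computed from the three complete pairs.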

The example below shows how to use this syntax in practice.

Calculate Group-wise Correlation in R

Let’s say we have the following data frame, which contains information about basketball players from different teams.

Let’s create a data frame

df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 Score=c(68, 82, 79, 18, 45, 56, 80, 78),
                 Grade=c(2, 7, 9, 3, 1, 3, 4, 2))
df

Now we can view the data frame

  team Score Grade
1    A    68     2
2    A    82     7
3    A    79     9
4    A    18     3
5    B    45     1
6    B    56     3
7    B    80     4
8    B    78     2

To calculate the correlation between Score and Grade for each team, we can use the dplyr syntax shown earlier.

library(dplyr)
df %>%
  group_by(team) %>%
  summarize(cor=cor(Score, Grade))
# A tibble: 2 x 2
  team    cor
1 A     0.604
2 B     0.628
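As a cross-check that does not depend on dplyr, the same numbers can be reproduced in base R. This is only one possible alternative, using split() and sapply(); the name group_cor is chosen here for illustration.

```r
# Same data frame as in the example above.
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 Score=c(68, 82, 79, 18, 45, 56, 80, 78),
                 Grade=c(2, 7, 9, 3, 1, 3, 4, 2))

# split() breaks df into one data frame per team;
# sapply() then applies cor() to each piece and returns a named vector.
group_cor <- sapply(split(df, df$team), function(d) cor(d$Score, d$Grade))

# Round to three decimals to compare with the tibble output.
round(group_cor, 3)
#     A     B
# 0.604 0.628
```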

Conclusion

The correlation coefficient between Score and Grade for team A is 0.604, according to the output.

For team B, the correlation coefficient between Score and Grade is 0.628.

Because both correlation coefficients are positive, Score and Grade are positively associated within each team in this sample.
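Keep in mind that each team has only four players here, so the sign of a coefficient alone says little about how reliable the association is. If you want a rough check, one option (an addition to the workflow above, not part of it) is to run cor.test() from base R within each group and look at the p-values. The name p_values is chosen here for illustration.

```r
# Same data frame as in the example above.
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 Score=c(68, 82, 79, 18, 45, 56, 80, 78),
                 Grade=c(2, 7, 9, 3, 1, 3, 4, 2))

# Run a Pearson correlation test per team and keep only the p-value.
p_values <- sapply(split(df, df$team),
                   function(d) cor.test(d$Score, d$Grade)$p.value)

p_values
```

With this data, both p-values are above 0.05, so neither correlation is statistically distinguishable from zero at that level.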
