# Group-wise Correlation in R

To calculate the correlation between two variables within each group in R, use the basic syntax below.

    library(dplyr)

    df %>%
      group_by(group) %>%
      summarize(cor = cor(var1, var2))

This syntax computes the correlation between var1 and var2 separately for each level of the grouping variable, group. By default, cor() returns the Pearson correlation coefficient.

The example below demonstrates how to use this syntax in practice.

## Calculate Group-wise Correlation in R

Let’s say we have the following data frame, which contains information about basketball players from different teams.


First, create the data frame and print it:

    df <- data.frame(team  = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                     Score = c(68, 82, 79, 18, 45, 56, 80, 78),
                     Grade = c(2, 7, 9, 3, 1, 3, 4, 2))
    df

The data frame looks like this:

      team Score Grade
    1    A    68     2
    2    A    82     7
    3    A    79     9
    4    A    18     3
    5    B    45     1
    6    B    56     3
    7    B    80     4
    8    B    78     2

To calculate the correlation between Score and Grade within each team, we can use the following dplyr syntax.


    library(dplyr)

    df %>%
      group_by(team) %>%
      summarize(cor = cor(Score, Grade))

    # A tibble: 2 x 2
      team    cor
    1 A     0.604
    2 B     0.628
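As a cross-check that does not rely on dplyr, the same per-team coefficients can be reproduced in base R. This is only a minimal sketch, not part of the workflow above: it rebuilds the example data frame, splits it by team with split(), and applies cor() to each piece. The name `cor_by_team` is introduced here purely for illustration.

```r
# Recreate the example data frame used in this post
df <- data.frame(team  = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 Score = c(68, 82, 79, 18, 45, 56, 80, 78),
                 Grade = c(2, 7, 9, 3, 1, 3, 4, 2))

# Split the rows by team, then compute the Pearson correlation within each piece
cor_by_team <- sapply(split(df, df$team),
                      function(d) cor(d$Score, d$Grade))

# Named numeric vector with one entry per team, rounded to three decimals
round(cor_by_team, 3)
```

The rounded values should agree with the cor column of the tibble, which is a quick way to confirm that the grouping behaved as intended.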

## Conclusion

The correlation coefficient between Score and Grade for team A is 0.604, according to the output.

For team B, the correlation coefficient between Score and Grade is 0.628.

Because both correlation coefficients are positive, we may conclude that there is a positive association between Score and Grade within each team.
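To see where a value such as 0.604 comes from, the Pearson coefficient for team A can be computed by hand from its definition: the sum of the products of the deviations from the mean, divided by the square root of the product of the two sums of squared deviations. The sketch below assumes the default method = "pearson" of cor(); the variable names (`score`, `grade`, `r_manual`) are illustrative only.

```r
# Team A values from the example data frame
score <- c(68, 82, 79, 18)
grade <- c(2, 7, 9, 3)

# Deviations of each observation from its group mean
d_score <- score - mean(score)
d_grade <- grade - mean(grade)

# Pearson r = sum of cross-products / sqrt(product of sums of squares)
r_manual <- sum(d_score * d_grade) /
  sqrt(sum(d_score^2) * sum(d_grade^2))

# Compare the manual result with the built-in function
all.equal(r_manual, cor(score, grade))

# Rounded to three decimals for comparison with the tibble output
round(r_manual, 3)
```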