Group-wise Correlation in R
To calculate the correlation between two variables by group in R, use the basic syntax below.
library(dplyr)

df %>%
  group_by(group) %>%
  summarize(cor = cor(var1, var2))
This syntax computes the correlation between var1 and var2 separately for each value of the grouping variable group. By default, cor() returns the Pearson correlation coefficient.
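As a minimal sketch of how the same pattern can be adjusted, the snippet below passes two standard arguments of cor(): use = "complete.obs" to drop rows with missing values, and method = "spearman" for a rank-based coefficient. The data frame toy and its values are hypothetical, made up only for illustration, and are not part of the example that follows.

```r
library(dplyr)

# Hypothetical toy data (an assumption for illustration): two groups,
# with one missing value in var1
toy <- data.frame(
  group = c("x", "x", "x", "x", "y", "y", "y", "y"),
  var1  = c(1, 2, 3, 4, 2, 4, 6, NA),
  var2  = c(2, 1, 4, 3, 9, 7, 5, 3)
)

# One row per group; rows containing NA are dropped before computing
result <- toy %>%
  group_by(group) %>%
  summarize(
    pearson  = cor(var1, var2, use = "complete.obs"),
    spearman = cor(var1, var2, use = "complete.obs", method = "spearman")
  )

print(result)
```

Without the use argument, cor() returns NA for any group that contains a missing value, so it is worth checking for NAs before interpreting grouped results.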
The example below demonstrates how to use this syntax in practice.
Calculate Group-wise Correlation in R
Suppose we have a data frame containing information about basketball players on two teams.
First, let’s create the data frame.
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 Score=c(68, 82, 79, 18, 45, 56, 80, 78),
                 Grade=c(2, 7, 9, 3, 1, 3, 4, 2))
df
Now we can view the data frame
  team Score Grade
1    A    68     2
2    A    82     7
3    A    79     9
4    A    18     3
5    B    45     1
6    B    56     3
7    B    80     4
8    B    78     2
To calculate the correlation between Score and Grade within each team, we can group the data by team with dplyr and summarize each group.
library(dplyr)

df %>%
  group_by(team) %>%
  summarize(cor = cor(Score, Grade))

# A tibble: 2 x 2
  team    cor
1 A     0.604
2 B     0.628
The correlation coefficient between Score and Grade for team A is 0.604, according to the output.
For team B, the correlation coefficient between Score and Grade is 0.628.
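Because both correlation coefficients are positive, Score and Grade are positively associated within both teams in this sample. With only four players per team, however, these estimates are imprecise and should not be read as strong evidence about the teams in general.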
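As a cross-check, the same per-team correlations can be sketched in base R without dplyr by splitting the data frame by team and applying cor() to each piece. The snippet also calls cor.test() per team to get a p-value for the null hypothesis of zero correlation. The names by_team, cors and p_values are introduced here only for illustration.

```r
# Same data as in the example above
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 Score=c(68, 82, 79, 18, 45, 56, 80, 78),
                 Grade=c(2, 7, 9, 3, 1, 3, 4, 2))

# Split the rows into one data frame per team
by_team <- split(df, df$team)

# Pearson correlation between Score and Grade within each team
cors <- sapply(by_team, function(g) cor(g$Score, g$Grade))
print(round(cors, 3))

# p-value per team for the test that the true correlation is zero
p_values <- sapply(by_team, function(g) cor.test(g$Score, g$Grade)$p.value)
print(round(p_values, 3))
```

The rounded correlations should match the dplyr output (0.604 for team A and 0.628 for team B). With four observations per team, neither p-value falls below the usual 0.05 threshold, which is why the coefficients are best read as descriptive.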