How to Calculate Phi Coefficient in R
Before calculating the Phi Coefficient in R, we first need to understand what it is.
The Phi Coefficient measures the degree of association between two binary variables. It is interpreted in much the same way as a correlation coefficient.
Rule of Thumb
The general rule of thumb for correlation coefficients also applies to the Phi Coefficient:
A value between -1.0 and -0.7 indicates a strong negative association.
A value between -0.7 and -0.3 indicates a weak negative association.
A value between -0.3 and +0.3 indicates little or no association.
A value between +0.3 and +0.7 indicates a weak positive association.
A value between +0.7 and +1.0 indicates a strong positive association.
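The bands above can be sketched as a small lookup in base R. The function name interpret_phi() is hypothetical, and the choice of which band receives a value of exactly ±0.3 or ±0.7 is an assumption, since the rule of thumb does not say.

```r
# Hypothetical helper: map phi values to the rule-of-thumb labels.
# Assumption: each band includes its upper boundary (cut()'s default),
# and the lowest band also includes -1.
interpret_phi <- function(phi) {
  stopifnot(is.numeric(phi), all(abs(phi) <= 1, na.rm = TRUE))
  cut(
    phi,
    breaks = c(-1, -0.7, -0.3, 0.3, 0.7, 1),
    labels = c("strong negative", "weak negative", "little or no",
               "weak positive", "strong positive"),
    include.lowest = TRUE
  )
}

as.character(interpret_phi(c(-0.85, 0.113, 0.5)))
# "strong negative" "little or no" "weak positive"
```

Using cut() keeps the thresholds in one place, so they are easy to adjust if a different convention is preferred.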
Formula
The Phi Coefficient is sometimes called the mean square contingency coefficient.
Let’s take a 2 by 2 contingency table.
    | Y=0 | Y=1 |
X=0 | A   | B   |
X=1 | C   | D   |
The Phi Coefficient can be calculated as:
Φ = (AD - BC) / √[(A+B)(C+D)(A+C)(B+D)]
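As a minimal sketch, the formula can be written directly in base R. The function name phi_manual() is hypothetical, and the arguments A, B, C and D follow the cell labels of the table above. The counts in the call are the ones used in the ice cream example in the next section.

```r
# Hypothetical helper implementing the formula for a 2x2 table with cells
#        Y=0  Y=1
#   X=0   A    B
#   X=1   C    D
phi_manual <- function(A, B, C, D) {
  # Convert to double so the product of the margins cannot overflow integers
  A <- as.numeric(A); B <- as.numeric(B)
  C <- as.numeric(C); D <- as.numeric(D)
  denom <- sqrt((A + B) * (C + D) * (A + C) * (B + D))
  # Phi is undefined when any row or column total is zero
  if (denom == 0) return(NA_real_)
  (A * D - B * C) / denom
}

round(phi_manual(10, 14, 8, 18), 3)
# 0.113
```

Here AD - BC = 180 - 112 = 68 and the product of the four margins is 24 × 26 × 18 × 32 = 359424, whose square root is about 599.52, giving 68 / 599.52 ≈ 0.113.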
Calculating Phi Coefficient in R
Let’s take a simple example: we want to know whether gender is associated with ice cream preference, based on a survey of 50 people.
First, let’s create a 2×2 table of the survey counts in R.
data <- matrix(c(10, 8, 14, 18), nrow = 2)
data
     [,1] [,2]
[1,]   10   14
[2,]    8   18
Next, let’s use the phi() function from the psych package to calculate the Phi Coefficient between gender and ice cream preference.
library(psych)
phi(data, digits = 3)
[1] 0.113
The Phi coefficient is 0.113.
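As a cross-check that needs only base R, the same value can be recovered in two other ways: as the Pearson correlation of the two variables coded 0/1, and (in magnitude) as the square root of the uncorrected chi-square statistic divided by the sample size. The sketch below assumes rows of the table are X and columns are Y; the variable names x, y and chi2 are illustrative only.

```r
data <- matrix(c(10, 8, 14, 18), nrow = 2)

# Expand the table into one 0/1 observation per respondent.
# as.vector() reads the matrix column by column: 10, 8, 14, 18.
counts <- as.vector(data)
x <- rep(c(0, 1, 0, 1), times = counts)  # row (X) of each cell
y <- rep(c(0, 0, 1, 1), times = counts)  # column (Y) of each cell

# Pearson correlation of two binary variables equals phi
round(cor(x, y), 3)
# 0.113

# |phi| = sqrt(chi-square / n), using the test without continuity correction
chi2 <- chisq.test(data, correct = FALSE)$statistic
round(sqrt(unname(chi2) / sum(data)), 3)
# 0.113
```

The chi-square route only gives the magnitude of phi, so the sign has to come from the table itself or from cor().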
Conclusion
As mentioned above, the Phi Coefficient is interpreted in much the same way as a Pearson Correlation Coefficient:
-1 indicates a perfect negative relationship between the two variables.
0 indicates no association between the two variables.
1 indicates a perfect positive relationship between the two variables.
In this case, Phi is 0.113, which lies between -0.3 and +0.3 and therefore indicates little or no association. In other words, this survey shows little or no association between gender and ice cream preference.