Post-Hoc Pairwise Comparisons in R: Quick Guide

A one-way ANOVA is used to determine whether there is a statistically significant difference between the means of three or more independent groups.

The following null and alternative hypotheses are used in a one-way ANOVA.

H0: The means of all the groups are the same.

HA: Not every group’s mean is the same.

If the overall p-value of the ANOVA is less than the chosen significance level (e.g. α = 0.05), we reject the null hypothesis and conclude that not all of the group means are equal.

We may then run posthoc pairwise comparisons to see which group means are different.

Post-Hoc Pairwise Comparisons in R

The following example demonstrates how to run posthoc pairwise comparisons in R using four methods:

1. Tukey’s Method

2. Scheffe’s Method

3. The Bonferroni Method

4. The Holm Approach

Example: One-Way ANOVA in R

Let’s say a teacher wants to know whether three different studying techniques lead to different exam scores among students. To test this, the teacher randomly assigns 10 students to each technique and records their exam scores.

To test for differences in mean exam scores across the three groups, we may use the R code below to do a one-way ANOVA.

Let’s create the data frame first:

df <- data.frame(technique = rep(c("A", "B", "C"), each=10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))
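Before fitting the model, it can help to look at the group sizes and group means. The following is a minimal sketch using base R only. It recreates the data frame so that it runs on its own, and the names group_n, group_means and group_sds are illustrative and not part of the original example.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("A", "B", "C"), each = 10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))

# Number of students per technique (10 each, so the design is balanced)
group_n <- table(df$technique)

# Mean and standard deviation of the scores within each technique
group_means <- tapply(df$score, df$technique, mean)
group_sds   <- tapply(df$score, df$technique, sd)

print(group_n)
print(round(group_means, 1))
print(round(group_sds, 2))
```

The group means are 95.1 for A, 80.3 for B and 80.5 for C, which already suggests that technique A differs from the other two.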

Now we can run the one-way ANOVA:

model <- aov(score ~ technique, data = df)
model
Call:
   aov(formula = score ~ technique, data = df)
Terms:
                technique Residuals
Sum of Squares     1440.8    4169.5
Deg. of Freedom         2        27
Residual standard error: 12.42682
Estimated effects may be unbalanced

Let’s view the summary of the ANOVA:

summary(model)
            Df Sum Sq Mean Sq F value Pr(>F)
technique    2   1441   720.4   4.665 0.0182 *
Residuals   27   4170   154.4
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Because the overall p-value of the ANOVA (0.0182) is less than 0.05, we reject the null hypothesis that the mean exam score is the same for each studying technique.
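If you want to apply this decision rule in code rather than by reading the printed table, one possible approach is to pull the p-value out of the summary object. This is a sketch under the assumption that the significance level is 0.05; the names anova_table, p_value, alpha and reject_null are illustrative.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("A", "B", "C"), each = 10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))
model <- aov(score ~ technique, data = df)

# summary() on an aov object returns a list; its first element is the ANOVA table
anova_table <- summary(model)[[1]]

# The p-value of the overall F-test is the first entry of the "Pr(>F)" column
p_value <- anova_table[["Pr(>F)"]][1]

alpha <- 0.05                    # assumed significance level
reject_null <- p_value < alpha   # TRUE means the group means are not all equal

print(round(p_value, 4))
print(reject_null)
```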

After that, we can run posthoc pairwise comparisons to see which groups have different means.

1. Tukey’s Method

Tukey’s method is a common choice when you want to compare every pair of group means and the sample sizes of the groups are equal.

To run the Tukey posthoc method in R, we can use the built-in TukeyHSD() function:

TukeyHSD(model, conf.level=.95)
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = score ~ technique, data = df)
$technique
     diff       lwr        upr     p adj
B-A -14.8 -28.57923 -1.0207747 0.0334100
C-A -14.6 -28.37923 -0.8207747 0.0362015
C-B   0.2 -13.57923 13.9792253 0.9992862

As the output shows, the adjusted p-values (“p adj”) for B-A and C-A are below 0.05, while the adjusted p-value for C-B (0.999) is not.

As a result, we can conclude the following:

The difference in mean exam scores between students who used A and students who used B is statistically significant.

The difference in mean exam scores between students who used A and students who used C is statistically significant.

The difference in mean exam scores between students who used B and students who used C is not statistically significant.
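To see where the adjusted p-values and interval limits come from, the B-A row can be reproduced by hand with the studentized range functions ptukey() and qtukey() from base R. This is a sketch that assumes equal group sizes (10 per group, as in this example); the variable names are illustrative.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("A", "B", "C"), each = 10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))
model <- aov(score ~ technique, data = df)

means  <- sapply(split(df$score, df$technique), mean)  # named vector of group means
n      <- 10                                           # students per group (equal sizes assumed)
k      <- length(means)                                # number of groups
df_res <- df.residual(model)                           # residual degrees of freedom
mse    <- deviance(model) / df_res                     # residual mean square

# Studentized range statistic for the B vs A comparison
q_BA <- abs(means[["B"]] - means[["A"]]) / sqrt(mse / n)

# Tukey-adjusted p-value: upper tail of the studentized range distribution
p_BA <- ptukey(q_BA, nmeans = k, df = df_res, lower.tail = FALSE)

# Half-width of the 95% family-wise confidence interval
half_width <- qtukey(0.95, nmeans = k, df = df_res) * sqrt(mse / n)

print(round(p_BA, 4))
print(round(half_width, 3))
```

The p-value should agree with the “p adj” entry for B-A, and the half-width with the distance from “diff” to “lwr” or “upr” in the TukeyHSD() output.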

2. Scheffe’s Method

For pairwise comparisons of group means, Scheffe’s method is generally the most conservative of these posthoc procedures and produces the widest confidence intervals.

To execute the Scheffe posthoc procedure in R, we can use the ScheffeTest() function from the DescTools package:

library(DescTools)
ScheffeTest(model)
Posthoc multiple comparisons of means: Scheffe Test
    95% family-wise confidence level
$technique
     diff    lwr.ci     upr.ci   pval   
B-A -14.8 -29.19395 -0.4060463 0.0429 * 
C-A -14.6 -28.99395 -0.2060463 0.0463 * 
C-B   0.2 -14.19395 14.5939537 0.9994   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the output, we can see that there are two p-values less than 0.05.

The difference in mean exam scores between students who used A and students who utilized B is statistically significant.

The difference in mean exam scores between students who used A and students who utilized C is statistically significant.
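If DescTools is not available, the Scheffe p-values for pairwise contrasts can be reproduced with base R. The sketch below uses the standard Scheffe statistic for a pairwise contrast (the squared mean difference divided by its estimated variance and by k - 1, compared against an F distribution) and assumes equal group sizes; scheffe_p is a hypothetical helper written for this illustration, not a function from any package.

```r
# Recreate the example data and model so this block runs on its own
df <- data.frame(technique = rep(c("A", "B", "C"), each = 10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))
model <- aov(score ~ technique, data = df)

means  <- sapply(split(df$score, df$technique), mean)  # named vector of group means
n      <- 10                                           # students per group (equal sizes assumed)
k      <- length(means)                                # number of groups
df_res <- df.residual(model)                           # residual degrees of freedom
mse    <- deviance(model) / df_res                     # residual mean square

# Hypothetical helper: Scheffe p-value for the difference between two groups
scheffe_p <- function(g1, g2) {
  contrast_var <- mse * (1 / n + 1 / n)                # variance of the mean difference
  f_stat <- (means[[g1]] - means[[g2]])^2 / contrast_var / (k - 1)
  pf(f_stat, df1 = k - 1, df2 = df_res, lower.tail = FALSE)
}

p_BA <- scheffe_p("B", "A")
p_CA <- scheffe_p("C", "A")
p_CB <- scheffe_p("C", "B")

print(round(c("B-A" = p_BA, "C-A" = p_CA, "C-B" = p_CB), 4))
```

The three values should agree with the pval column of the ScheffeTest() output above.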

3. The Bonferroni Method

When you want to make a set of planned pairwise comparisons, the Bonferroni procedure is a common choice.

To perform the Bonferroni posthoc procedure in R, use the following syntax.

Apply the Bonferroni posthoc test:

pairwise.t.test(df$score, df$technique, p.adjust.method = "bonferroni")
Pairwise comparisons using t tests with pooled SD

data:  df$score and df$technique

  A     B
B 0.039 -
C 0.042 1.000

P value adjustment method: bonferroni

From the output, we can see that the adjusted p-values below 0.05 are for the differences between A and B (0.039) and between A and C (0.042).
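The Bonferroni adjustment itself is simple: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The following sketch checks this against the built-in p.adjust() function; the names raw, raw_p, bonf_manual and bonf_builtin are illustrative.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("A", "B", "C"), each = 10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))

# Unadjusted pairwise t-tests with pooled SD
raw <- pairwise.t.test(df$score, df$technique, p.adjust.method = "none")$p.value

# Collect the three pairwise p-values into a named vector
raw_p <- c("B-A" = raw["B", "A"], "C-A" = raw["C", "A"], "C-B" = raw["C", "B"])

m <- length(raw_p)                         # number of comparisons (3 here)
bonf_manual  <- pmin(raw_p * m, 1)         # multiply by m, cap at 1
bonf_builtin <- p.adjust(raw_p, method = "bonferroni")

print(round(bonf_manual, 3))
print(isTRUE(all.equal(unname(bonf_manual), unname(bonf_builtin))))
```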

4. The Holm Approach

The Holm method is also used when you have a set of planned pairwise comparisons. It is at least as powerful as the Bonferroni method, so it is generally preferred.

To conduct the Holm posthoc approach in R, use the following syntax.

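Use the Holm method for posthoc analysis:

pairwise.t.test(df$score, df$technique, p.adjust.method = "holm")
Pairwise comparisons using t tests with pooled SD

data:  df$score and df$technique

  A     B
B 0.039 -
C 0.039 0.972

P value adjustment method: holm

From the output, we can see that the adjusted p-values below 0.05 are for the differences between A and B and between A and C. The Holm-adjusted p-value for A versus C (0.039) is slightly smaller than the Bonferroni-adjusted value (0.042), which reflects the extra power of the Holm method.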
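The Holm adjustment is a step-down procedure: the smallest unadjusted p-value is multiplied by m, the next by m - 1, and so on, and the results are then forced to be non-decreasing and capped at 1. The sketch below works through these steps and compares them with p.adjust(); the variable names are illustrative.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(technique = rep(c("A", "B", "C"), each = 10),
                 score = c(56, 106, 102, 108, 103, 102, 73, 94, 108, 99,
                           77, 72, 73, 73, 77, 74, 77, 90, 92, 98,
                           76, 58, 77, 87, 88, 80, 81, 85, 85, 88))

# Unadjusted pairwise t-tests with pooled SD
raw <- pairwise.t.test(df$score, df$technique, p.adjust.method = "none")$p.value
raw_p <- c("B-A" = raw["B", "A"], "C-A" = raw["C", "A"], "C-B" = raw["C", "B"])

m   <- length(raw_p)                       # number of comparisons (3 here)
ord <- order(raw_p)                        # positions of the p-values, smallest first

# Multipliers m, m - 1, ..., 1 applied to the sorted p-values
stepdown <- (m - seq_len(m) + 1) * raw_p[ord]

# Enforce a non-decreasing sequence and cap at 1
holm_sorted <- pmin(cummax(stepdown), 1)

# Put the adjusted values back in the original order
holm_manual <- holm_sorted[order(ord)]
names(holm_manual) <- names(raw_p)

holm_builtin <- p.adjust(raw_p, method = "holm")

print(round(holm_manual, 3))
print(isTRUE(all.equal(unname(holm_manual), unname(holm_builtin))))
```

Because the multipliers shrink after the first step, a Holm-adjusted p-value is never larger than the corresponding Bonferroni-adjusted one.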
