# Repeated Measures of ANOVA in R Complete Tutorial

Repeated Measures of ANOVA in R, in this tutorial we are going to discuss one-way and two-way repeated measures of ANOVA.

In this case, the same individuals are measured the same outcome variable under different time points or conditions.

This test is also known as a within-subjects ANOVA or ANOVA with repeated measures

Data Science Jobs

## Assumptions

1. No significant outlier

To identify the outlier, you can use the box plot method or the three sigma limit method.

2. Normality Assumption

The outcome of the dependent variable should be normally distributed in each cell of the design.

Based on the Shapiro Wilks test can use it for the same.

3. Assumption of Sphericity

These can be checked based on the anova_test function. The variance of the differences between groups should be equal.

Sphericity can be checked using Mauchly’s test of sphericity, which is automatically reported when using the R function anova_test()

The above assumptions are not met then you can use an alternative of Repeated Measures ANOVA in the nonparametric method (Friedman Test).

### One-way Repeated Measures of ANOVA in R

#### Prerequisites

``````library(xlsx)
library(rstatix)
library(reshape)
library(tidyverse)
library(dplyr)
library(ggpubr)
library(plyr)
library(datarium)``````

#### Getting Data

```data<-read.xlsx("D:/RStudio/data.xlsx",sheetName="Sheet1")
data <- data %>%
gather(key = "time", value = "score", T0, T1, T2) %>%
convert_as_factor(id, time)
```  id time       score
1 S1   T0 7.454333333
2 S2   T0 8.030000000
3 S3   T0 9.911666667```

#### Objective

The objective is to identify any significant difference between time points exit or not.

#### Summary statistics

```summary<-data %>%
group_by(time) %>%
get_summary_stats(score, type = "mean_sd")
data.frame(summary)```
```    time variable  n  mean sd
1   T0    score 24 7.853 3.082
2   T1    score 24 9.298 2.090
3   T2    score 24 5.586 0.396```

#### Visualization

Create a box plot and add points corresponding to individual values:

You can use a box plot also for outlier identification.

Based on boxplot no outlier detected in the dataset.

#### Outlier Detection

```outlier<-data %>%
group_by(time) %>%
identify_outliers(score)
data.frame(outlier)```
``` time       id         score      is.outlier is.extreme
<0 rows> (or 0-length row.names)```

These indicates no outlier in the data set.

How to clean data sets?

#### Normality Checking

You can use Shapiro-Wilk’s test for normality checking, Please note if the data set is small then Shapiro-Wilk’s test is ideal otherwise go for the QQ plot.

``````normality<-data %>%
group_by(time) %>%
shapiro_test(score)
data.frame(normality)``````
```    time variable    statistic      p
1   T0    score 0.9540460218 0.33073045010
2   T1    score 0.9868670126 0.98282491871
3   T2    score 0.9261641361 0.08004275234```

The normality assumption can be checked by computing the Shapiro-Wilk test for each time point. If the data is normally distributed, the p-value should be greater than 0.05.

Tested data was normally distributed at each time point, as assessed by Shapiro-Wilk’s test (p > 0.05).

If your sample size is greater than 50, the normal QQ plot is preferred because at larger sample sizes the Shapiro-Wilk test becomes very sensitive even to a minor deviation from normality.

`ggqqplot(data, "score", ggtheme = theme_bw()) +facet_grid(time ~ Treatment, labeller = "label_both")`

#### Sphericity Assumption

``````res<-anova_test(data=data,dv=score,wid=id,within=time)
get_anova_table(res) ``````
```ANOVA Table (type III tests)
Effect  DFn   DFd      F          p p<.05   ges
1   time 1.42 32.73 24.436 0.00000297     * 0.343```

The data set score was significantly different at the different time points during the diet, F(1.42, 33) = 24.4, p < 0.0001, eta2[g] = 0.34

How to do data reshaping in R?

Based on the anova_test function sphericity correction automatically applied to the output. The default is to apply automatically the Greenhouse-Geisser sphericity correction to only within-subject factors violating the sphericity assumption (i.e., Mauchly’s test p-value is significant, p <= 0.05).

#### Pair Wise comparison

```pair<-data %>%
pairwise_t_test( value~time,paired=TRUE, p.adjust.method = "bonferroni" )
data.frame(pair) ```
```  . .y. group1 group2 n1 n2     statistic df       p    p.adj     p.adj.signif
1 value     t1     t2 10 10  -4.967617803  9 0.000772000 0.002000           **
2 value     t1     t3 10 10 -13.228148424  9 0.000000334 0.000001         ****
3 value     t2     t3 10 10  -4.867815675  9 0.000886000 0.003000           **```

### Summary

Based on the repeated measures of ANOVA, all the tested pairwise differences between time points are statistically significant.

### Two-way Repeated Measures of ANOVA in R

The two-way repeated measures ANOVA can be performed in order to determine whether there is a significant interaction between treatment and time on the score.

#### Getting data

``````data<-read.xlsx("D:/RStudio/data.xlsx",sheetName="Sheet1") data <- data %>%
gather(key = "time", value = "score", T0, T1, T2) %>%
convert_as_factor(id, Treatment,time)

id Treatment time score
S1    A       T0    7.5
S2    A       T0    8.0
S3    A       T0    9.9```

#### Summary statistics

``````summary<-data %>%
group_by(Treatment,time) %>%
get_summary_stats(score, type = "mean_sd")
data.frame(summary)``````
```Treatment time variable n mean sd
A      T0     score 10 7.2 2.49
A      T1     score 10 8.7 2.00
A      T2     score 10 5.7 0.27
B      T0     score 10 7.8 3.01
B      T1     score 10 9.1 2.05
B      T2     score 10 5.7 0.50```

#### Outlier Detection

``````outlier<-data %>%
group_by(Treatment,time) %>%
identify_outliers(score)
data.frame(outlier)``````
```Treatment time id score is.outlier is.extreme
A          T1   S7 6.0   TRUE        FALSE
A          T1   S9 4.7   TRUE        TRUE```

Outlier got detected in the data frame for the subject S7 and S9. Need to remove or revisit the data points in the dataset for further analysis.

Data points are revisited and re analyzed.

tidyverse in R complete tutorial

Summary statistics need to reanalyze again

``````summary<-data %>%
group_by(Treatment,time) %>%
get_summary_stats(score, type = "mean_sd")
data.frame(summary)``````
```Treatment time variable n mean sd
A          T0   score 10 7.2 2.49
A          T1   score 10 8.7 1.06
A          T2   score 10 5.7 0.27
B          T0   score 10 7.8 3.01
B          T1   score 10 9.1 2.05
B          T2   score 10 5.7 0.50```

From the corrected data need check outlier detection again.

``````outlier<-data %>%
group_by(Treatment,time) %>%
identify_outliers(score)
data.frame(outlier)``````
``` Treatment time id score is.outlier is.extreme
<0 rows> (or 0-length row.names) ```

You can see now no outlier observed in the current dataset.

#### Normality Checking

``````normality<-data %>%
group_by(Treatment,time) %>%
shapiro_test(score)
data.frame(normality)``````
```Treatment time variable statistic p
A       T0   score   0.89  0.169
A       T1   score   0.92  0.347
A       T2   score   0.85  0.065
B       T0   score   0.96  0.739
B       T1   score   0.93  0.411
B       T2   score   0.81  0.020```

The data set score was normally distributed at each time point (p > 0.05), except for treatment B at time point T2, as assessed by Shapiro-Wilk’s test.

#### QQ Plot

Based on Shapiro-Wilk’s analysis at time point T2 non-normality observed, so further you need to check the QQ plot for the verification.

From the plot above, as all the points fall approximately along the reference line, we can assume normality.

#### Sphericity Assumption

``````res.aov <- anova_test(
data = data, dv = score, wid = id,
within = c(Treatment, time)
)
get_anova_table(res.aov)``````
```ANOVA Table (type III tests)
Effect        DFn   DFd  F     p     p<.05  ges
Treatment      1.0   9  0.25 6.3e-01       0.009
time           2.0  18 26.30 4.5e-06     * 0.354
Treatment:time 1.2  11  0.12 7.9e-01       0.004```

Observed statistically significant two-way interactions between treatment and time p < 0.005. If the interaction is not significant then simply execute pairwise t test comparison (Code mentioned in the end of the tutorial).

#### Effect of treatment at each time point

``````one.way <- data %>%
group_by(time) %>%
anova_test(dv = score, wid = id, within = Treatment) %>%
get_anova_table() %>%
one.way``````
```time Effect DFn DFd F p p..05 ges p.adj
T0   Treatment 1 9 0.169 0.69 0.01200 1
T1   Treatment 1 9 0.244 0.63 0.01600 1
T2   Treatment 1 9 0.003 0.96 0.00021 1```

#### Pairwise comparisons between treatment groups at each time point

``````pwc <- data %>%
group_by(time) %>%
pairwise_t_test(
score ~ Treatment, paired = TRUE,
)
data.frame(pwc)``````
```time .y. group1 group2 n1 n2 statistic df p  p.adj p.adj.signif
T0   score A     B    10  10 -0.411    9 0.69 0.69     ns
T1   score A     B    10  10 -0.494    9 0.63 0.63     ns
T2   score A     B    10  10 -0.051    9 0.96 0.96     ns```

No significant difference was observed between samples for any of the tested time points.

Rank order analysis in R

#### Effect of time at each level of treatment

``````one.way2 <- data %>%
group_by(Treatment) %>%
anova_test(dv = score, wid = id, within = time) %>%
get_anova_table() %>%
data.frame(one.way2)``````
```Treatment Effect DFn  DFd F p p..05  ges p.adj
A       time   1.3 11 11.0 0.005 * 0.40 0.01
B       time   1.3 11 8.5  0.010 * 0.33 0.02```

#### Pairwise comparisons between time points

``````pwc2 <- data %>%
group_by(Treatment) %>%
pairwise_t_test(
score ~ time, paired = TRUE,
)
data.frame(pwc2)``````
```Treatment .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
A     score  T0    T1    10  10 -2.2     9  5.2e-02 1.6e-01 ns
A     score  T0    T2    10  10 1.9      9  9.0e-02 2.7e-01 ns
A     score  T1    T2    10  10 8.2      9  1.8e-05 5.5e-05 ****
B     score  T0    T1    10  10 -2.4     9  3.7e-02 1.1e-01 ns
B     score  T0    T2    10  10 2.0      9  8.1e-02 2.4e-01 ns
B     score  T1    T2    10  10 4.3      9  2.0e-03 6.0e-03 **```

A significant difference was observed between time points T1 and T2 for treatments A & B (p<0.05).

If the interaction effect from ANOVA is not significant then you can simply execute a pairwise t-test based on the below command.

#### Comparisons for treatment variable

``````data %>%
pairwise_t_test(
score ~ Treatment, paired = TRUE,
)``````

#### Comparisons for the time variable

``````data %>%
pairwise_t_test(
score ~ time, paired = TRUE,
)``````

## Conclusion

In this tutorial, we discussed repeated measures of ANOVA in R with codes. In the same way, you can execute Three-way repeated measures of ANOVA.

### 2 Responses

1. Anonymous says:

This is very relevant topic with appropriate example dataset. However, in addition to posting as it is it better to add ppt or pdf

2. Anonymous says:

Please make the example dataset avaliable otherwise we cannot replicate your script.