# How to Perform a Lack of Fit Test in R: Quick Guide

A lack of fit test, as used in this guide, determines whether a full regression model fits a dataset significantly better than a reduced version of the same model. For linear models this comparison is a partial F-test on nested models.

Consider the following regression model, which has four predictor variables.

`Y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + ε`

The following model contains only two of the original predictor variables, so it is nested within the full model.

`Y = β₀ + β₁x₁ + β₂x₂ + ε`

We can use a Lack of Fit Test with the following null and alternative hypotheses to see if these two models differ significantly.

## Hypotheses

H0: The full model and the nested model fit the data equally well, so the simpler nested model should be used.

H1: The full model fits the data significantly better than the nested model, so the full model should be used.

### Step 1: Create a Dataset

We will use the built-in mtcars dataset. Let’s load it and look at the first few rows.

```
data(mtcars)
head(mtcars)
```

```
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
```

### Step 2: Fit Two Different Models to the Dataset

Next, we fit two regression models to the dataset.

First, fit the full model with four predictors:

`fullmodel <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)`

Then fit the reduced model, which keeps only two of those predictors:

`reducedmodel <- lm(mpg ~ cyl + disp, data = mtcars)`
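Before comparing the two fits, it can help to confirm that the reduced model really is nested in the full model and that both were fitted to the same observations. The following is a minimal sketch of such a check; the helper variable names (`full_terms`, `reduced_terms`, `is_nested`, `same_n`) are illustrative and not part of the original walkthrough.

```r
data(mtcars)

fullmodel    <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)
reducedmodel <- lm(mpg ~ cyl + disp, data = mtcars)

# Predictor names used by each model (illustrative helper variables)
full_terms    <- attr(terms(fullmodel), "term.labels")
reduced_terms <- attr(terms(reducedmodel), "term.labels")

# The reduced model is nested if every one of its terms also appears in the full model
is_nested <- all(reduced_terms %in% full_terms)

# Both models should be fitted to the same number of observations
same_n <- nobs(fullmodel) == nobs(reducedmodel)

is_nested
same_n
```

If either value is `FALSE`, the F-test comparison in the next step would not be meaningful.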

### Step 3: Perform a Lack of Fit Test

We then use the anova() function to perform a lack of fit test comparing the two models.


`anova(fullmodel, reducedmodel)`

```
Analysis of Variance Table

Model 1: mpg ~ cyl + disp + hp + wt
Model 2: mpg ~ cyl + disp
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1     27 170.44
2     29 270.74 -2    -100.3 7.9439 0.001936 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

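The F test statistic is 7.9439, with a corresponding p-value of 0.001936. The negative Df and Sum of Sq values only reflect that the full model was listed first in the anova() call; the F statistic and p-value are the same in either order.

Because this p-value is less than 0.05, we reject the null hypothesis and conclude that the full model provides a statistically significantly better fit than the reduced model.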
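To see where the F statistic comes from, it can be reproduced by hand from the residual sums of squares (RSS) and residual degrees of freedom of the two models, using `F = ((RSS_reduced - RSS_full) / (df_reduced - df_full)) / (RSS_full / df_full)`. The sketch below is one way to do this with base R; the variable names (`rss_full`, `f_stat`, and so on) are illustrative and not part of the original walkthrough.

```r
data(mtcars)

fullmodel    <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)
reducedmodel <- lm(mpg ~ cyl + disp, data = mtcars)

# For lm objects, deviance() returns the residual sum of squares
rss_full    <- deviance(fullmodel)
rss_reduced <- deviance(reducedmodel)

# Residual degrees of freedom of each model
df_full    <- df.residual(fullmodel)
df_reduced <- df.residual(reducedmodel)

# Extra RSS explained per added predictor, relative to the full model's error variance
f_stat <- ((rss_reduced - rss_full) / (df_reduced - df_full)) / (rss_full / df_full)

# Upper-tail probability of the F distribution gives the p-value
p_value <- pf(f_stat, df_reduced - df_full, df_full, lower.tail = FALSE)

round(f_stat, 4)
signif(p_value, 4)
```

The two printed values should agree with the F and Pr(>F) columns of the anova() table above.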
