Likelihood Ratio Test in R with Example

The likelihood ratio test compares the goodness of fit of two nested regression models based on the ratio of their likelihoods: one obtained by maximizing over the entire parameter space, and another obtained after imposing some constraint on the parameters.

A nested model is one that contains only a subset of the predictor variables in the full regression model.

For instance, consider the following regression model with four predictor variables.

Y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + ε

The following model, with only two of the original predictor variables, is an example of a nested model.

Y = β0 + β1x1 + β2x2 + ε

To see if these two models differ significantly, we can use a likelihood ratio test with the following null and alternative hypotheses.

Hypothesis

H0: The full and nested models fit the data equally well, so the simpler nested model should be used.

H1: The full model fits the data significantly better than the nested model, so the full model should be used.

If the p-value of the test is below the chosen significance level (e.g., 0.05), we reject the null hypothesis and conclude that the full model provides a significantly better fit. Otherwise, we fail to reject the null hypothesis and keep the nested model.
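For reference, the statistic behind this test can be written out as below. The symbols are notation introduced here for illustration and do not come from the original text: \ell_{\text{full}} and \ell_{\text{nested}} are the maximized log-likelihoods of the two models, and k is the number of extra parameters in the full model.

```latex
\lambda_{LR} = 2\left(\ell_{\text{full}} - \ell_{\text{nested}}\right),
\qquad
\lambda_{LR} \;\overset{\text{approx.}}{\sim}\; \chi^2_{k} \quad \text{under } H_0
```

The p-value is the upper-tail probability of the \chi^2_k distribution at the observed value of \lambda_{LR}. The chi-squared reference distribution is a large-sample approximation.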

The following example demonstrates how to run a likelihood ratio test in R.

Example: Likelihood Ratio Test in R

The code in this example fits the following two regression models in R using the built-in mtcars dataset.

Full model: mpg = β0 + β1·cyl + β2·disp + β3·hp + β4·wt + ε

Reduced model: mpg = β0 + β1·cyl + β2·disp + ε

To perform a likelihood ratio test on these two models, we will use the lrtest() function from the lmtest package.

Let’s load the package first (if it is not installed yet, run install.packages("lmtest") once beforehand):

library(lmtest)

Now we can fit the full model:

fullmodel <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)

Next, we fit the reduced model, which drops hp and wt from the full model:

reducedmodel <- lm(mpg ~ cyl + disp, data = mtcars)
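Because the test is only meaningful for nested models, it can be worth confirming the nesting before comparing them. The sketch below is one rough way to do that in base R. The helper is_nested() is a hypothetical name introduced here, not part of lmtest, and it only compares the response, the number of observations, and the term labels, which is enough for simple additive formulas like the ones in this example.

```r
# Refit the two models from the text so this block runs on its own.
fullmodel    <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)
reducedmodel <- lm(mpg ~ cyl + disp, data = mtcars)

# Hypothetical helper (not from lmtest): TRUE when `small` has the same
# response and number of observations as `big`, and every term of
# `small` also appears in `big`.
is_nested <- function(small, big) {
  same_response <- identical(formula(small)[[2]], formula(big)[[2]])
  same_rows     <- nobs(small) == nobs(big)
  small_terms   <- attr(terms(small), "term.labels")
  big_terms     <- attr(terms(big), "term.labels")
  same_response && same_rows && all(small_terms %in% big_terms)
}

is_nested(reducedmodel, fullmodel)  # reduced inside full
is_nested(fullmodel, reducedmodel)  # the reverse direction
```

This check is deliberately simple: it does not handle transformed predictors or models fit on different subsets of rows with the same size.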

We now have both models and are ready to run the likelihood ratio test to see if there are any differences between them.

lrtest(fullmodel, reducedmodel)
Likelihood ratio test
Model 1: mpg ~ cyl + disp + hp + wt
Model 2: mpg ~ cyl + disp
  #Df  LogLik Df  Chisq Pr(>Chisq)   
1   6 -72.169                        
2   4 -79.573 -2 14.808  0.0006088 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The Chi-Squared test-statistic is 14.808 and the corresponding p-value is 0.0006088, as shown in the output.

We will reject the null hypothesis because the p-value is less than 0.05.

This indicates that the full model and the nested model do not fit the data equally well. As a result, we should use the full model.
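To see where these numbers come from, the statistic can also be reproduced by hand from the two log-likelihoods using only base R. This is a minimal sketch, assuming the usual chi-squared approximation; the variable names (ll_full, ll_reduced, lr_stat, df_diff, p_value) are chosen here for illustration.

```r
# Refit the two models from the text so this block runs on its own.
fullmodel    <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)
reducedmodel <- lm(mpg ~ cyl + disp, data = mtcars)

# Maximized log-likelihoods; each carries a "df" attribute with the
# number of estimated parameters (coefficients plus error variance).
ll_full    <- logLik(fullmodel)
ll_reduced <- logLik(reducedmodel)

# Test statistic: twice the difference in log-likelihoods.
lr_stat <- 2 * (as.numeric(ll_full) - as.numeric(ll_reduced))

# Degrees of freedom: number of extra parameters in the full model.
df_diff <- attr(ll_full, "df") - attr(ll_reduced, "df")

# Upper-tail chi-squared probability gives the p-value.
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

c(statistic = lr_stat, df = df_diff, p.value = p_value)
```

The results should agree with the Chisq and Pr(>Chisq) columns of the lrtest() output above, up to rounding.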

That covers the purpose of the likelihood ratio test. Let’s work through one more example.

Fit a full model with two predictors:

full <- lm(mpg ~ disp + carb, data = mtcars)

Then fit a reduced model with only one predictor:

reduced <- lm(mpg ~ disp, data = mtcars)

Now run the likelihood ratio test on these two models:

lrtest(full, reduced)
Likelihood ratio test
Model 1: mpg ~ disp + carb
Model 2: mpg ~ disp
  #Df  LogLik Df  Chisq Pr(>Chisq)  
1   4 -78.603                       
2   3 -82.105 -1 7.0034   0.008136 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We can see from the output that the likelihood ratio test has a p-value of 0.008136. Because this is less than 0.05, we reject the null hypothesis.

As a result, we can conclude that the model with two predictors outperforms the model with only one predictor in terms of fit.
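When the test is part of a script rather than an interactive session, the decision can be made in code instead of by reading the printed table. The sketch below assumes the lmtest package is installed and relies on the column names shown in the printed output ("Chisq" and "Pr(>Chisq)"). The names res, chisq, p_value, and alpha are chosen here for illustration.

```r
library(lmtest)  # assumed to be installed

# Refit the two models from the second example.
full    <- lm(mpg ~ disp + carb, data = mtcars)
reduced <- lm(mpg ~ disp, data = mtcars)

# lrtest() returns a data-frame-like table with one row per model.
res <- lrtest(full, reduced)

# Row 2 holds the comparison of the second model against the first.
chisq   <- res[2, "Chisq"]
p_value <- res[2, "Pr(>Chisq)"]

alpha <- 0.05  # significance level used in the text
if (p_value < alpha) {
  message("Reject H0: prefer the full model")
} else {
  message("Fail to reject H0: the reduced model is adequate")
}
```

Row 1 of the table contains NA in these columns because the first model has nothing to be compared against.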
