Goodness of Fit Test: Jarque-Bera Test in R

by finnstats

The Jarque-Bera test is a goodness-of-fit test that checks whether sample data have skewness and kurtosis matching those of a normal distribution.

The Jarque-Bera test statistic is always non-negative. A value far from zero indicates that the sample data do not follow a normal distribution.

Goodness of Fit Test

The Jarque-Bera test statistic is defined as:

JB = [(n - k + 1) / 6] * [S² + 0.25 * (C - 3)²]

where n denotes the number of observations in the sample, k denotes the number of regressors (k = 1 when the test is not used in a regression context, so the leading factor reduces to n / 6), S denotes the sample skewness, and C denotes the sample kurtosis.

Under the null hypothesis of normality, JB asymptotically follows a chi-squared distribution with 2 degrees of freedom: JB ~ χ²(2).
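To make the formula above concrete, here is a minimal base-R sketch that computes the statistic by hand. The helper name jb_statistic is hypothetical (it is not part of tseries), and the sketch assumes moment estimators that divide by n rather than n - 1.

```r
# Hand-computed Jarque-Bera statistic, following the formula above.
# jb_statistic() is a hypothetical helper for illustration only.
jb_statistic <- function(x, k = 1) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)             # second central moment (divisor n, an assumption)
  S  <- mean(d^3) / m2^1.5    # sample skewness
  C  <- mean(d^4) / m2^2      # sample kurtosis (3 for a normal distribution)
  jb <- ((n - k + 1) / 6) * (S^2 + 0.25 * (C - 3)^2)
  p  <- pchisq(jb, df = 2, lower.tail = FALSE)  # upper tail of chi-squared(2)
  list(statistic = jb, p.value = p, skewness = S, kurtosis = C)
}

res <- jb_statistic(c(1, 2, 3, 4, 5))
print(res$statistic)
```

With the default k = 1 the leading factor is n / 6, so on the same data the result should be comparable to the statistic reported by jarque.bera.test.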

This tutorial describes how to execute a Jarque-Bera test in R.

Jarque-Bera test in R

First, load the tseries package, which provides the jarque.bera.test function.

library("tseries")

Let’s generate some random data and make use of the set.seed function for reproducibility.

Case Study 1:

set.seed(123)

data <- rnorm(100)

The rnorm function draws 100 values from a standard normal distribution, so we expect the test result not to be significant. Let’s verify this with the Jarque-Bera test.

Analyze data based on Jarque-Bera test

jarque.bera.test(data)

This call produces the following output.

Jarque Bera Test
data:  data
X-squared = 0.16908, df = 2, p-value = 0.9189

This indicates that the test statistic is 0.16908, with a p-value of 0.9189. Because the p-value is well above 0.05, we fail to reject the null hypothesis that the data are normally distributed.
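The reported p-value can be reproduced from the statistic alone, since it is the upper tail of a chi-squared distribution with 2 degrees of freedom. The short sketch below does this in base R; the 0.05 significance level is an assumption chosen for illustration, not something the test fixes.

```r
# Recover the p-value from the reported Jarque-Bera statistic.
jb_stat <- 0.16908                                    # statistic from the output above
p_value <- pchisq(jb_stat, df = 2, lower.tail = FALSE)
alpha   <- 0.05                                       # assumed significance level

print(round(p_value, 4))   # 0.9189
reject <- p_value < alpha
print(reject)              # FALSE: normality is not rejected
```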

Case Study 2:

Now use the runif function, which draws from a uniform distribution, instead of rnorm, and run the Jarque-Bera test again.

set.seed(123)
data <- runif(100)

Execute Jarque-Bera test in R

jarque.bera.test(data)

It produces the following output.

Jarque Bera Test
data:  data
X-squared = 6.1759, df = 2, p-value = 0.0456

This indicates that the test statistic is 6.1759, with a p-value of 0.0456. Because the p-value is below 0.05, we reject the null hypothesis that the data are normally distributed at the 5% significance level.

We have enough evidence to conclude that the data in this scenario is not normally distributed.
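One way to see why the uniform sample is flagged is to look at its kurtosis. A uniform distribution has a theoretical kurtosis of 1.8, well below the value of 3 for a normal distribution, and the test statistic grows with the squared distance from 3. The sketch below compares the two samples in base R; sample_kurtosis is a hypothetical helper and uses moments that divide by n, which is an assumption.

```r
# Compare the sample kurtosis of the normal and uniform samples.
# sample_kurtosis() is a hypothetical helper for illustration only.
sample_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2    # fourth central moment over squared variance
}

set.seed(123)
normal_data <- rnorm(100)    # same draw as Case Study 1

set.seed(123)
uniform_data <- runif(100)   # same draw as Case Study 2

print(sample_kurtosis(normal_data))    # expected to be near 3
print(sample_kurtosis(uniform_data))   # expected to be near 1.8
```

The skewness of both samples should be close to zero, so in this example the departure from normality comes mainly from the kurtosis term.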
