Goodness of Fit Test: Jarque-Bera Test in R
The Jarque-Bera test is a goodness-of-fit test that checks whether the skewness and kurtosis of sample data match those of a normal distribution.
The Jarque-Bera test statistic is never negative. A value far from zero indicates that the sample data do not follow a normal distribution.
Goodness of Fit Test
The Jarque-Bera test statistic is defined as:
$$JB = \frac{n - k + 1}{6}\left[S^2 + \frac{(C - 3)^2}{4}\right]$$
where n denotes the number of observations in the sample, k denotes the number of regressors (k = 1 if the test is not applied in a regression context, so the leading factor reduces to n/6), S denotes the sample skewness, and C denotes the sample kurtosis.
Under the null hypothesis of normality, JB asymptotically follows a chi-squared distribution with 2 degrees of freedom: $JB \sim \chi^2(2)$.
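To make the formula concrete, here is a minimal base-R sketch that computes the statistic by hand. The helper name jb_by_hand is made up for this illustration and is not part of any package. It assumes the moment-based estimators of skewness and kurtosis (dividing by n), and it returns NaN for constant data.

```r
# Illustrative helper (hypothetical name): Jarque-Bera statistic from the formula above
jb_by_hand <- function(x, k = 1) {
  n  <- length(x)
  d  <- x - mean(x)              # deviations from the sample mean
  m2 <- mean(d^2)                # second central moment
  S  <- mean(d^3) / m2^1.5       # sample skewness
  C  <- mean(d^4) / m2^2         # sample kurtosis (3 for a normal distribution)
  JB <- ((n - k + 1) / 6) * (S^2 + 0.25 * (C - 3)^2)
  # Upper-tail probability under the chi-squared distribution with 2 df
  p  <- pchisq(JB, df = 2, lower.tail = FALSE)
  list(statistic = JB, p.value = p, skewness = S, kurtosis = C)
}

set.seed(123)
res <- jb_by_hand(rnorm(100))
res$statistic
res$p.value
```

With the default k = 1 the leading factor is simply n/6, which is the form usually quoted outside a regression setting.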
This tutorial describes how to execute a Jarque-Bera test in R.
Jarque-Bera test in R
First, load the tseries package in R (install it beforehand with install.packages("tseries") if necessary).
library("tseries")
Let’s generate some random data and make use of the set.seed function for reproducibility.
Case Study 1:-
set.seed(123)
data <- rnorm(100)
The rnorm function draws 100 values from a standard normal distribution, so we expect the test result to be non-significant. Let's verify this with the Jarque-Bera test.
Analyze data based on Jarque-Bera test
jarque.bera.test(data)
The above function generates the following outputs.
Jarque Bera Test
data: data
X-squared = 0.16908, df = 2, p-value = 0.9189
The test statistic is 0.16908, with a p-value of 0.9189. Because the p-value is far above the conventional 0.05 significance level, we fail to reject the null hypothesis that the data are normally distributed.
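As a quick sanity check, the reported p-value can be recovered from the reported statistic alone, since it is the upper-tail probability of a chi-squared distribution with 2 degrees of freedom. The sketch below uses only base R; the variable names are illustrative.

```r
# Statistic reported by jarque.bera.test() for the rnorm sample
jb_stat <- 0.16908

# Upper-tail chi-squared probability with df = 2
p_value <- pchisq(jb_stat, df = 2, lower.tail = FALSE)
round(p_value, 4)   # 0.9189
```

For 2 degrees of freedom this tail probability has the closed form exp(-JB / 2), so exp(-0.16908 / 2) gives the same value.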
Case Study 2:-
Now, instead of rnorm, use the runif function, which draws values from a uniform distribution, and run the Jarque-Bera test again.
set.seed(123)
data <- runif(100)
Execute Jarque-Bera test in R
jarque.bera.test(data)
It generates the following outputs.
Jarque Bera Test
data: data
X-squared = 6.1759, df = 2, p-value = 0.0456
The test statistic is 6.1759, with a p-value of 0.0456. Because the p-value is below 0.05, we reject the null hypothesis of normality at the 5% significance level.
In other words, there is enough evidence to conclude that this sample is not normally distributed, although the p-value is close to the threshold.
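The two decisions can also be framed in terms of a critical value rather than a p-value. The sketch below assumes a 5% significance level (alpha is an assumption of this example) and compares the two reported statistics with the corresponding chi-squared quantile, using base R only.

```r
alpha <- 0.05                               # assumed significance level
critical <- qchisq(1 - alpha, df = 2)       # chi-squared critical value, df = 2

# Statistics reported in the two case studies above
jb_stats <- c(rnorm_sample = 0.16908, runif_sample = 6.1759)

# Reject normality when the statistic exceeds the critical value
reject <- jb_stats > critical
round(critical, 3)   # 5.991
reject
```

The rnorm sample falls well below the critical value, while the runif sample only just exceeds it. With a stricter alpha of 0.01 the critical value rises to about 9.21, and the uniform sample would no longer be rejected.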