Normality Test in R:-In statistics methods is classified into two like Parametric methods and Nonparametric methods.
The majority of the test like correlation, regression, t-test, and analysis of variance (ANOVA) assume some certain characteristics about the data. They require the data to follow a normal distribution.
The test data follow normal distribution then parametric tests are appropriate, because their validity depends on the distribution of the data.
Before doing a parametric test need to check basic assumptions, If any of the assumptions do not hold true nonparametric methods come to the rescue.
Shapiro-Wilk’s method is extensively advocated for normality check and it gives higher power than K-S. It is primarily based totally on the correlation between the information and the corresponding regular scores.
Load data into R
type year period service incidents
1 A 60 60 127 0
2 A 60 75 63 0
3 A 65 60 1095 3
4 A 65 75 1095 4
5 A 70 60 1512 6
6 A 70 75 3353 18
Subset data based on year to check data is normally distributed or not.
Normality Test in R
Before doing Shapiro-Wilk test, check graphical representation of the data.
It is clearly some of the outliers affected the distribution.
Basis plot information normality assumption failed and nonparametric methods are more appropriate.
In other cases remove the outlier and test again.
Shapiro-Wilk normality test in R
W = 0.52927, p-value =8.03e-06
From the output, the p-value < 0.05 implying that the distribution of the information is drastically special from regular distribution. In different words, we will not count on the normality.