# Normality Test in R

Normality Test in R:-In statistics methods is classified into two like Parametric methods and Nonparametric methods.

The majority of the test like correlation, regression, t-test, and analysis of variance (ANOVA) assume some certain characteristics about the data. They require the data to follow a normal distribution.

The test data follow normal distribution then parametric tests are appropriate, because their validity depends on the distribution of the data.

Before doing a parametric test need to check basic assumptions, If any of the assumptions do not hold true nonparametric methods come to the rescue.

Shapiro-Wilk’s method is extensively advocated for normality check and it gives higher power than K-S. It is primarily based totally on the correlation between the information and the corresponding regular scores.

Story of Null Hypothesis

```library(mass)

type year period service incidents

1    A   60     60     127         0

2    A   60     75      63         0

3    A   65     60    1095         3

4    A   65     75    1095         4

5    A   70     60    1512         6

6    A   70     75    3353        18

```data<-droplevels(subset(ships,ships\$year=="60"))
```

Subset data based on year to check data is normally distributed or not.

Normality Test in R

Before doing Shapiro-Wilk test, check graphical representation of the data.

```library(ggpubr)
ggqqplot(data\$service)```

It is clearly some of the outliers affected the distribution.

Basis plot information normality assumption failed and nonparametric methods are more appropriate.

In other cases remove the outlier and test again. ```shapiro.test(data\$service)
```

Shapiro-Wilk normality test in R

data:  data\$service

W = 0.52927, p-value =8.03e-06

## Conclusion

From the output, the p-value < 0.05 implying that the distribution of the information is drastically special from regular distribution. In different words, we will not count on the normality. 