# Chi-Square Goodness of Fit Formula in R

A Chi-Square Goodness of Fit Test is used to determine whether a categorical variable follows a hypothesized distribution.

This lesson will show you how to use R to run a Chi-Square Goodness of Fit Test.

## Chi-square goodness of fit formula in R

A vendor claims that an equal number of customers visit their shop each weekday. To test this claim, a corporate executive records the number of customers who visit the shop in a given week and finds the following.

Monday: 250 customers, Tuesday: 230 customers, Wednesday: 265 customers, Thursday: 235 customers, and Friday: 223 customers

To evaluate whether the data are consistent with the vendor's claim, run the Chi-Square Goodness of Fit Test in R using the steps below.

### Gather information.

First, we'll create two vectors: one for the observed frequencies and one for the expected proportion of customers on each day.

    observedfreq <- c(250, 230, 265, 235, 223)
    expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

The expected proportions must sum to 1.
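A quick sanity check before running the test is sketched below. It confirms that the proportions sum to 1 and converts them into expected counts. The name `expectedcount` is introduced here for illustration only and does not appear elsewhere in this lesson.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

# all.equal() tolerates floating-point rounding when comparing the sum to 1
stopifnot(isTRUE(all.equal(sum(expectedprop), 1)))

# Expected counts under the null hypothesis:
# total number of customers times each day's expected proportion
expectedcount <- sum(observedfreq) * expectedprop
print(expectedcount)
```

By default, `chisq.test()` stops with an error if `p` does not sum to 1. Passing `rescale.p = TRUE` makes it rescale the vector instead.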

### Perform the Chi-Square Goodness of Fit Test.

The null and alternative hypotheses for a Chi-Square Goodness of Fit Test are as follows.

H0: A variable follows a hypothesized distribution.

H1: A variable does not follow a hypothesized distribution.
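For reference, the standard test statistic behind these hypotheses can be written as follows. The symbols are notation chosen for this sketch rather than taken from the lesson: $O_i$ is the observed count in category $i$, $p_i$ is its hypothesized proportion, $N$ is the total count, and $n$ is the number of categories.

$$
\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = N \, p_i
$$

Large values of $\chi^2$ indicate that the observed counts sit far from the expected counts, which counts as evidence against H0.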

The Chi-Square Goodness of Fit Test can then be performed using the chisq.test() function, which has the following syntax.

    chisq.test(x, p)

where:

x: a numeric vector of observed frequencies.

p: a numeric vector of expected proportions, the same length as x.

In our example, the following code demonstrates how to utilize this function.

    # conduct a Chi-Square Goodness of Fit Test
    chisq.test(x = observedfreq, p = expectedprop)

    	Chi-squared test for given probabilities

    data:  observedfreq
    X-squared = 4.7265, df = 4, p-value = 0.3165
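If the numbers are needed later in a script, one option is to store the result and read its components rather than copying them from the printed output. The sketch below assumes the same data as above; `result` is a name chosen here for illustration.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

# chisq.test() returns a list of class "htest"
result <- chisq.test(x = observedfreq, p = expectedprop)

result$statistic  # Chi-Square test statistic (X-squared)
result$parameter  # degrees of freedom
result$p.value    # p-value of the test
result$expected   # expected counts under the null hypothesis
```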

The Chi-Square test statistic is 4.7265, and the corresponding p-value is 0.3165.
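To see where the statistic comes from, it can be reproduced by hand. The following is a minimal sketch that sums the squared differences between observed and expected counts, each divided by the expected count. `expectedcount` and `chisq_stat` are illustrative names.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

# Expected count for each day under the null hypothesis
expectedcount <- sum(observedfreq) * expectedprop

# Sum over categories of (observed - expected)^2 / expected
chisq_stat <- sum((observedfreq - expectedcount)^2 / expectedcount)
print(round(chisq_stat, 4))
```

Rounded to four decimals, this should agree with the `X-squared` value reported by `chisq.test()`.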

The p-value is the probability of observing a test statistic at least this large under a Chi-Square distribution with n-1 degrees of freedom, where n is the number of categories. In this example, degrees of freedom = 5-1 = 4.

A Chi-Square to P-Value Calculator can be used to confirm that the p-value for X2 = 4.7265 with degrees of freedom = 4 is 0.3165.
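The same lookup can also be done without leaving R. As a sketch, `pchisq()` with `lower.tail = FALSE` returns the upper-tail probability of the Chi-Square distribution. The variable name `pvalue` is illustrative.

```r
# Upper-tail probability of a Chi-Square distribution with 4 degrees of freedom
pvalue <- pchisq(4.7265, df = 4, lower.tail = FALSE)
print(round(pvalue, 4))
```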

## Conclusion

We cannot reject the null hypothesis since the p-value (0.3165) is not less than 0.05.

This means we don't have enough evidence to conclude that the true distribution of customers across the weekdays differs from the distribution the vendor claimed.
