Fisher’s exact test in R
Fisher’s exact test is a statistical test used to analyze the relationship between two categorical variables.
It is typically preferred over the chi-square test when the sample size is small, in particular when one or more expected counts of the contingency table are less than 5.
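As a quick way to check the "expected counts less than 5" condition, the sketch below computes the expected counts under independence for a small table. The table values and the names ‘tab’ and ‘expected’ are invented for illustration and are not part of the original example data.

```r
# Hypothetical 2x2 table of counts (values invented for illustration)
tab <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(Group = c("A", "B"), Outcome = c("Yes", "No")))

# Expected count for each cell under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
print(expected)

# TRUE if any expected count is below 5, the usual cue to use Fisher's exact test
small_counts <- any(expected < 5)
print(small_counts)
```

Here every expected count is 2, so the chi-square approximation would be questionable and Fisher’s exact test is the safer choice.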
In this article, we will demonstrate how to conduct Fisher’s exact test in R with examples.
Formulation of the Hypothesis:
Before conducting Fisher’s exact test, it is necessary to formulate the null and alternative hypotheses.
The null hypothesis (H0) states that there is no association between the two categorical variables. It is usually written as:
H0: The two variables are independent.
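The alternative hypothesis (H1) states that there is an association between the two categorical variables.
For a 2x2 table it can be either one-tailed (the odds ratio is greater than 1, or less than 1) or two-tailed (the odds ratio differs from 1). The two-tailed form is usually written as: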
H1: The two variables are dependent.
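To make the link between these hypotheses and R concrete, the sketch below runs ‘fisher.test’ on one 2x2 table with each of the three ‘alternative’ settings, and checks the one-tailed p-value against the hypergeometric distribution on which the test is based. The table values and variable names are invented for illustration.

```r
# Hypothetical 2x2 table of counts (values invented for illustration)
tab <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(Group = c("A", "B"), Outcome = c("Yes", "No")))

# H1: odds ratio != 1 (two-tailed)
p_two_sided <- fisher.test(tab, alternative = "two.sided")$p.value
# H1: odds ratio > 1 (one-tailed)
p_greater <- fisher.test(tab, alternative = "greater")$p.value
# H1: odds ratio < 1 (one-tailed)
p_less <- fisher.test(tab, alternative = "less")$p.value

# With the margins fixed, the top-left count follows a hypergeometric
# distribution under H0, so the "greater" p-value is its upper tail:
# P(X >= observed top-left count)
a <- tab[1, 1]
p_manual <- phyper(a - 1, sum(tab[, 1]), sum(tab[, 2]), sum(tab[1, ]),
                   lower.tail = FALSE)

print(c(two.sided = p_two_sided, greater = p_greater, less = p_less,
        manual.greater = p_manual))
```

The one-tailed "greater" p-value and the manual hypergeometric tail agree (both equal 17/70 for this table), which is what makes the test "exact" rather than an approximation.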
In the following sections, we will provide examples of how to conduct Fisher’s exact test in R.
Example 1: Two-Tailed Fisher’s Exact Test
In this example, we will use a dataset that contains information about the type of car owned and whether or not the owner has a valid driver’s license.
We want to test the hypothesis that the type of car and having a valid driver’s license are independent.
First, we need to load the dataset:
car_data <- read.csv("car_data.csv")
Next, we can create a contingency table to visualize the relationship between the two categorical variables:
cont_table <- table(car_data$Car_Type, car_data$License)
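Then, we can conduct the two-tailed Fisher’s exact test using the ‘fisher.test’ function. Note that the ‘alternative’ argument is only used for 2x2 tables, so this assumes the car type variable has two levels: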
fisher_test <- fisher.test(cont_table, alternative = "two.sided")
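Finally, we can print the result to see the test output. (Calling ‘summary’ on the result would only list the components of the returned object, not the test results.)
fisher_test
The output will display the p-value, the alternative hypothesis and, for a 2x2 table, the estimated odds ratio with its confidence interval. Fisher’s exact test does not report a test statistic, and the conclusion is left to the reader. Individual values can be extracted with fisher_test$p.value, fisher_test$estimate, and fisher_test$conf.int.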
If the p-value is greater than 0.05, we fail to reject the null hypothesis and conclude that the data do not provide evidence of an association between the type of car and having a valid driver’s license.
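Because ‘car_data.csv’ is not supplied with this article, the following self-contained sketch builds a small stand-in data frame and runs the same steps. All values, including the car types "Sedan" and "SUV", are invented for illustration and do not come from a real dataset.

```r
# Hypothetical stand-in for car_data.csv (all values invented)
car_data <- data.frame(
  Car_Type = c(rep("Sedan", 10), rep("SUV", 10)),
  License  = c(rep("Yes", 8), rep("No", 2), rep("Yes", 6), rep("No", 4))
)

# 2x2 contingency table of car type by license status
cont_table <- table(car_data$Car_Type, car_data$License)
print(cont_table)

# Two-tailed Fisher's exact test
fisher_test <- fisher.test(cont_table, alternative = "two.sided")
print(fisher_test)

# Individual components of the result
fisher_test$p.value    # p-value
fisher_test$estimate   # conditional maximum likelihood estimate of the odds ratio
fisher_test$conf.int   # confidence interval for the odds ratio
```

With these invented counts the p-value is well above 0.05, so the null hypothesis of independence is not rejected.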
Example 2: One-Tailed Fisher’s Exact Test
In this example, we will use a dataset that contains information about the age of a person and whether or not they have a valid driver’s license.
We want to test the hypothesis that people over the age of 50 are more likely to have a valid driver’s license.
Let’s load the dataset:
age_data <- read.csv("age_data.csv")
Next, we can create a contingency table to visualize the relationship between the two categorical variables:
cont_table <- table(age_data$License, age_data$Age > 50)
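Then, we can conduct the one-tailed Fisher’s exact test using the ‘fisher.test’ function. Setting alternative = "greater" tests whether the odds ratio of the table exceeds 1. This matches our hypothesis only if the "no license" level sorts before the "license" level in the rows (the columns are ordered FALSE, TRUE), so it is worth checking the table layout first: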
fisher_test <- fisher.test(cont_table, alternative = "greater")
Finally, we can print the result to see the test output. (As before, ‘summary’ would only list the components of the returned object.)
fisher_test
The output will display the p-value, the alternative hypothesis, the estimated odds ratio, and a one-sided confidence interval for the odds ratio. No test statistic is reported.
If the p-value is less than 0.05, we reject the null hypothesis and conclude that people over the age of 50 are significantly more likely to have a valid driver’s license.
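Since ‘age_data.csv’ is likewise not supplied, the sketch below uses a small invented data frame to run the one-tailed test end to end. The ages, license values, and the coding of License as "Yes"/"No" are assumptions made for illustration. The factor levels are set explicitly so that the direction of "greater" is unambiguous.

```r
# Hypothetical stand-in for age_data.csv (all values invented)
age_data <- data.frame(
  Age     = c(55, 60, 62, 58, 70, 65, 52, 67, 25, 30, 22, 35, 40, 28, 45, 33),
  License = c(rep("Yes", 7), "No", rep("No", 6), rep("Yes", 2))
)

# Fix the row order as No, Yes; the logical columns are ordered FALSE, TRUE
license <- factor(age_data$License, levels = c("No", "Yes"))
over_50 <- age_data$Age > 50
cont_table <- table(License = license, Over50 = over_50)
print(cont_table)

# One-tailed test. With this layout, an odds ratio above 1 means that
# holding a license is more common among people over 50.
fisher_test <- fisher.test(cont_table, alternative = "greater")
print(fisher_test)

fisher_test$p.value
```

With these invented counts the one-tailed p-value is about 0.02, so the null hypothesis would be rejected at the 0.05 level.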
Conclusion:
In this article, we have demonstrated how to conduct both two-tailed and one-tailed Fisher’s exact tests in R using the ‘fisher.test’ function.
Fisher’s exact test is most useful when the sample size is small and one or more expected counts of the contingency table are less than 5, where the chi-square approximation may be unreliable.
By utilizing the examples provided in this article, researchers can use Fisher’s exact test to test hypotheses related to categorical variables in their datasets.