How to create contingency tables in R?
Contingency tables are helpful for condensing a large number of observations into smaller, more manageable tables.
We’ll learn about contingency tables and how to make them in this R tutorial. Complex/flat tables, cross-tabulation, and recovering original data from contingency tables will all be covered.
As you can see, this tutorial covers a lot of ground. So, without further ado, let’s get started.
What are Contingency Tables?
A contingency table shows how often each combination of values of two or more categorical variables occurs, which condenses a large number of observations into a smaller, more manageable table.
To make a contingency table in R, use the table() function, one of R’s most versatile functions. It takes one or more vectors or factors (or a data frame of them) as arguments and counts how often each combination of values occurs.
Take a look at the following example:
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))
ct1
     cylinders
gears  4  6  8
    3  1  2 12
    4  8  4  0
    5  2  1  2
We use two categorical variables from the mtcars dataset in the example above: the number of gears and the number of cylinders in each car.
The gear values label the rows and the cylinder values label the columns of the resulting table; each cell counts the cars with that combination.
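As a small sketch of other inputs table() accepts, the snippet below builds a one-way table from a single vector and a two-way table directly from a data frame. The object names gear_counts and ct_df are illustrative and not part of the original example.

```r
# One-way table: how many cars have 3, 4 or 5 gears
gear_counts <- table(mtcars$gear)
gear_counts

# Two-way table built from a data frame with two columns;
# the column names become the dimension names automatically
ct_df <- table(mtcars[, c("gear", "cyl")])
ct_df

# Look up a single cell by its row and column labels
ct_df["3", "8"]
```

Indexing by label, as in the last line, tends to be safer than indexing by position because the labels stay valid if the table gains or loses levels.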
Calculate the row totals of a contingency table
The margin.table() function can be used to calculate the totals of each row in a contingency table. Let’s have a look at an example of this:
margin.table(ct1, margin = 1)
gears
 3  4  5 
15 12  5 
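To see what margin = 1 does, the sketch below compares margin.table() with base R’s rowSums(), which adds up each row of the same table. The name row_totals is an illustrative choice.

```r
# Rebuild the table from the tutorial so the snippet runs on its own
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))

# Row totals: one value per gear count
row_totals <- margin.table(ct1, margin = 1)

# rowSums() adds up each row of the same table and gives the same numbers
all(row_totals == rowSums(ct1))

# Pull out a single total by its label
row_totals[["3"]]
```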
Calculate the column totals of a contingency table
Using the margin.table() function, we can calculate the totals of each of the columns in a contingency table in a similar way. Only the margin argument needs to be changed to 2. Here’s an example of what I’m talking about.
margin.table(ct1, margin = 2)
cylinders
 4  6  8 
11  7 14 
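A short sketch of a related case: when the margin argument is left out, margin.table() returns the grand total of all cells. The names col_totals and grand_total are illustrative.

```r
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))

# Column totals: one value per cylinder count
col_totals <- margin.table(ct1, margin = 2)

# With no margin given, all cells are summed into a single number
grand_total <- margin.table(ct1)

# The grand total equals the number of rows (cars) in mtcars
grand_total == nrow(mtcars)
```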
The function addmargins
Another technique to determine the sum totals of the rows and columns of a contingency table is to use the addmargins() function.
The totals of all the rows and columns of the input contingency table are found using this function. Let’s have a look at a practical application of this function.
addmargins(ct1)
     cylinders
gears  4  6  8 Sum
  3    1  2 12  15
  4    8  4  0  12
  5    2  1  2   5
  Sum 11  7 14  32
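addmargins() also accepts a margin argument when only one set of totals is wanted. The following is a minimal sketch; the names with_sum_row and with_sum_col are illustrative.

```r
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))

# margin = 1 sums over the gears and appends the result
# as a "Sum" row holding the column totals
with_sum_row <- addmargins(ct1, margin = 1)
with_sum_row

# margin = 2 sums over the cylinders and appends the result
# as a "Sum" column holding the row totals
with_sum_col <- addmargins(ct1, margin = 2)
with_sum_col
```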
Proportional contingency tables
We can find the proportion of the grand total that each cell of a contingency table represents using the prop.table() function, as the following example shows.
prop.table(ct1)
     cylinders
gears       4       6       8
    3 0.03125 0.06250 0.37500
    4 0.25000 0.12500 0.00000
    5 0.06250 0.03125 0.06250
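As a quick check on what these numbers mean, the sketch below confirms that each value is the cell count divided by the grand total, so the whole table sums to 1. The name p is illustrative.

```r
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))

# Overall proportions: each cell divided by the total number of cars
p <- prop.table(ct1)

# All proportions together add up to 1
sum(p)

# A single proportion is just count / grand total
p["3", "8"] == ct1["3", "8"] / sum(ct1)
```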
Row proportions in R contingency tables
We can also determine the row proportions in a contingency table by passing margin = 1 as an argument to the prop.table() function.
prop.table(ct1, margin = 1)
     cylinders
gears          4          6          8
    3 0.06666667 0.13333333 0.80000000
    4 0.66666667 0.33333333 0.00000000
    5 0.40000000 0.20000000 0.40000000
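One way to read the row proportions is as "each row divided by its own total". The sketch below reproduces them by hand with sweep() and checks that every row sums to 1. The names row_props and by_hand are illustrative.

```r
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))

row_props <- prop.table(ct1, margin = 1)

# Divide every row (dimension 1) by its own row total
by_hand <- sweep(ct1, 1, rowSums(ct1), "/")

# Both approaches give the same numbers
all.equal(as.vector(row_props), as.vector(by_hand))

# Each row of proportions adds up to 1
rowSums(row_props)
```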
Proportional columns in R contingency tables
Using margin = 2 in the prop.table() function’s inputs, we may obtain the column proportions in a contingency table.
prop.table(ct1, margin = 2)
     cylinders
gears          4          6          8
    3 0.09090909 0.28571429 0.85714286
    4 0.72727273 0.57142857 0.00000000
    5 0.18181818 0.14285714 0.14285714
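Long decimals can be hard to read, so a common follow-up is to turn the proportions into rounded percentages. This is a small sketch of that idea; the names col_props and col_pct are illustrative.

```r
ct1 <- table(mtcars$gear, mtcars$cyl, dnn = c("gears", "cylinders"))

col_props <- prop.table(ct1, margin = 2)

# Convert to percentages rounded to one decimal place
col_pct <- round(100 * col_props, 1)
col_pct

# Before rounding, each column of proportions adds up to 1
colSums(col_props)
```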
Creating Flat Contingency tables in R
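The ftable() function in R can be used to build simple or elaborate contingency tables. A flat table presents three or more variables in a single two-dimensional layout, with the combinations of the row variables listed down the side. Let’s look at this in more detail using the following example.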
ft1 <- ftable(mtcars[c("gear", "vs", "am", "cyl")])
ft1
           cyl  4  6  8
gear vs am             
3    0  0       0  0 12
        1       0  0  0
     1  0       1  2  0
        1       0  0  0
4    0  0       0  0  0
        1       0  2  0
     1  0       2  2  0
        1       6  0  0
5    0  0       0  0  0
        1       1  1  2
     1  0       0  0  0
        1       1  0  0
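By default the last variable goes across the columns. The layout can be changed with the row.vars and col.vars arguments of ftable(); the sketch below puts two variables on each side. The names tab4 and ft2 are illustrative.

```r
# Four-way table of counts from the same mtcars columns
tab4 <- table(mtcars[c("gear", "vs", "am", "cyl")])

# gear and vs down the side, am and cyl across the top
ft2 <- ftable(tab4,
              row.vars = c("gear", "vs"),
              col.vars = c("am", "cyl"))
ft2

# 3 gears x 2 vs values = 6 rows; 2 am values x 3 cyl values = 6 columns
dim(ft2)
```

Whichever layout is chosen, the cells still add up to the 32 cars in mtcars; only the arrangement changes.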
Cross Tabulation and The xtabs Function
Using R’s xtabs() function, we can generate a cross-tabulation contingency table. The function returns an object with the “table” and “xtabs” classes. Here’s an example of how to use the xtabs function.
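# Two random categorical variables (no seed is set, so results vary between runs)
c1 <- sample(letters[1:4], 16, replace = TRUE)
c2 <- sample(LETTERS[1:4], 16, replace = TRUE)
df1 <- data.frame(c1, c2)
# Count each c1/c2 combination, then store the counts as a data frame
# with one column per letter that occurs in c2
t1 <- table(df1$c1, df1$c2)
t2 <- as.data.frame.matrix(t1)
# Sum the values in column A for every combination of the values in columns B and C
xt1 <- xtabs(A ~ B + C, t2)
xt1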
   C
B   0 1 2
  0 0 0 3
  1 0 0 0
  2 2 0 0
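For comparison, here is a sketch of a more common xtabs() pattern that uses mtcars instead of the random data above. With nothing on the left-hand side of the formula, xtabs() simply counts rows, so the result matches the table() output from earlier. The name xt_counts is illustrative.

```r
# Empty left-hand side: count the cars for each gear/cyl combination
xt_counts <- xtabs(~ gear + cyl, data = mtcars)
xt_counts

# The object carries both the "xtabs" and "table" classes
class(xt_counts)

# The counts agree with table() on the same two variables
all(xt_counts == table(mtcars$gear, mtcars$cyl))
```

The formula interface can be convenient because the variable names are taken from the data frame, so no dnn argument is needed to label the dimensions.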
Recovering data from contingency tables in R
The as.data.frame() function can be used to retrieve data from contingency tables prepared with the xtabs() function. A data frame object is the end product.
df2 <- as.data.frame(xt1)
df2
  B C Freq
1 0 0    0
2 1 0    0
3 2 0    2
4 0 1    0
5 1 1    0
6 2 1    0
7 0 2    3
8 1 2    0
9 2 2    0
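The data frame above has one row per combination plus a Freq column. If one row per original observation is needed instead, each row can be repeated Freq times. The sketch below assumes a count table built from mtcars rather than the random data; the names xt, freq_df and long_df are illustrative.

```r
# Count table and its data frame form (columns gear, cyl, Freq)
xt <- xtabs(~ gear + cyl, data = mtcars)
freq_df <- as.data.frame(xt)

# Repeat each row index Freq times, then drop the Freq column
long_df <- freq_df[rep(seq_len(nrow(freq_df)), freq_df$Freq),
                   c("gear", "cyl")]

# One row per car again
nrow(long_df)

# Tabulating the expanded data gives back the original counts
all(table(long_df$gear, long_df$cyl) == xt)
```

This recovers only the variables that were in the table; any other columns of the original data are not stored in a contingency table and cannot be rebuilt from it.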
Summary
Contingency tables are a useful tool for summarising data and identifying relationships and dependencies among variables. They present data in a compressed format.
We learned what contingency tables are in this R tutorial. We looked at how to make contingency tables in R and how to use them to do things like add along their margins and calculate proportionate values.
In addition, we learned about flat contingency tables and how to make them in R.
Finally, we discovered cross-tabulation and how to extract data from a contingency table.