How to Find Unmatched Records in R

To retrieve all rows in one data frame that do not have matching values in another data frame, use the anti_join() function from the dplyr package in R.

The following is the fundamental syntax for this function.

anti_join(df1, df2, by='col_name')
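
When the key column has a different name in each data frame, the by argument also accepts a named vector. The sketch below is only an illustration: the orders and customers data frames and their column names are hypothetical and do not come from this tutorial.

```r
library(dplyr)

# Hypothetical data frames whose key columns have different names
orders <- data.frame(customer_id = c(1, 2, 3, 4),
                     amount = c(50, 20, 75, 10))
customers <- data.frame(id = c(1, 2, 5))

# Keep the orders whose customer_id has no match in customers$id
unmatched <- anti_join(orders, customers, by = c("customer_id" = "id"))
unmatched
```

The name on the left of the equals sign refers to the column in the first data frame, and the value on the right refers to the column in the second.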

The examples below demonstrate how to use this syntax in practice.

Example 1: Use anti_join() with One Column

Let’s pretend we have the following two R data frames:

Let’s create the data frames

df1 <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                  points=c(102, 104, 129, 224, 436))
df2 <- data.frame(team=c('A', 'B', 'C', 'F', 'G'),
                  points=c(412, 514, 519, 233, 117))

To return all rows in the first data frame that do not have a matching team in the second data frame, we can use the anti_join() function.

library(dplyr)

Using the ‘team’ column, execute an anti-join.

anti_join(df1, df2, by='team')
  team points
1    D    224
2    E    436

We can see that exactly two teams in the first data frame, D and E, do not have a matching team name in the second data frame.
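
An anti-join is not symmetric, so the order of the two data frames matters. As a minimal sketch using the same data as Example 1, the code below reverses the arguments and also cross-checks the original direction with base R's %in% operator. The variable names only_in_df2 and only_in_df1 are introduced here purely for illustration.

```r
library(dplyr)

df1 <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                  points=c(102, 104, 129, 224, 436))
df2 <- data.frame(team=c('A', 'B', 'C', 'F', 'G'),
                  points=c(412, 514, 519, 233, 117))

# Reversed arguments: rows of df2 with no matching team in df1
only_in_df2 <- anti_join(df2, df1, by='team')

# Base R cross-check of the original direction: rows of df1 whose team is not in df2
only_in_df1 <- df1[!(df1$team %in% df2$team), ]
```

Here only_in_df2 should hold teams F and G, while only_in_df1 should hold teams D and E, the same rows that anti_join(df1, df2, by='team') returned. The base R version keeps the original row names of df1 rather than renumbering them.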

Example 2: Use anti_join() with Multiple Columns

Let’s pretend we have the following two R data frames.

Let’s create the data frames

df1 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'F', 'G', 'F', 'C'),
                  points=c(182, 164, 159, 124, 136, 441))
df2 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'C', 'G', 'F', 'F'),
                  points=c(152, 154, 159, 322, 217, 522))

The anti_join() function can be used to return all rows in the first data frame that do not have a matching combination of team and position in the second data frame.

library(dplyr)

Use the ‘team’ and ‘position’ columns to do an anti-join.

anti_join(df1, df2, by=c('team', 'position'))
  team position points
1    A        F    159
2    B        C    441

We can see that exactly two records in the first data frame, (A, F) and (B, C), do not have a matching combination of team and position in the second data frame.
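
One way to sanity-check a multi-column anti-join is to compare it with dplyr's semi_join(), which keeps the rows of the first data frame that do have a match. The following is a rough sketch using the data from Example 2. The names matched and unmatched are illustrative only.

```r
library(dplyr)

df1 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'F', 'G', 'F', 'C'),
                  points=c(182, 164, 159, 124, 136, 441))
df2 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'C', 'G', 'F', 'F'),
                  points=c(152, 154, 159, 322, 217, 522))

# Rows of df1 with a matching team and position in df2
matched <- semi_join(df1, df2, by=c('team', 'position'))

# Rows of df1 without a matching team and position in df2
unmatched <- anti_join(df1, df2, by=c('team', 'position'))

# Together the two results should account for every row of df1
nrow(matched) + nrow(unmatched) == nrow(df1)
```

Because neither join duplicates rows from the first data frame, the two results split df1 into a matched part and an unmatched part.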
