How to Use “not in” operator in Filter

To filter for rows in a data frame whose values are not in a list of values, use the following basic syntax in dplyr.

df %>%
  filter(!col_name %in% c('value1', 'value2', 'value3', ...))
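
One detail worth spelling out: the negation needs no extra parentheses because R evaluates special operators such as %in% before the unary !, so !x %in% y is read as !(x %in% y). A minimal base R sketch of this, where the vector names teams and exclude are made up for illustration:

```r
# Hypothetical vectors, used only to illustrate operator precedence
teams   <- c("P1", "P2", "P3", "P4")
exclude <- c("P1", "P2")

# %in% is evaluated before !, so these two expressions are equivalent
without_parens <- !teams %in% exclude
with_parens    <- !(teams %in% exclude)

identical(without_parens, with_parens)  # TRUE
teams[without_parens]                   # "P3" "P4"
```

Adding the parentheses anyway is harmless and can make the intent easier to read.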

The examples below demonstrate how to utilize this syntax in practice.

Example 1: Filter for rows that do not contain a value in one column

Let’s say we have the following R data frame.

Let’s create a data frame

df <- data.frame(team=c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'),
                 points=c(110, 120, 80, 16, 105, 185, 112, 112),
                 assists=c(133, 128, 131, 139, 134,55,66,135),
                 rebounds=c(18, 18, 14, 13, 12, 15, 17, 12))

Now we can view the data frame

df
    team points assists rebounds
1   P1    110     133       18
2   P2    120     128       18
3   P3     80     131       14
4   P4     16     139       13
5   P5    105     134       12
6   P6    185      55       15
7   P7    112      66       17
8   P8    112     135       12

The following syntax demonstrates how to search for rows where the team name is not ‘P1’ or ‘P2’.

Find rows where the team name isn’t ‘P1’ or ‘P2’.

library(dplyr)

df %>%
  filter(!team %in% c('P1', 'P2'))
   team points assists rebounds
1   P3     80     131       14
2   P4     16     139       13
3   P5    105     134       12
4   P6    185      55       15
5   P7    112      66       17
6   P8    112     135       12
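
For comparison, the same rows can be selected without dplyr by using logical indexing in base R. This is only a sketch of an equivalent approach, not part of the dplyr syntax above; the data frame is rebuilt here with just two columns so that the snippet runs on its own.

```r
# Same teams and points as in Example 1 (other columns omitted for brevity)
df <- data.frame(team   = c("P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"),
                 points = c(110, 120, 80, 16, 105, 185, 112, 112))

# Keep the rows whose team is not 'P1' or 'P2'
kept <- df[!df$team %in% c("P1", "P2"), ]
kept
```

Unlike filter(), base R indexing keeps the original row names, so the result is labelled 3 to 8 rather than 1 to 6.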

Example 2: Filter for rows that don’t have a value in more than one column

The following syntax demonstrates how to filter for rows where the team name is not ‘P1’ and the points value is not ‘D’.

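The data frame is redefined first so that the points column holds letters instead of numbers; the filter then keeps rows with a team name other than ‘P1’ and a points value other than ‘D’.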

df <- data.frame(team=c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'),
                 points=c('A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'),
                 assists=c(133, 128, 131, 139, 134,55,66,135),
                 rebounds=c(18, 18, 14, 13, 12, 15, 17, 12))
df
   team points assists rebounds
1   P1      A     133       18
2   P2      A     128       18
3   P3      B     131       14
4   P4      B     139       13
5   P5      C     134       12
6   P6      C      55       15
7   P7      C      66       17
8   P8      D     135       12
df %>%
  filter(!team %in% c('P1') & !points %in% c('D'))
   team points assists rebounds
1   P2      A     128       18
2   P3      B     131       14
3   P4      B     139       13
4   P5      C     134       12
5   P6      C      55       15
6   P7      C      66       17
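
The two conditions do not have to be joined with &. filter() combines comma-separated conditions with a logical AND, so the sketch below should return the same rows. It assumes dplyr is installed and rebuilds the data frame from Example 2 with only the two relevant columns.

```r
library(dplyr)

# Team and points columns from Example 2 (other columns omitted for brevity)
df <- data.frame(team   = c("P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"),
                 points = c("A", "A", "B", "B", "C", "C", "C", "D"))

# Conditions joined explicitly with &
combined_and <- df %>%
  filter(!team %in% c("P1") & !points %in% c("D"))

# The same conditions passed as separate arguments
combined_comma <- df %>%
  filter(!team %in% c("P1"), !points %in% c("D"))

# Check that both forms select the same rows
all.equal(combined_and, combined_comma)
```

Which form to use is mostly a matter of taste; the comma form can be easier to scan when there are many conditions.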

  1. Another solution is to define a ‘not in’ operator by negating the %in% operator, and then use it just as you would use %in%:

    `%!in%` <- Negate(`%in%`)
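
A brief sketch of how such an operator might be used inside filter(), assuming dplyr is installed. The operator name %!in% comes from the suggestion above and is user-defined, not part of base R or dplyr; the data frame is a cut-down copy of the one in Example 1.

```r
library(dplyr)

# User-defined 'not in' operator built by negating %in%
`%!in%` <- Negate(`%in%`)

# Teams and points from Example 1 (other columns omitted for brevity)
df <- data.frame(team   = c("P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"),
                 points = c(110, 120, 80, 16, 105, 185, 112, 112))

# Keep the rows whose team is not 'P1' or 'P2'
result <- df %>%
  filter(team %!in% c("P1", "P2"))
result
```

This keeps the same six rows (P3 to P8) as filter(!team %in% c('P1', 'P2')), with the negation folded into the operator itself.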
