How to do Conditional Mutate in R?

Adding a new variable to an existing data frame based on a condition is a common task. The mutate() and case_when() functions from the dplyr package make it straightforward.


This lesson works through several examples of these functions, all using the data frame created below.


Let’s create a data frame

df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 position = c('A', 'B', 'A', 'B', 'B'),
                 points = c(102, 215, 319, 125, 112),
                 rebounds = c(22, 12, 19, 23, 36))

Let’s view the data frame

df
  player position points rebounds
1     P1        A    102       22
2     P2        B    215       12
3     P3        A    319       19
4     P4        B    125       23
5     P5        B    112       36

Example 1: Based on one existing variable, create a new variable

The following code creates a new variable called “score” whose value depends on the value in the “points” column.


library(dplyr)

Let’s define the new variable ‘score’ using mutate() and case_when()

df %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED',
                           points < 450 ~ 'HIGH'))
  player position points rebounds score
1     P1        A    102       22   LOW
2     P2        B    215       12  HIGH
3     P3        A    319       19  HIGH
4     P4        B    125       23   MED
5     P5        B    112       36   MED
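
Note that case_when() checks the conditions in order and uses the first one that is TRUE, which is why 102 is labelled LOW even though it is also below 212. A row that matches no condition gets NA. The sketch below illustrates this with a final catch-all condition; the player P6 and the label 'OTHER' are made up for illustration and are not part of the data above.

```r
library(dplyr)

# Hypothetical data: P6 (500 points) is added only to show the fall-through case
df2 <- data.frame(player = c('P1', 'P2', 'P6'),
                  points = c(102, 215, 500))

res <- df2 %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED',
                           points < 450 ~ 'HIGH',
                           TRUE ~ 'OTHER'))  # catch-all for rows matching nothing above

res
```

Without the TRUE ~ 'OTHER' line, the score for P6 would be NA.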

Example 2: Based on a number of existing variables, create a new variable

The following code demonstrates how to make a new variable called “Type” based on the values in the player and position columns.


library(dplyr)

Now we can define the new variable ‘Type’ using mutate() and case_when()

df %>%
  mutate(Type = case_when(player == 'P1' | player == 'P2' ~ 'starter',
                          player == 'P3' | player == 'P4' ~ 'backup',
                          position == 'B' ~ 'reserve'))
  player position points rebounds    Type
1     P1        A    102       22 starter
2     P2        B    215       12 starter
3     P3        A    319       19  backup
4     P4        B    125       23  backup
5     P5        B    112       36 reserve
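
When one column is compared against several values, the chained | comparisons can be written more compactly with %in%. This is a sketch of an equivalent form of the code above, rebuilt on a trimmed copy of the data frame so it runs on its own; the name res2 is just for illustration.

```r
library(dplyr)

# Same players and positions as the data frame used in this lesson
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 position = c('A', 'B', 'A', 'B', 'B'))

res2 <- df %>%
  mutate(Type = case_when(player %in% c('P1', 'P2') ~ 'starter',  # same as P1 | P2
                          player %in% c('P3', 'P4') ~ 'backup',   # same as P3 | P4
                          position == 'B' ~ 'reserve'))

res2
```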

To create a new variable called “value” that depends on both the points and rebounds columns, use the following code.


library(dplyr)

Let’s define the new variable ‘value’ using mutate() and case_when()

df %>%
  mutate(value = case_when(points <= 102 & rebounds <= 45 ~ 2,
                           points <= 215 & rebounds > 55 ~ 4,
                           points < 225 & rebounds < 28 ~ 6,
                           points < 325 & rebounds > 29 ~ 7,
                           points >= 25 ~ 9))
  player position points rebounds value
1     P1        A    102       22     2
2     P2        B    215       12     6
3     P3        A    319       19     9
4     P4        B    125       23     6
5     P5        B    112       36     7
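
The examples in this lesson print the result without saving it, so df itself does not gain the new column. To keep the column, assign the result to a name. The sketch below assumes you want a separate object; df_new is a hypothetical name, and you could assign back to df instead.

```r
library(dplyr)

df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(102, 215, 319, 125, 112),
                 rebounds = c(22, 12, 19, 23, 36))

# Store the mutated data frame; df is left unchanged
df_new <- df %>%
  mutate(value = case_when(points <= 102 & rebounds <= 45 ~ 2,
                           points <= 215 & rebounds > 55 ~ 4,
                           points < 225 & rebounds < 28 ~ 6,
                           points < 325 & rebounds > 29 ~ 7,
                           points >= 25 ~ 9))

names(df)      # no 'value' column here
names(df_new)  # includes 'value'
```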

I hope the concept is now clear.
