How to do Conditional Mutate in R?

It is common to want to add a new variable to an existing data frame based on a condition. Fortunately, the mutate() and case_when() functions from the dplyr package make this task simple.

This lesson works through several examples of how to apply these functions, using the following data frame.

Let’s create a data frame

df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 position = c('A', 'B', 'A', 'B', 'B'),
                 points = c(102, 215, 319, 125, 112),
                 rebounds = c(22, 12, 19, 23, 36))

Let’s view the data frame

df
   player position points rebounds
1     P1        A    102       22
2     P2        B    215       12
3     P3        A    319       19
4     P4        B    125       23
5     P5        B    112       36

Example 1: Based on one existing variable, create a new variable

The following code creates a new variable called “score” based on the value in the “points” column.

library(dplyr)

Let’s define new variable ‘score’ using mutate() and case_when()

df %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
  points < 212 ~ 'MED',
  points < 450 ~ 'HIGH'))
  player position points rebounds score
1     P1        A    102       22   LOW
2     P2        B    215       12  HIGH
3     P3        A    319       19  HIGH
4     P4        B    125       23   MED
5     P5        B    112       36   MED
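
case_when() checks its conditions from top to bottom and uses the first one that is TRUE; a row that matches no condition gets NA. The sketch below illustrates this on a trimmed copy of the data frame above. The object names no_fallback and with_fallback are made up for this illustration, and the final TRUE ~ 'HIGH' line is one common way to add a catch-all, not the only one.

```r
library(dplyr)

# Trimmed copy of the tutorial's data frame (only the columns needed here)
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(102, 215, 319, 125, 112))

# No condition covers points >= 212, so those rows receive NA
no_fallback <- df %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED'))

# TRUE ~ 'HIGH' matches every row not captured by an earlier condition
with_fallback <- df %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
                           points < 212 ~ 'MED',
                           TRUE ~ 'HIGH'))
```

With a catch-all in place, the result matches the output shown above without needing an explicit upper bound such as points < 450.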

Example 2: Based on a number of existing variables, create a new variable

The following code demonstrates how to make a new variable called “Type” based on the values in the player and position columns.

library(dplyr)

Now we can define the new variable ‘Type’ using mutate() and case_when()

df %>%
  mutate(Type = case_when(player == 'P1' | player == 'P2' ~ 'starter',
  player == 'P3' | player == 'P4' ~ 'backup',
  position == 'B' ~ 'reserve'))
   player position points rebounds    Type
1     P1        A    102       22 starter
2     P2        B    215       12 starter
3     P3        A    319       19  backup
4     P4        B    125       23  backup
5     P5        B    112       36 reserve
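
When a condition compares one column against several values, the chained | comparisons can also be written with %in%. This is a minimal sketch of that alternative, assuming the same data and labels as above; the object name typed is made up for this illustration.

```r
library(dplyr)

# Trimmed copy of the tutorial's data frame (only the columns needed here)
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 position = c('A', 'B', 'A', 'B', 'B'))

# player %in% c('P1', 'P2') is equivalent to player == 'P1' | player == 'P2'
typed <- df %>%
  mutate(Type = case_when(player %in% c('P1', 'P2') ~ 'starter',
                          player %in% c('P3', 'P4') ~ 'backup',
                          position == 'B' ~ 'reserve'))
```

The result should be the same Type column as in the output above; %in% mainly helps readability as the list of values grows.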

Similarly, the following code creates a new variable called “value” based on the values in both the points and rebounds columns.

library(dplyr)

Let’s define the new variable ‘value’ using mutate() and case_when()

df %>%
  mutate(value = case_when(points <= 102 & rebounds <= 45 ~ 2,
  points <= 215 & rebounds > 55 ~ 4,
  points < 225 & rebounds < 28 ~ 6,
  points < 325 & rebounds > 29 ~ 7,
  points >= 25 ~ 9))
  player position points rebounds value
1     P1        A    102       22     2
2     P2        B    215       12     6
3     P3        A    319       19     9
4     P4        B    125       23     6
5     P5        B    112       36     7
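
Because case_when() stops at the first condition that is TRUE, the order of the conditions affects the result. As a hedged illustration, the sketch below moves the broad condition points >= 25 to the top of the same set of rules; the object name reordered is made up for this example.

```r
library(dplyr)

# Copy of the tutorial's data frame
df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 position = c('A', 'B', 'A', 'B', 'B'),
                 points = c(102, 215, 319, 125, 112),
                 rebounds = c(22, 12, 19, 23, 36))

# Every row has points >= 25, so the first condition captures all rows
# and the more specific conditions below it are never reached
reordered <- df %>%
  mutate(value = case_when(points >= 25 ~ 9,
                           points <= 102 & rebounds <= 45 ~ 2,
                           points <= 215 & rebounds > 55 ~ 4,
                           points < 225 & rebounds < 28 ~ 6,
                           points < 325 & rebounds > 29 ~ 7))
```

Here every player ends up with value 9, which is why the most specific conditions generally belong first and the broadest one last.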

Hopefully the concept is now clear.
