Dummy Variable Example in R

A dataset occasionally needs to be arranged according to particular properties.

Dummy variables are important for statistical modeling because they facilitate the grouping of related objects: each dummy variable indicates whether an observation satisfies a given property.

Dummy variables are added to a dataset to encode categorical information in a form that statistical methods can use. They are applied when categorizing data based on particular attributes of interest.

You need one fewer dummy variable than the total number of categories you intend to construct. For example, suppose a dataset records which of five different types of automobile each person drives, and you want to divide the population into groups based on the vehicles they drive.

Four dummy variables with values of 1 or 0 would be created. Each dummy variable represents one vehicle type, denoted by a value of 1, and the fifth vehicle type is represented by all four dummy variables being equal to 0.
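The "one fewer than the number of categories" rule can be sketched in base R with model.matrix(). The vehicle names and the six observations below are hypothetical, chosen only for illustration. R picks the alphabetically first level (here "coupe") as the baseline that is represented by all zeros.

```r
# Hypothetical data: the vehicle type driven by each of six people.
vehicle <- factor(c("sedan", "suv", "truck", "van", "coupe", "sedan"))

# model.matrix() expands the factor into an intercept column plus one
# 0/1 column per non-baseline level; dropping the intercept (column 1)
# leaves only the dummy variables.
dummies <- model.matrix(~ vehicle)[, -1]

nlevels(vehicle)   # number of categories
ncol(dummies)      # number of dummy variables: one fewer than the above

# The baseline category ("coupe") has 0 in every dummy column.
dummies[vehicle == "coupe", ]
```

Each row has at most one dummy set to 1, so the four columns together identify all five vehicle types.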

How to create a dummy variable in R

A single operator, %in%, is all that is required to construct a dummy variable in R. It returns TRUE for each element of the variable that matches the value being sought, and FALSE otherwise.

df<-data.frame(ID=c("B","S","T","A"),
               sex=c("M","F","M","F"),
               Height=c(5.4,5.2,6,5.6),
               Weight=c(170,162,180,NA))
df
  ID sex Height Weight
1  B   M    5.4    170
2  S   F    5.2    162
3  T   M    6.0    180
4  A   F    5.6     NA

A data frame comprising four people’s height, weight, and sex is shown here.

df$male <- df$sex %in% "M"
df
  ID sex Height Weight  male
1  B   M    5.4    170  TRUE
2  S   F    5.2    162 FALSE
3  T   M    6.0    180  TRUE
4  A   F    5.6     NA FALSE

The data frame now has a new column, the dummy variable df$male. Printing the data frame shows the original data together with the new variable, which is TRUE for males and FALSE otherwise.
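The new column holds logical values rather than the 1/0 coding described earlier. If a numeric dummy is preferred, one possible approach is to wrap the comparison in as.integer(). The snippet below is a minimal sketch that reuses the ID and sex columns from the example.

```r
# Same ID and sex columns as in the example data frame above.
df <- data.frame(ID  = c("B", "S", "T", "A"),
                 sex = c("M", "F", "M", "F"))

# as.integer() converts TRUE to 1 and FALSE to 0.
df$male <- as.integer(df$sex %in% "M")
df$male

# Unlike ==, %in% never returns NA: a missing value is simply "not M".
NA %in% "M"   # FALSE
NA == "M"     # NA
```

R treats TRUE as 1 and FALSE as 0 in arithmetic, so the logical version also works in most modeling functions. The integer form mainly helps when exporting the data or when an explicit 0/1 column is easier to read.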

Useful application

Being able to group comparable objects together is frequently crucial in statistical modeling.

R – Base Data: How to Create a Dummy Variable

team$didsales = team$pastjob %in% c('Research','R&D')
team
  employee  pastjob results didsales
1        1       IT     126    FALSE
2        2 Research    1280     TRUE
3        3    sales     212    FALSE
4        4    sales     301    FALSE
5        5      ops     215    FALSE
6        6      ops     168    FALSE
7        7      R&D     212     TRUE
8        8       IT     314    FALSE

how to create a dummy variable in R – roll up

aggregate(team, by=list(team$didsales),FUN=mean)
  Group.1 employee pastjob  results didsales
1   FALSE      4.5      NA 222.6667        0
2    TRUE      4.5      NA 746.0000        1

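In this sales team example, the aggregate() function computes the mean of every column separately for the two groups defined by the dummy variable, so the results column shows the average performance of each group. The pastjob column is NA because a mean cannot be computed for non-numeric data. A manager can learn a lot about the team's activities as a whole from a summary like this.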
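The team data frame is never defined in the text, so the sketch below rebuilds it from the printed output. The column types are an assumption. It then uses the formula interface of aggregate() to average only the results column, which avoids taking the mean of the non-numeric pastjob column.

```r
# Rebuilt from the printed output above; column types are an assumption.
team <- data.frame(
  employee = 1:8,
  pastjob  = c("IT", "Research", "sales", "sales", "ops", "ops", "R&D", "IT"),
  results  = c(126, 1280, 212, 301, 215, 168, 212, 314)
)

# Dummy variable: TRUE when the past job is Research or R&D.
team$didsales <- team$pastjob %in% c("Research", "R&D")

# Average results per group, restricted to the numeric column of interest.
avg <- aggregate(results ~ didsales, data = team, FUN = mean)
avg
```

The two group means match the results column of the roll-up shown above.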

Dummy variables can be used to divide datasets into groups. R makes this simple because it requires only a single operation with %in%. This is just one of the many good things about R as a data analysis tool.
