# Dummy Variable Example in R

A dataset sometimes needs to be grouped according to particular properties of its records.

Dummy variables are important for statistical modeling because they make this grouping straightforward: each one indicates whether a record satisfies a given property.

## Dummy Variable Example in R

Dummy variables are added to a dataset to encode categorical information in numeric form. They are used when records need to be categorized by particular attributes.

You need one fewer dummy variable than the number of categories you intend to represent. Suppose, for example, that you want to divide a population into groups based on the vehicles people drive, using a dataset with five different types of automobile.


Four dummy variables, each taking the value 1 or 0, would be created. Each dummy variable represents one vehicle type and equals 1 when a person drives that type; the fifth vehicle type is represented by all four dummy variables being equal to 0.
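As a minimal sketch of this coding scheme, base R's `model.matrix()` can build the four columns from a factor. The `vehicles` data frame and the vehicle names are made up for illustration and do not come from a real dataset.

```r
# Hypothetical data: six people and the type of vehicle each one drives.
vehicles <- data.frame(
  type = factor(c("sedan", "suv", "truck", "van", "coupe", "suv"))
)

# model.matrix() expands the factor into an intercept plus one 0/1 column
# per non-reference level; dropping the intercept leaves four dummies.
dummies <- model.matrix(~ type, data = vehicles)[, -1]

# The reference level ("coupe", the first level alphabetically) has no
# column of its own: its row contains only zeros.
dummies
```

Which category becomes the all-zero reference is a modeling choice; here it is simply the first factor level.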

### How to create a dummy variable in R

A single operator, `%in%`, is all that is required to construct a dummy variable in R. It returns `TRUE` where the variable equals the value being sought and `FALSE` otherwise.

    df <- data.frame(
      ID     = c("B", "S", "T", "A"),
      sex    = c("M", "F", "M", "F"),
      Height = c(5.4, 5.2, 6, 5.6),
      Weight = c(170, 162, 180, NA)
    )
    df

      ID sex Height Weight
    1  B   M    5.4    170
    2  S   F    5.2    162
    3  T   M    6.0    180
    4  A   F    5.6     NA

A data frame comprising four people’s height, weight, and sex is shown here.

    # TRUE where sex is "M", FALSE otherwise
    df$male <- df$sex %in% "M"
    df

      ID sex Height Weight  male
    1  B   M    5.4    170  TRUE
    2  S   F    5.2    162 FALSE
    3  T   M    6.0    180  TRUE
    4  A   F    5.6     NA FALSE

The data frame now has a new column, `male`, holding the dummy variable we just created. Printing the data frame shows the original data together with the new variable.
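Note that `%in%` produces a logical column (`TRUE`/`FALSE`) rather than the 1/0 coding described earlier. If a numeric dummy is preferred, one possible sketch is to wrap the comparison in `as.integer()`. The names `male_num`, `in_flag`, and `eq_flag` are illustrative only. The example also shows how `%in%` and `==` differ on missing values, which matters when a column such as `Weight` contains `NA`.

```r
df <- data.frame(
  ID     = c("B", "S", "T", "A"),
  sex    = c("M", "F", "M", "F"),
  Height = c(5.4, 5.2, 6, 5.6),
  Weight = c(170, 162, 180, NA)
)

# as.integer() maps TRUE to 1 and FALSE to 0.
df$male_num <- as.integer(df$sex %in% "M")

# %in% never returns NA: a missing value simply fails to match.
# A comparison with == propagates the NA instead.
in_flag <- df$Weight %in% 180   # FALSE FALSE TRUE FALSE
eq_flag <- df$Weight == 180     # FALSE FALSE TRUE NA
```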


**Useful application**

Being able to group comparable objects together is frequently crucial in statistical modeling.

**R – Base Data: How to Create a Dummy Variable**

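    # Assumes an existing data frame `team` with a `pastjob` column.
    # %in% accepts several values: TRUE when pastjob matches any of them.
    team$didsales <- team$pastjob %in% c("Research", "R&D")
    team

      employee  pastjob results didsales
    1        1       IT     126    FALSE
    2        2 Research    1280     TRUE
    3        3    sales     212    FALSE
    4        4    sales     301    FALSE
    5        5      ops     215    FALSE
    6        6      ops     168    FALSE
    7        7      R&D     212     TRUE
    8        8       IT     314    FALSE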
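The definition of `team` is not shown in the text. To run the example, the data frame can be rebuilt from the printed output above; the sketch below is a reconstruction under that assumption, not the original definition.

```r
# Reconstructed from the printed output; the original definition is not shown.
team <- data.frame(
  employee = 1:8,
  pastjob  = c("IT", "Research", "sales", "sales", "ops", "ops", "R&D", "IT"),
  results  = c(126, 1280, 212, 301, 215, 168, 212, 314)
)

# Flag employees whose past job was in Research or R&D.
team$didsales <- team$pastjob %in% c("Research", "R&D")
team
```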

**How to Create a Dummy Variable in R – Roll Up**

    aggregate(team, by = list(team$didsales), FUN = mean)

      Group.1 employee pastjob  results didsales
    1   FALSE      4.5      NA 222.6667        0
    2    TRUE      4.5      NA 746.0000        1

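In this sales team example, the aggregate() function computes the mean of every column within each group defined by the dummy variable, which shows the average results of team members with and without a research background. The `pastjob` column is not numeric, so its mean is reported as `NA`. A manager can learn a lot about the team as a whole from this kind of summary.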
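Averaging every column also produces a mean of the employee IDs and an `NA` for `pastjob`, neither of which is meaningful. One alternative sketch uses the formula interface of `aggregate()` to summarize only the `results` column. The `team` data frame is again rebuilt from the printed output, and `avg` is an illustrative name.

```r
# Rebuilt from the printed output so the example runs on its own.
team <- data.frame(
  employee = 1:8,
  pastjob  = c("IT", "Research", "sales", "sales", "ops", "ops", "R&D", "IT"),
  results  = c(126, 1280, 212, 301, 215, 168, 212, 314)
)
team$didsales <- team$pastjob %in% c("Research", "R&D")

# Mean of `results` only, grouped by the dummy variable.
avg <- aggregate(results ~ didsales, data = team, FUN = mean)
avg
```

The group means match the `results` column of the full roll-up, without the extra columns.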


Dummy variables can be used to divide datasets into groups. R makes this simple because it requires only one small operation. This is just one of the many strengths of R as a data analysis tool.