How to Use the spread() Function in R
The tidyr package’s spread() function “spreads” a key-value pair across multiple columns, turning long data into wide data.
The basic syntax used by this function is as follows.
spread(data, key, value)
where:
data: name of the data frame
key: column whose values will become the names of the new columns
value: column whose values will fill the new columns
The usage of this function is demonstrated in the examples that follow.
The tidyr package’s objective is to produce “tidy” data, which possesses the following properties:
Each column contains a variable.
Each row represents an observation.
Each cell contains a single value.
To create tidy data, the tidyr package provides four essential functions:
1. The function spread().
2. The function gather().
3. The function separate().
4. The function unite().
Once you have mastered these four functions, you will be able to produce “tidy” data from most data frames.
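As a rough illustration of how the four functions listed above relate to one another, the sketch below runs a small round trip. The scores data frame and its column names (id, measure, score) are hypothetical and exist only for this illustration.

```r
library(tidyr)

# Hypothetical toy data (not from the examples in this article)
scores <- data.frame(id = c(1, 1, 2, 2),
                     measure = c("x", "y", "x", "y"),
                     score = c(10, 20, 30, 40))

# spread(): long -> wide, one new column per distinct value of `measure`
wide <- spread(scores, key = measure, value = score)

# gather(): wide -> long, collapsing columns x and y back into key-value pairs
long <- gather(wide, key = "measure", value = "score", x, y)

# unite(): paste the id and measure columns into a single column
united <- unite(long, col = "id_measure", id, measure, sep = "_")

# separate(): split that column back into two (both come back as character)
split_again <- separate(united, col = "id_measure",
                        into = c("id", "measure"), sep = "_")
```

In this sketch spread() and gather() undo each other, as do unite() and separate().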
Example 1: Spread Values Over Two Columns
Let’s say we have the R data frame shown below:
# Create the data frame
df <- data.frame(player=rep(c('P1', 'P2'), each=4),
                year=rep(c(1, 1, 2, 2), times=2),
                stat=rep(c('points', 'assists'), times=4),
                amount=c(104, 56, 108, 45, 333, 405, 508, 314))
# View the data frame
df
  player year    stat amount
1     P1    1  points    104
2     P1    1 assists     56
3     P1    2  points    108
4     P1    2 assists     45
5     P2    1  points    333
6     P2    1 assists    405
7     P2    2  points    508
8     P2    2 assists    314
We can use the spread() function to turn the values of the stat column into their own columns:
library(tidyr)
# Spread the stat column across multiple columns
spread(df, key=stat, value=amount)
  player year assists points
1     P1    1      56    104
2     P1    2      45    108
3     P2    1     405    333
4     P2    2     314    508
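In tidyr 1.0.0 and later, spread() is marked as superseded in favour of pivot_wider(). As a hedged sketch, assuming such a version is installed, the same reshaping could be written as shown below. The data frame is rebuilt here from the values in the printed output above, and the names wide_old and wide_new are only illustrative.

```r
library(tidyr)

# Rebuild the Example 1 data frame (values taken from the printed output)
df <- data.frame(player = rep(c('P1', 'P2'), each = 4),
                 year = rep(c(1, 1, 2, 2), times = 2),
                 stat = rep(c('points', 'assists'), times = 4),
                 amount = c(104, 56, 108, 45, 333, 405, 508, 314))

# Original approach: key/value arguments
wide_old <- spread(df, key = stat, value = amount)

# Newer approach: names_from plays the role of key, values_from the role of value
wide_new <- pivot_wider(df, names_from = stat, values_from = amount)
```

Both calls yield one row per player-year pair with a points and an assists column, although the new columns may appear in a different order.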
Example 2: Spread Values Across More Than Two Columns
Let’s say we have the R data frame shown below:
# Create the data frame
df <- data.frame(player=rep(c('P1', 'P2'), each=4),
                year=rep(c(1, 1, 2, 2), times=2),
                stat=rep(c('points', 'assists', 'steals', 'blocks'), times=2),
                amount=c(104, 56, 108, 45, 333, 405, 508, 314))
# View the data frame
df
  player year    stat amount
1     P1    1  points    104
2     P1    1 assists     56
3     P1    2  steals    108
4     P1    2  blocks     45
5     P2    1  points    333
6     P2    1 assists    405
7     P2    2  steals    508
8     P2    2  blocks    314
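We can use the spread() function to turn the four unique values in the stat column into four new columns. Because each player-year pair has only two of the four stats, the remaining cells are filled with NA: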
library(tidyr)
spread(df, key=stat, value=amount)
  player year assists blocks points steals
1     P1    1      56     NA    104     NA
2     P1    2      NA     45     NA    108
3     P2    1     405     NA    333     NA
4     P2    2      NA    314     NA    508
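The NA cells in the output above mark player-year pairs that have no row for a given stat. If a different placeholder is wanted, spread() accepts a fill argument. The sketch below assumes that 0 is a sensible stand-in for a missing stat, which may not hold for real data. The name wide_filled is only illustrative.

```r
library(tidyr)

# Rebuild the Example 2 data frame
df <- data.frame(player = rep(c('P1', 'P2'), each = 4),
                 year = rep(c(1, 1, 2, 2), times = 2),
                 stat = rep(c('points', 'assists', 'steals', 'blocks'), times = 2),
                 amount = c(104, 56, 108, 45, 333, 405, 508, 314))

# fill = 0 replaces the cells that would otherwise be NA (assumed placeholder)
wide_filled <- spread(df, key = stat, value = amount, fill = 0)
wide_filled
```

Keeping the default NA is often the safer choice, since a 0 cannot be told apart from a stat that was recorded as zero.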

