How to Use the spread() Function in R: tidyr Part 1

To "spread" a key-value pair across multiple columns, use the spread() function from the tidyr package.

The basic syntax used by this function is as follows.

spread(data, key, value)

where:

data: Name of the data frame

key: Column whose values will become the names of the new variables

value: Column whose values will fill the new variables created from the keys
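To make the three arguments concrete, here is a minimal sketch on a toy data frame. The data frame `long` and its columns `id`, `measure`, and `reading` are hypothetical names chosen for illustration only; they are not part of the examples that follow.

```r
library(tidyr)

# Hypothetical toy data: one row per (id, measure) pair
long <- data.frame(id      = c(1, 1, 2, 2),
                   measure = c("height", "weight", "height", "weight"),
                   reading = c(10, 20, 30, 40))

# key   = column whose values become the new column names
# value = column whose values fill those new columns
wide <- spread(long, key = measure, value = reading)
wide
```

Each distinct value of `measure` becomes its own column, and each `id` ends up on a single row.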


The practical application of this function is demonstrated in the examples that follow.


Example 1: Spread Values Across Two Columns

Let’s say we have the following data frame in R.

First, create the data frame:

df <- data.frame(player=rep(c('P1', 'P2'), each=4),
                 year=rep(c(1, 1, 2, 2), times=2),
                 stat=rep(c('points', 'assists'), times=4),
                 amount=c(125, 142, 145, 157, 134, 213, 125, 214))

Now we can view the data frame

df
  player year    stat amount
1     P1    1  points    125
2     P1    1 assists    142
3     P1    2  points    145
4     P1    2 assists    157
5     P2    1  points    134
6     P2    1 assists    213
7     P2    2  points    125
8     P2    2 assists    214

The spread() function can turn the values in the stat column into separate columns.

library(tidyr)

Spread the stat column across several columns:

spread(df, key=stat, value=amount)
  player year assists points
1     P1    1     142    125
2     P1    2     157    145
3     P2    1     213    134
4     P2    2     214    125
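As a side note, tidyr 1.0.0 and later mark spread() as superseded in favor of pivot_wider(). The sketch below shows the same reshaping with that function, assuming tidyr 1.0.0 or later is installed. It rebuilds the data frame from the printed table above so that it runs on its own.

```r
library(tidyr)

# Rebuild the Example 1 data from the printed table
# (stringsAsFactors = FALSE keeps stat as character on older R versions)
df <- data.frame(player = rep(c('P1', 'P2'), each = 4),
                 year   = rep(c(1, 1, 2, 2), times = 2),
                 stat   = rep(c('points', 'assists'), times = 4),
                 amount = c(125, 142, 145, 157, 134, 213, 125, 214),
                 stringsAsFactors = FALSE)

# names_from plays the role of key, values_from the role of value
wide <- pivot_wider(df, names_from = stat, values_from = amount)
wide
```

One difference to be aware of: spread() sorts the new columns alphabetically, while pivot_wider() by default keeps them in order of first appearance, so points comes before assists here.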

Example 2: Spread Values Across More Than Two Columns

Let’s say we have the following data frame in R:

df2 <- data.frame(player=rep(c('P1'), times=8),
year=rep(c(1, 2), each=4),
stat=rep(c('points', 'assists', 'steals', 'blocks'), times=2),
amount=c(115, 116, 212, 211, 229, 319, 213, 314))

Now we can view the data frame

df2
  player year    stat amount
1     P1    1  points    115
2     P1    1 assists    116
3     P1    1  steals    212
4     P1    1  blocks    211
5     P1    2  points    229
6     P1    2 assists    319
7     P1    2  steals    213
8     P1    2  blocks    314

The spread() function can be used to create four additional columns from the stat column’s four distinct values.

library(tidyr)

Spread the stat column across several columns:

spread(df2, key=stat, value=amount)
  player year assists blocks points steals
1     P1    1     116    211    115    212
2     P1    2     319    314    229    213
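In both examples every player and year combination has a row for every stat. When a combination is missing, spread() fills the gap with NA by default, and its fill argument lets you substitute another value. The data frame `df3` below is a hypothetical variation made up for illustration, in which P1 has no steals row for year 2.

```r
library(tidyr)

# Hypothetical data: no 'steals' row for year 2
df3 <- data.frame(player = 'P1',
                  year   = c(1, 1, 2),
                  stat   = c('points', 'steals', 'points'),
                  amount = c(115, 212, 229))

# Default behaviour: the missing cell becomes NA
wide_na <- spread(df3, key = stat, value = amount)

# With fill = 0 the missing cell becomes 0 instead
wide_zero <- spread(df3, key = stat, value = amount, fill = 0)

wide_na
wide_zero
```

Whether 0 is an appropriate replacement depends on the data; for counts it is often reasonable, while for measurements NA is usually the safer choice.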
