Reshape data in R

In general, data processing in the R programming language works on data held in a data frame, which is organized into rows and columns.

Data frames are commonly used because they make data extraction simpler.

However, there are times when we need to change the format of the data frame that we receive. For such cases, R provides various functions to split, merge, and reshape a data frame.

The following are the various methods for reshaping data in a data frame:

  1. Transpose of a Matrix
  2. Joining Rows and Columns
  3. Merging of Data Frames
  4. Melting and Casting

Why Is Data Reshaping Necessary in R?

Data obtained from an experiment or study is rarely in the layout that an analysis or an analytic function expects.

Typically, one or more columns that identify a row are followed by a number of columns that hold the measured values.

The columns that identify a row can be thought of as the composite key of a table in a database.

1. Transpose of a Matrix

We can easily compute the transpose of a matrix in R using the t() function. The t() function takes a matrix or data frame as input and returns its transpose. Note that when the input is a data frame, the result is a matrix.

Syntax:

t(Matrix/ Data frame)

R program for calculating the transpose of a matrix

original <- matrix(c(1:12), nrow=4, byrow=TRUE)
original
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
toriginal<- t(original)
toriginal
      [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
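
As a small additional sketch, the snippet below applies t() to a data frame instead of a matrix. The data frame `scores` and its row and column names are hypothetical and used only for illustration. It shows that the result is a matrix, which can be converted back with as.data.frame() if a data frame is needed.

```r
# Hypothetical data frame used only for illustration
scores <- data.frame(math = c(90, 85), art = c(70, 95),
                     row.names = c("finn", "stats"))

# t() converts the data frame to a matrix before transposing it
tscores <- t(scores)
print(tscores)

# The result is a matrix, not a data frame
print(is.matrix(tscores))

# Wrap the result in as.data.frame() if a data frame is needed
tdf <- as.data.frame(tscores)
print(dim(tdf))
```

Because a matrix holds a single type, transposing a data frame with mixed column types would coerce every value to a common type, usually character.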

2. Joining Rows and Columns in Data Frame

R lets us join vectors, matrices, or data frames by rows or by columns. These tasks are carried out primarily by two functions.

cbind():

Using the cbind() function, we can combine vectors, matrices, or data frames by columns.

Syntax: cbind(x1, x2, x3)

x1, x2, and x3 can all be vectors, matrices, or data frames.

rbind():

Using the rbind() function, we can combine vectors, matrices, or data frames by rows.

Syntax: rbind(x1, x2, x3)

x1, x2, and x3 can all be vectors, matrices, or data frames.

Cbind and Rbind function in R

name <- c("finn", "stats", "for", "data")
age <- c(204, 153, 602, 229)
address <- c("India", "Brazil", "Argentina", "USA")

Cbind function

info <- cbind(name, age, address)
print(info)
     name    age   address   
[1,] "finn"  "204" "India"   
[2,] "stats" "153" "Brazil"  
[3,] "for"   "602" "Argentina"
[4,] "data"  "229" "USA"

Let’s create a new data frame

new <- data.frame(name=c("PP", "QQ"),
                   age=c("58", "47"),
                   address=c("bangalore", "kolkata"))

Rbind function

new1<- rbind(info, new)
print(new1)
   name age   address
1  finn 204     India
2 stats 153    Brazil
3   for 602 Argentina
4  data 229       USA
5    PP  58 bangalore
6    QQ  47   kolkata
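
In the output above, age is stored as text because cbind() on plain vectors builds a matrix, and a matrix can hold only one type. The following sketch shows one possible way to keep age numeric by building a data frame first. The variable names `m`, `df`, `extra`, and `combined` are illustrative and not part of the original example.

```r
name <- c("finn", "stats", "for", "data")
age <- c(204, 153, 602, 229)

# cbind() on plain vectors builds a matrix, so age is coerced to character
m <- cbind(name, age)
print(is.character(m))

# data.frame() keeps each column's own type
df <- data.frame(name = name, age = age)
str(df)

# rbind() on data frames matches columns by name
extra <- data.frame(name = "PP", age = 58)
combined <- rbind(df, extra)
print(combined)
```

With this approach, numeric operations such as mean(combined$age) work directly, without converting the column back from character.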

3. Merging two Data Frames

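In R, we can merge two data frames using the merge() function. By default, it joins on the columns whose names appear in both data frames.

These shared columns act as the key: rows are combined when their key values match. In the example below, both name and ID are shared, so a row matches only when both values agree, and all=TRUE keeps the unmatched rows from either data frame.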

Syntax: merge(dfA, dfB, …)

Merging two data frames in R

d1 <- data.frame(name=c("PP", "QQ", "EE"),
                 ID=c("11", "12", "13"))
d2 <- data.frame(name=c("QQ", "WW"),
                 ID=c("11", "115"))
total <- merge(d1, d2, all=TRUE)
print(total)
  name  ID
1   EE  13
2   PP  11
3   QQ  11
4   QQ  12
5   WW 115
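
The example above has no matching keys, so every row is kept unchanged. As a complementary sketch, the snippet below merges two hypothetical data frames (`students` and `marks`, invented for illustration) that share only an ID column, and compares an inner join, a full outer join, and a left join using the by, all, and all.x arguments of merge().

```r
# Hypothetical tables sharing only the key column "ID"
students <- data.frame(ID = c(11, 12, 13), name = c("PP", "QQ", "EE"))
marks <- data.frame(ID = c(11, 12, 115), score = c(78, 91, 64))

# Inner join: keep only IDs present in both data frames
inner <- merge(students, marks, by = "ID")
print(inner)

# Full outer join: keep every ID and fill the gaps with NA
outer <- merge(students, marks, by = "ID", all = TRUE)
print(outer)

# Left join: keep every row of students
left <- merge(students, marks, by = "ID", all.x = TRUE)
print(left)
```

Naming the key explicitly with by is a defensive choice: it avoids accidentally joining on every column that happens to share a name.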

4. Melting and Casting

Reshaping data into the desired format often involves several steps.

A popular method is to melt the data, so that each row is a unique id-variable combination, and then cast it into the required shape.

This process makes use of two functions:

melt():

It transforms a data frame into a molten data frame.

Syntax: melt(data, …, na.rm=FALSE, value.name="value")

where,

data: data to be melted

… : further arguments passed to the melt method

na.rm: whether to remove NA values, which converts explicit missings into implicit missings

value.name: name of the column used to store the values

dcast():

It’s used to combine the molten data frame into a new form.

Syntax: dcast(data, formula, fun.aggregate)

where,

data: molten data frame to be cast

formula: formula that defines how to cast

fun.aggregate: used if there is a data aggregation

melt and cast

library(reshape2)
a <- data.frame(id=c("11", "11", "12", "12"),
                point=c("1", "2", "1", "2"),
                x1=c("15", "31", "64", "12"),
                x2=c("6", "5", "1", "4"))
m <- melt(a, id.vars=c("id", "point"))
print(m)
  id point variable value
1 11     1       x1    15
2 11     2       x1    31
3 12     1       x1    64
4 12     2       x1    12
5 11     1       x2     6
6 11     2       x2     5
7 12     1       x2     1
8 12     2       x2     4
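
The example above covers only the melting step. The sketch below shows how casting might look with dcast(), assuming the reshape2 package is installed. The measured values are written as numbers rather than strings here, which is a change from the example above, so that they can be averaged. The names `wide` and `means` are illustrative.

```r
# Assumes the reshape2 package is installed: install.packages("reshape2")
library(reshape2)

a <- data.frame(id = c("11", "11", "12", "12"),
                point = c("1", "2", "1", "2"),
                x1 = c(15, 31, 64, 12),
                x2 = c(6, 5, 1, 4))

# Melt to long format: one row per id, point, and variable
m <- melt(a, id.vars = c("id", "point"))

# Cast back to the original wide layout
wide <- dcast(m, id + point ~ variable, value.var = "value")
print(wide)

# Aggregate while casting: mean of each variable per id
means <- dcast(m, id ~ variable, fun.aggregate = mean, value.var = "value")
print(means)
```

In the formula, the variables on the left of ~ stay as identifier columns, while each distinct value of the variable on the right becomes its own column. fun.aggregate is needed only when several rows fall into the same cell, as in the second call.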
