Reshape data in R
In general, data processing in the R programming language starts by reading data into a data frame, which is organized into rows and columns.
Data frames are commonly used because they make extracting data by row, column, or condition simple.
However, there are times when we need to change the format of the data frame that we receive. As a result, we can use various functions in R to split, merge, and reshape the data frame.
Reshape data in R
The following are the various methods for reshaping data in a data frame:
- Transpose of a Matrix
- Joining Rows and Columns
- Merging of Data Frames
- Melting and Casting
Why Is Data Reshaping in R Necessary?
Data obtained from an experiment or study usually arrives in a layout that differs from the one an analysis or analytic function expects.
One or more columns that correspond to or identify a row are usually followed by a number of columns that represent the measured values.
The columns that identify a row can be thought of as the composite key of a table in a database.
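As a rough illustration of this layout, the sketch below builds a small data frame in which two identifier columns are followed by two measured columns, then checks that the identifier pair is unique for every row. The column names and values (id, point, x1, x2) are assumptions chosen for the example.

```r
# Hypothetical measurements: 'id' and 'point' identify a row,
# 'x1' and 'x2' hold the measured values.
measurements <- data.frame(
  id    = c("11", "11", "12", "12"),
  point = c(1, 2, 1, 2),
  x1    = c(15, 31, 64, 12),
  x2    = c(6, 5, 1, 4)
)

key_columns <- c("id", "point")

# anyDuplicated() returns 0 when no combination of key values repeats,
# i.e. the pair (id, point) behaves like a composite key.
key_is_unique <- anyDuplicated(measurements[key_columns]) == 0
print(key_is_unique)
```

If the check were FALSE, the identifier columns would not be enough to tell rows apart, and reshaping would have to aggregate the duplicates.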
1. Transpose of a Matrix
We can easily calculate the transpose of a matrix in R using the t() function. The t() function takes a matrix or data frame as input and returns its transpose. A data frame is first converted to a matrix, so the result is always a matrix.
Syntax:
t(Matrix/ Data frame)
R program for calculating the transpose of a matrix
original <- matrix(c(1:12), nrow=4, byrow=TRUE)
original
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
toriginal <- t(original)
toriginal
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
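Two properties of t() are worth checking by hand: transposing twice gives back the original matrix, and transposing a data frame yields a matrix rather than a data frame. The following is a minimal sketch; the data frame df and its columns are made up for the illustration.

```r
# Transposing twice returns the original matrix.
m <- matrix(1:6, nrow = 2)
round_trip_ok <- identical(t(t(m)), m)
print(round_trip_ok)

# Hypothetical data frame with mixed column types.
df <- data.frame(name = c("a", "b"), score = c(1.5, 2.5))
tdf <- t(df)

# t() converts the data frame with as.matrix(), so the result is a matrix
# and, because one column is character, every cell becomes character.
print(is.matrix(tdf))
print(dim(tdf))
print(typeof(tdf))
```

Because of this coercion, transposing a data frame with mixed types is usually only useful for display; for analysis, the melt and cast approach described later keeps column types under control.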
2. Joining Rows and Columns in Data Frame
Using functions in R, we can join two vectors or merge two data frames. These tasks are carried out primarily by two functions.
cbind():
Using the cbind() function, we can combine vectors, matrices, or data frames by columns.
Syntax: cbind(x1, x2, x3)
x1, x2, and x3 can all be vectors, matrices, or data frames.
rbind():
Using the rbind() function, we can combine vectors, matrices, or data frames by rows.
Syntax: rbind(x1, x2, x3)
x1, x2, and x3 can all be vectors, matrices, or data frames.
Cbind and Rbind function in R
name <- c("finn", "stats", "for", "data")
age <- c(204, 153, 602, 229)
address <- c("India", "Brazil", "Argentina", "USA")
Cbind function
info <- cbind(name, age, address)
print(info)
     name    age   address
[1,] "finn"  "204" "India"
[2,] "stats" "153" "Brazil"
[3,] "for"   "602" "Argentina"
[4,] "data"  "229" "USA"
Let’s create a new data frame
new <- data.frame(name=c("PP", "QQ"), age=c("58", "47"), address=c("bangalore", "kolkata"))
Rbind function
new1 <- rbind(info, new)
print(new1)
   name age   address
1  finn 204     India
2 stats 153    Brazil
3   for 602 Argentina
4  data 229       USA
5    PP  58 bangalore
6    QQ  47   kolkata
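Note that cbind() on plain vectors builds a matrix, so mixing character and numeric vectors coerces every value to character, which is why the ages in the cbind() output are quoted. The sketch below, using shortened versions of the vectors above, shows one way to keep column types by building a data frame instead. The names people and extra are illustrative assumptions.

```r
name <- c("finn", "stats")
age  <- c(204, 153)

# cbind() on vectors yields a matrix; with mixed inputs it is character.
as_matrix <- cbind(name, age)
print(typeof(as_matrix))

# data.frame() keeps 'age' numeric.
people <- data.frame(name = name, age = age)

# rbind() on data frames matches columns by name,
# so the column order of 'extra' does not need to agree.
extra <- data.frame(age = 58, name = "PP")
combined <- rbind(people, extra)
print(combined)
print(class(combined$age))
```

Keeping age numeric means later steps such as mean(combined$age) work without an explicit conversion.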
3. Merging two Data Frames
In R, we can merge two data frames using the merge() function. By default it joins on the columns that have the same name in both data frames; the by, by.x, and by.y arguments name the key columns explicitly.
We can combine the two data frames using a key value.
Syntax: merge(dfA, dfB, …)
Merging two data frames in R
d1 <- data.frame(name=c("PP", "QQ", "EE"), ID=c("11", "12", "13"))
d2 <- data.frame(name=c("QQ", "WW"), ID=c("11", "115"))
total <- merge(d1, d2, all=TRUE)
print(total)
  name  ID
1   EE  13
2   PP  11
3   QQ  11
4   QQ  12
5   WW 115
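The example above joins on every shared column and keeps all rows from both sides. As a hedged sketch of the other common cases, the code below joins two made-up tables on a single key column and compares an inner, a left, and a full outer join. The data frames students and scores and their columns are assumptions for the illustration.

```r
# Hypothetical tables sharing the key column 'ID'.
students <- data.frame(ID = c(11, 12, 13), name = c("PP", "QQ", "EE"))
scores   <- data.frame(ID = c(11, 12, 115), score = c(90, 75, 60))

# Inner join (the default): only IDs present in both data frames.
inner <- merge(students, scores, by = "ID")

# Left join: keep every row of 'students'; unmatched scores become NA.
left <- merge(students, scores, by = "ID", all.x = TRUE)

# Full outer join: keep rows from both sides.
full <- merge(students, scores, by = "ID", all = TRUE)

print(inner)
print(left)
print(full)
```

Naming the key with by avoids accidental joins on columns that merely happen to share a name.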
4. Melting and Casting
Many steps are involved in data reshaping in order to obtain the desired or required format.
A popular method is to melt the data, which converts each row into unique id-variable combinations, and then cast the molten data into the required shape.
This process makes use of two functions:
melt():
It transforms a data frame into a molten data frame.
Syntax: melt(data, …, na.rm=FALSE, value.name="value")
where,
data: data to be melted
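… : further arguments passed to or from other methods
na.rm: whether to remove NA values, which converts explicit missings into implicit missings
value.name: name of the column that stores the values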
dcast():
It casts the molten data frame into a new shape.
Syntax: dcast(data, formula, fun.aggregate)
where,
data: molten data frame to be cast
formula: formula that defines how to cast
fun.aggregate: aggregation function, needed when the formula does not identify a single value per cell
melt and cast
# reshape2 provides both melt() and dcast() as described above
library(reshape2)
a <- data.frame(id=c("11", "11", "12", "12"),
                point=c("1", "2", "1", "2"),
                x1=c("15", "31", "64", "12"),
                x2=c("6", "5", "1", "4"))
# id and point identify a row; x1 and x2 are melted into variable/value pairs
m <- melt(a, id=c("id", "point"))
print(m)
  id point variable value
1 11     1       x1    15
2 11     2       x1    31
3 12     1       x1    64
4 12     2       x1    12
5 11     1       x2     6
6 11     2       x2     5
7 12     1       x2     1
8 12     2       x2     4
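The example stops after melting, so here is a minimal sketch of the casting step. It assumes the reshape2 package is installed, since that is where dcast() lives, and it stores the measurements as numbers rather than strings so that an aggregation function such as mean can be applied. The object names wide and means are illustrative.

```r
library(reshape2)  # assumption: reshape2 is installed; it provides dcast()

# Same example data as above, but with numeric measurements.
a <- data.frame(
  id    = c("11", "11", "12", "12"),
  point = c("1", "2", "1", "2"),
  x1    = c(15, 31, 64, 12),
  x2    = c(6, 5, 1, 4)
)

m <- melt(a, id.vars = c("id", "point"))

# Cast back to the wide layout: one row per id/point pair,
# one column per measured variable.
wide <- dcast(m, id + point ~ variable, value.var = "value")

# Aggregate while casting: mean of each variable per id.
means <- dcast(m, id ~ variable, fun.aggregate = mean, value.var = "value")

print(wide)
print(means)
```

The left-hand side of the formula names the columns that stay as identifiers, and the right-hand side names the column whose values become new column headers. When several molten rows fall into the same cell, as in the second call, fun.aggregate decides how they are combined.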