Random Forest Model in R

by finnstats

The random forest model in R is a highly useful tool in analyzing predicted outcomes for a classification or regression model.

The main idea is how explanatory variables will impact the dependent variable.

In this particular example, we analyze the impact of explanatory variables of Attribute 1, Attribute2, …Attribute6 on the dependent variable Likeability.

What are the Nonparametric tests? » Why, When and Methods »

Data Loading

Use read.xlsx function to read data into R.

data<-read.xlsx("D:/rawdata.xlsx",sheetName="Sheet1")

We then split the data set into two training dataset and test data set. Training data that we will use to create our model and then the test data we will test it.

We have randomly created a data frame with a total of 64 data row observations, 60 observations used for training the data set, and 4 observations used for testing purposes.

#Create training and test data

inputData <- data[1:60, ] # training data
testData <- data[20:64, ] # test data

While using tuneRF function we can find out best mtr

tuneRF(data2,data2[,dim(data2)[2]], stepFactor=1.5)
mtry = 8 provides best OOB error = 0.01384072

A random forest allows us to determine the most important predictors across the explanatory variables by generating many decision trees and then ranking the variables by importance.

How to run R code in PyCharm? » R & PyCharm »