How to Plot Observed and Predicted values in R
After fitting a regression model in R, it is often useful to plot the predicted values against the actual values so that the discrepancies between them are easy to see.
This tutorial shows how to create this type of plot using base R and ggplot2.
Approach 1: Plot of observed and predicted values in Base R
The following code shows how to construct a plot of predicted vs. actual values after fitting a linear regression model in R.
Load Library and dataset
library(mlbench)
data("BostonHousing2")
head(BostonHousing2)

        town tract      lon     lat medv cmedv    crim zn indus chas   nox    rm  age
1     Nahant  2011 -70.9550 42.2550 24.0  24.0 0.00632 18  2.31    0 0.538 6.575 65.2
2 Swampscott  2021 -70.9500 42.2875 21.6  21.6 0.02731  0  7.07    0 0.469 6.421 78.9
3 Swampscott  2022 -70.9360 42.2830 34.7  34.7 0.02729  0  7.07    0 0.469 7.185 61.1
4 Marblehead  2031 -70.9280 42.2930 33.4  33.4 0.03237  0  2.18    0 0.458 6.998 45.8
5 Marblehead  2032 -70.9220 42.2980 36.2  36.2 0.06905  0  2.18    0 0.458 7.147 54.2
6 Marblehead  2033 -70.9165 42.3040 28.7  28.7 0.02985  0  2.18    0 0.458 6.430 58.7
     dis rad tax ptratio      b lstat
1 4.0900   1 296    15.3 396.90  4.98
2 4.9671   2 242    17.8 396.90  9.14
3 4.9671   2 242    17.8 392.83  4.03
4 6.0622   3 222    18.7 394.63  2.94
5 6.0622   3 222    18.7 396.90  5.33
6 6.0622   3 222    18.7 394.12  5.21
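# Fit a linear regression with median home value (medv) as the response
# and average number of rooms per dwelling (rm) as the predictor
model <- lm(medv ~ rm, data = BostonHousing2)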
# Plot predicted vs. actual values
plot(x = BostonHousing2$medv, y = predict(model),
     xlab = 'Actual Values', ylab = 'Predicted Values',
     main = 'Predicted vs. Actual Values')

# Add the diagonal reference line y = x
abline(a = 0, b = 1)
The x-axis shows the actual values from the dataset, and the y-axis shows the model's predicted values. The diagonal line through the plot is the line y = x, on which a predicted value equals the actual value.
The closer the points lie to this line, the better the model fits the data. If most points sit near the diagonal, we may conclude that the regression model fits reasonably well; points far from it mark observations the model predicts poorly.
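A visual impression of fit can be backed up with a couple of numbers. The sketch below is one possible way to do that and is not part of the original tutorial: it computes the root mean squared error (RMSE) and R-squared from vectors of actual and predicted values. The helper name `fit_metrics` is hypothetical, and the built-in `mtcars` dataset is used as a stand-in (an assumption, chosen so the example runs without installing mlbench).

```r
# Hypothetical helper: summarise how far predictions are from actual values
fit_metrics <- function(actual, predicted) {
  residuals <- actual - predicted
  rmse <- sqrt(mean(residuals^2))  # typical size of a prediction error
  r2 <- 1 - sum(residuals^2) / sum((actual - mean(actual))^2)
  c(rmse = rmse, r_squared = r2)
}

# Stand-in example on a built-in dataset (assumption: not the tutorial's data)
demo_model <- lm(mpg ~ wt, data = mtcars)
metrics <- fit_metrics(mtcars$mpg, predict(demo_model))
print(metrics)
```

The same helper can be called with any response vector and the matching output of `predict()`. A lower RMSE and an R-squared closer to 1 correspond to points that hug the diagonal more tightly.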
We can also create a data frame that shows the actual and predicted values for each data point:
# Create a data frame of actual and predicted values
data <- data.frame(actual = BostonHousing2$medv, predicted = predict(model))

# View the first rows of the data frame
head(data)

  actual predicted
1   24.0  25.92488
2   21.6  24.62709
3   34.7  31.03788
4   33.4  29.46740
5   36.2  30.70795
6   28.7  24.70194
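With actual and predicted values side by side, the discrepancy for each observation is simply actual minus predicted. As a minimal sketch of that idea (again on the built-in `mtcars` dataset as a stand-in, with the object names `demo_model`, `demo_data`, and `worst` and the column name `residual` chosen purely for illustration), the following adds a residual column and lists the observations the model misses by the most:

```r
# Stand-in model on a built-in dataset (assumption: not the tutorial's data)
demo_model <- lm(mpg ~ wt, data = mtcars)

# Actual and predicted values side by side
demo_data <- data.frame(actual = mtcars$mpg, predicted = predict(demo_model))

# Residual = actual - predicted; positive means the model under-predicted
demo_data$residual <- demo_data$actual - demo_data$predicted

# Sort so the rows with the largest absolute discrepancy come first
worst <- demo_data[order(-abs(demo_data$residual)), ]
head(worst, 3)
```

These are the same points that would appear farthest from the diagonal in a predicted vs. actual plot.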
Approach 2: Plot of Predicted vs. Observed Values in ggplot2
The following code shows how to make a plot of predicted vs. actual values using the ggplot2 data visualization package.
library(ggplot2)
# Plot predicted vs. actual values, mapping the columns of the data frame
ggplot(data, aes(x = actual, y = predicted)) +
  geom_point() +
  labs(x = 'Actual Values', y = 'Predicted Values',
       title = 'Predicted vs. Actual Values')
The predicted values from the model are displayed on the y-axis, while the actual values from the dataset are displayed on the x-axis.
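A diagonal reference line makes this kind of plot easier to read, because points on the line y = x are predicted exactly. The sketch below shows one way to add it with `geom_abline()`. It uses the built-in `mtcars` dataset as a stand-in so it runs on its own, and the object names `demo_model`, `demo_data`, and `p` are illustrative assumptions rather than part of the tutorial.

```r
library(ggplot2)

# Stand-in model on a built-in dataset (assumption: not the tutorial's data)
demo_model <- lm(mpg ~ wt, data = mtcars)
demo_data <- data.frame(actual = mtcars$mpg, predicted = predict(demo_model))

p <- ggplot(demo_data, aes(x = actual, y = predicted)) +
  geom_point() +
  # Reference line y = x: points on it are predicted exactly
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  labs(x = "Actual Values", y = "Predicted Values",
       title = "Predicted vs. Actual Values")

print(p)
```

Points above the dashed line are over-predicted and points below it are under-predicted.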