How to Plot Observed and Predicted values in R

To visualize the discrepancies between predicted and actual values, you may want to plot the predicted values of a regression model against the observed values in R.

This tutorial demonstrates how to make this type of plot using base R and ggplot2.

Approach 1: Plot of observed and predicted values in Base R

The following code demonstrates how to construct a plot of predicted vs. actual values after fitting a multiple linear regression model in R.

Load Library and dataset

library(mlbench)
data("BostonHousing2")
head(BostonHousing2)
   town tract      lon     lat medv cmedv    crim zn indus chas   nox    rm  age
1     Nahant  2011 -70.9550 42.2550 24.0  24.0 0.00632 18  2.31    0 0.538 6.575 65.2
2 Swampscott  2021 -70.9500 42.2875 21.6  21.6 0.02731  0  7.07    0 0.469 6.421 78.9
3 Swampscott  2022 -70.9360 42.2830 34.7  34.7 0.02729  0  7.07    0 0.469 7.185 61.1
4 Marblehead  2031 -70.9280 42.2930 33.4  33.4 0.03237  0  2.18    0 0.458 6.998 45.8
5 Marblehead  2032 -70.9220 42.2980 36.2  36.2 0.06905  0  2.18    0 0.458 7.147 54.2
6 Marblehead  2033 -70.9165 42.3040 28.7  28.7 0.02985  0  2.18    0 0.458 6.430 58.7
     dis rad tax ptratio      b lstat
1 4.0900   1 296    15.3 396.90  4.98
2 4.9671   2 242    17.8 396.90  9.14
3 4.9671   2 242    17.8 392.83  4.03
4 6.0622   3 222    18.7 394.63  2.94
5 6.0622   3 222    18.7 396.90  5.33
6 6.0622   3 222    18.7 394.12  5.21
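
Fit the regression model

# medv (median home value) is the response, so predictions are on the same scale as the actual values
model <- lm(medv ~ rm + crim, data = BostonHousing2)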

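plot predicted vs. actual values

plot(x = BostonHousing2$medv, y = predict(model),
     xlab = 'Actual Values',
     ylab = 'Predicted Values',
     main = 'Predicted vs. Actual Values')
# add a diagonal reference line (intercept 0, slope 1) where predicted equals actual
abline(a = 0, b = 1)

The x-axis shows the dataset's actual values and the y-axis shows the model's predicted values. The diagonal line in the center of the plot is the reference line on which a predicted value equals the actual value; the vertical distance of a point from it is that observation's residual.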

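Points that lie close to the diagonal line are predicted well; the farther a point is from the line, the larger the prediction error for that observation. If most points cluster around the line, we may conclude that the regression model fits the data reasonably well.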
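
The visual impression can be backed up with a few numbers. The sketch below is an illustration rather than part of the original tutorial: fit_metrics() is a hypothetical helper that computes the root mean squared error, the mean absolute error, and R-squared from vectors of actual and predicted values. The example assumes a model with medv as the response and rm and crim as predictors, and it is skipped if the mlbench package is not installed.

```r
# Hypothetical helper (not part of the original tutorial):
# summarises how far predictions are from the observed values.
fit_metrics <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  residuals <- actual - predicted
  list(
    rmse = sqrt(mean(residuals^2)),   # root mean squared error
    mae  = mean(abs(residuals)),      # mean absolute error
    # coefficient of determination: 1 - SS_residual / SS_total
    r2   = 1 - sum(residuals^2) / sum((actual - mean(actual))^2)
  )
}

# Example on the housing data; skipped when mlbench is not installed.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("BostonHousing2", package = "mlbench")
  # Assumed model: median home value explained by rooms and crime rate.
  fit <- lm(medv ~ rm + crim, data = BostonHousing2)
  print(fit_metrics(BostonHousing2$medv, predict(fit)))
}
```

A lower rmse and mae, and an r2 closer to 1, correspond to points lying closer to the diagonal line in the plot.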

We can also make a data frame that displays the actual and predicted values for each data point:

data <- data.frame(actual = BostonHousing2$medv, predicted = predict(model))

view data frame values

head(data)
  actual predicted
1   24.0  25.92488
2   21.6  24.62709
3   34.7  31.03788
4   33.4  29.46740
5   36.2  30.70795
6   28.7  24.70194
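
Since the goal is to see where predictions and observations disagree, it can help to add the residual to such a table. This is a minimal sketch, assuming the actual and predicted column names used above; add_residuals() is a hypothetical helper and the small data frame is made up purely for illustration.

```r
# Hypothetical helper: append residuals and sort rows by absolute error.
add_residuals <- function(df) {
  stopifnot(all(c("actual", "predicted") %in% names(df)))
  df$residual <- df$actual - df$predicted           # positive = model under-predicts
  df[order(abs(df$residual), decreasing = TRUE), ]  # worst-predicted rows first
}

# Toy values for illustration only (not taken from the housing data).
toy <- data.frame(actual = c(10, 20, 30), predicted = c(12, 19, 36))
add_residuals(toy)
```

Applied to the data frame built above, head(add_residuals(data)) would list the observations the model predicts worst.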

Approach 2: Plot of Predicted vs. Observed Values in ggplot2

Using the ggplot2 data visualization package, the following code explains how to make a plot of predicted vs. actual values.

library(ggplot2)

plot predicted vs. actual values

ggplot(data, aes(x = actual, y = predicted)) +
  geom_point() +
  # diagonal reference line where predicted equals actual
  geom_abline(intercept = 0, slope = 1) +
  labs(x = 'Actual Values', y = 'Predicted Values', title = 'Predicted vs. Actual Values')

The predicted values from the model are displayed on the y-axis, while the actual values from the dataset are displayed on the x-axis.
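
When the same plot is needed for several models, the ggplot2 code can be wrapped in a small function. The sketch below is an illustration, not part of the original tutorial: plot_pred_vs_actual() is a hypothetical helper that assumes a data frame with actual and predicted columns, draws a dashed y = x reference line, and colours each point by its absolute error. The toy data frame is made up.

```r
library(ggplot2)

# Hypothetical helper: predicted-vs-actual scatter plot with a y = x reference line.
plot_pred_vs_actual <- function(df) {
  stopifnot(all(c("actual", "predicted") %in% names(df)))
  ggplot(df, aes(x = actual, y = predicted)) +
    geom_point(aes(colour = abs(actual - predicted))) +           # colour by error size
    geom_abline(intercept = 0, slope = 1, linetype = "dashed") +  # predicted == actual
    labs(x = "Actual Values", y = "Predicted Values",
         colour = "Absolute error",
         title = "Predicted vs. Actual Values")
}

# Toy values for illustration only (not taken from the housing data).
toy <- data.frame(actual = c(10, 20, 30), predicted = c(12, 19, 36))
p <- plot_pred_vs_actual(toy)
print(p)
```

With the tutorial's data frame, plot_pred_vs_actual(data) would produce the same kind of plot for the housing model.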
