How to Generate Kernel Density Plots in R
This article shows how to make kernel density plots in R. All you need is the density() function, which is built into R.
A kernel density plot is a form of a graph that uses a single continuous curve to show the distribution of values in a dataset.
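A kernel density plot is similar to a histogram, but it often depicts the shape of a distribution better because it does not depend on the number of bins chosen, as a histogram does. It depends instead on a smoothing bandwidth, which density() selects automatically by default.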
Let’s look at the syntax first.
Syntax: density(x, ...)
Parameters:
x: the data from which the estimate is computed
...: additional arguments for (non-default) methods
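To make the syntax more concrete, here is a minimal sketch of a default call next to a call that sets a few optional arguments. The arguments adjust, kernel, and n belong to density()'s default method; the particular values chosen here are illustrative assumptions, not recommendations.

```r
# Sample values (the same ones used later in this article)
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

# Default call: Gaussian kernel, bandwidth chosen automatically
kd_default <- density(data)

# Same data with a few optional arguments set explicitly (illustrative values)
kd_custom <- density(data,
                     adjust = 0.5,             # halve the automatic bandwidth (less smoothing)
                     kernel = "epanechnikov",  # use a different kernel shape
                     n = 256)                  # number of points at which the density is evaluated

kd_default$bw        # bandwidth used by the default call
kd_custom$bw         # half of the default bandwidth
length(kd_custom$x)  # number of evaluation points
```

A smaller bandwidth follows the data more closely and gives a bumpier curve, while a larger one gives a smoother curve.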
This article covers three approaches: a single kernel density plot, a filled kernel density plot, and multiple kernel density plots in one chart.
Let’s start with approach one.
Approach 1: One Kernel Density Plot
The following code demonstrates how to make a kernel density plot in R for a single dataset.
First, we need to create a dataset:
data <- c(5,6,8, 3, 8,4, 4, 9,6,7,4,5,5, 6, 7, 7, 10,11,14,3,5,8, 14)
Now we can use the kernel density function
kd <- density(data)
kd

Call:
	density.default(x = data)

Data: data (23 obs.);	Bandwidth 'bw' = 1.076

       x                 y
 Min.   :-0.2288   Min.   :0.0003627
 1st Qu.: 4.1356   1st Qu.:0.0191789
 Median : 8.5000   Median :0.0338549
 Mean   : 8.5000   Mean   :0.0572128
 3rd Qu.:12.8644   3rd Qu.:0.1044549
 Max.   :17.2288   Max.   :0.1438602
It’s ready to plot
plot(kd, main='Kernel Density Plot')
The x-axis shows the values in the dataset, while the y-axis shows the estimated probability density at each value. The highest points of the curve mark where the values are most concentrated.
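Because the y-axis is a density rather than a count, the area under the curve should be close to 1. The sketch below checks this numerically with a trapezoidal sum over the grid that density() returns, and also reads off the x position of the highest point. The variable names area and mode_x are introduced here for illustration only.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

# Trapezoidal rule over the evaluation grid (kd$x) and density values (kd$y)
area <- sum(diff(kd$x) * (head(kd$y, -1) + tail(kd$y, -1)) / 2)
area    # approximately 1

# x position of the highest point of the curve (the estimated mode)
mode_x <- kd$x[which.max(kd$y)]
mode_x
```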
Approach 2: Filled Kernel Density Plot
The code below shows how to make a kernel density plot with a chosen border color and fill color.
We can make use of the same data set
data <- c(5,6,8, 3, 8,4, 4, 9,6,7,4,5,5, 6, 7, 7, 10,11,14,3,5,8, 14)
As usual, we can use the density function
kd <- density(data)
Okay, now it’s ready to plot
plot(kd, main='Kernel Density Plot')
Here we can add colors and borders.
polygon(kd, col='red', border='green')
The density() function computes the density estimate for the given data, and the polygon() function then draws a filled polygon under the plotted curve, using col for the fill and border for the outline.
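A solid fill can hide grid lines or other curves drawn underneath it. One possible variation, sketched here as an assumption rather than as part of the original recipe, is to make the fill semi-transparent with adjustcolor() from the grDevices package that ships with R. The 0.3 opacity is an arbitrary illustrative value.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

# Draw the curve first so that the axes are set up
plot(kd, main = "Kernel Density Plot")

# Build a semi-transparent version of red (alpha.f scales the opacity)
fill_col <- adjustcolor("red", alpha.f = 0.3)

# Fill the area under the curve with the translucent color
polygon(kd, col = fill_col, border = "green")
```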
To add a vertical line at the mean to the density plot, use the abline() function.
abline(v = mean(data), col = "blue")
You can also overlay a density curve on a histogram in R. Call hist() first with prob = TRUE so that the histogram is drawn on the density scale, then add the density curve with lines().
hist(data, prob = TRUE)
lines(density(data), col = "blue")
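Depending on the data, the peak of the density curve can be taller than the tallest histogram bar, in which case the curve is clipped at the top of the plot. A hedged sketch of one way to avoid this is to compute the histogram without drawing it, then set ylim to cover both. The name y_max is introduced here for illustration.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

# Compute the bins without drawing, to read the bar heights on the density scale
h <- hist(data, plot = FALSE)

# Upper y limit that fits both the bars and the curve
y_max <- max(h$density, kd$y)

hist(data, prob = TRUE, ylim = c(0, y_max), main = "Histogram with Density Curve")
lines(kd, col = "blue", lwd = 2)
```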
Approach 3: Multiple Kernel Density Plots
The following R code demonstrates how to combine several kernel density curves in a single plot.
Let’s create two datasets data1 and data2.
data1 <- c(5,7,4,5,5, 6, 7, 7, 6,7,4,5,5, 6, 7, 7, 10,11,14,3,5,8)
data2 <- c(12,6,8, 3, 20,4, 5, 9,6,7,4,5,5, 6, 7, 7, 10,11,14,3,5,8, 14)
First, compute and plot the density of the first dataset.
kd1 <- density(data1)
plot(kd1, col='green', lwd=2)
Now add the second density curve to the same plot with lines().
kd2 <- density(data2)
lines(kd2, col='pink', lwd=2)
It’s worth noting that we can use similar syntax to make as many kernel density plots as we like in a single chart.
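One caveat is that plot() sets the axis limits from the first curve only, so a later curve with a wider range (data2 here reaches 20) can run off the edge. The sketch below, offered as one possible approach, computes shared limits from both estimates and adds a legend. The names x_lim and y_lim are introduced here for illustration.

```r
data1 <- c(5, 7, 4, 5, 5, 6, 7, 7, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8)
data2 <- c(12, 6, 8, 3, 20, 4, 5, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

kd1 <- density(data1)
kd2 <- density(data2)

# Axis limits that cover both curves
x_lim <- range(kd1$x, kd2$x)
y_lim <- c(0, max(kd1$y, kd2$y))

plot(kd1, col = "green", lwd = 2, xlim = x_lim, ylim = y_lim,
     main = "Kernel Density Plots")
lines(kd2, col = "pink", lwd = 2)

# Identify each curve
legend("topright", legend = c("data1", "data2"),
       col = c("green", "pink"), lwd = 2)
```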