How to Generate Kernel Density Plots in R

In this article, we’ll look at how to make kernel density plots in R. All you need is the density() function, which is built into R.

A kernel density plot is a type of graph that uses a single continuous curve to show the distribution of values in a dataset.

A kernel density plot is similar to a histogram, but it often depicts the shape of the distribution better because it does not depend on the number of bins. Its smoothness is controlled by the bandwidth instead.

Let’s look at the syntax first.

Syntax: density(x, ...)

Parameters:

x: the data from which the estimate is computed
...: additional arguments passed on to (non-default) methods
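In practice, the arguments most often changed are those of the default method. The sketch below is only an illustration: the vector name x and the particular settings are assumptions chosen for the example, while bw, adjust, kernel and n are arguments of density() in base R.

```r
# Sample data (the same values used in the examples of this article)
x <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

kd_default <- density(x)                          # Gaussian kernel, default bandwidth rule
kd_smooth  <- density(x, adjust = 2)              # multiplies the default bandwidth by 2
kd_epan    <- density(x, kernel = "epanechnikov") # a different kernel shape
kd_fine    <- density(x, n = 1024)                # evaluate the curve at 1024 points

# Compare the bandwidths actually used
kd_default$bw
kd_smooth$bw
```

A larger bandwidth gives a smoother curve, and a smaller one follows the data more closely.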

This article covers three approaches: a single kernel density plot, a filled kernel density plot, and multiple kernel density plots in one chart.

Let’s start with the first approach.

Approach 1: One Kernel Density Plot

The following code demonstrates how to make a kernel density plot in R for a single dataset.

First, we need to create a dataset:

data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

Now we can apply the density() function and print the result:

kd <- density(data)
kd
Call:
               density.default(x = data)
Data: data (23 obs.);       Bandwidth 'bw' = 1.076
       x                 y           
 Min.   :-0.2288   Min.   :0.0003627 
 1st Qu.: 4.1356   1st Qu.:0.0191789 
 Median : 8.5000   Median :0.0338549 
 Mean   : 8.5000   Mean   :0.0572128 
 3rd Qu.:12.8644   3rd Qu.:0.1044549 
 Max.   :17.2288   Max.   :0.1438602 
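The printed summary describes the object that density() returns. As a minimal sketch, the code below inspects its components and checks that the area under the estimated curve is close to 1. The Riemann-sum check and the variable names step and area are assumptions added for illustration.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

names(kd)        # components of the returned "density" object
kd$bw            # bandwidth chosen by the default rule
length(kd$x)     # number of grid points (512 by default)

# Approximate the area under the curve with a Riemann sum on the
# equally spaced grid; for a density estimate it should be close to 1
step <- diff(kd$x)[1]
area <- sum(kd$y) * step
area
```

Here kd$x holds the grid of points and kd$y the estimated density at each of them, which is what plot() draws.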

The estimate is now ready to plot:

plot(kd, main='Kernel Density Plot')

The x-axis shows the dataset’s values, while the y-axis shows the estimated density, which indicates how common values near each point are. The plot’s highest points show where the values are most concentrated.
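To read the highest point off the estimate rather than off the picture, the grid value with the largest density can be looked up directly. This is a small sketch; the name peak_x and the dashed marker line are assumptions made for illustration.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

# x value of the grid point where the estimated density is largest
peak_x <- kd$x[which.max(kd$y)]
peak_x

plot(kd, main = "Kernel Density Plot")
abline(v = peak_x, lty = 2)   # dashed vertical line at the peak
```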

Approach 2: Filled Kernel Density Plot

The code below shows how to make a kernel density plot with a chosen border color and fill color.

We can use the same dataset:

data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

As before, we apply the density() function:

kd <- density(data)

Then we draw the curve:

plot(kd, main='Kernel Density Plot')

Now we can add a fill color and a border color:

polygon(kd, col='red', border='green')

The density() function computes the density estimate of the given data, and the polygon() function draws a filled polygon under the resulting curve.
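A solid fill hides anything drawn underneath it, so a semi-transparent color is sometimes preferable. The sketch below uses adjustcolor() from base R's grDevices package. The color "steelblue" and the alpha value 0.4 are arbitrary choices for illustration, and transparency only shows on graphics devices that support it.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

plot(kd, main = "Filled Kernel Density Plot")

# Build a semi-transparent version of a named color
fill_col <- adjustcolor("steelblue", alpha.f = 0.4)

# Fill the area under the curve and outline it in the opaque color
polygon(kd, col = fill_col, border = "steelblue")
```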

To add a vertical line at the mean of the data, use the abline() function.

abline(v = mean(data), col = "blue")

We can also overlay a density curve on a histogram. Call hist() with prob = TRUE first, so that the bars are drawn on the density scale rather than as counts, then add the density curve with lines().

hist(data, prob = TRUE)
lines(density(data), col = "blue")
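If the curve's peak is taller than the tallest bar, the top of the curve is cut off, because hist() sets the axis limits from the bars alone. One way to avoid this, sketched below, is to compute both objects first and pass a shared ylim. The names h and y_max are assumptions for this example.

```r
data <- c(5, 6, 8, 3, 8, 4, 4, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)
kd <- density(data)

# Compute the histogram bins without drawing them
h <- hist(data, plot = FALSE)

# Upper y limit: the taller of the highest bar and the curve's peak
y_max <- max(h$density, kd$y)

hist(data, freq = FALSE, ylim = c(0, y_max),
     main = "Histogram with Density Curve")
lines(kd, col = "blue", lwd = 2)
```

Here freq = FALSE has the same effect as prob = TRUE: the bars are scaled so that their total area is 1.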

Approach 3: Multiple Kernel Density Plots

The following R code demonstrates how to combine multiple kernel density plots into a single chart.

Let’s create two datasets, data1 and data2.

data1 <- c(5, 7, 4, 5, 5, 6, 7, 7, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8)
data2 <- c(12, 6, 8, 3, 20, 4, 5, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

As before, we first compute and plot the density of the first dataset.

kd1 <- density(data1)
plot(kd1, col='green', lwd=2)

Now add the second density curve to the same plot with lines():

kd2 <- density(data2)
lines(kd2, col='pink', lwd=2)

The same pattern extends to any number of curves: call lines() once for each additional density. Note that the axis limits are fixed by the first plot() call, so a later curve that extends beyond them is clipped.
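When there are several datasets, it can be convenient to compute all the densities first, derive axis limits that fit every curve, and add a legend. The following is one possible sketch of that idea. The list name datasets, the helper vectors cols, x_lim and y_lim, and the legend position are assumptions made for this example.

```r
data1 <- c(5, 7, 4, 5, 5, 6, 7, 7, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8)
data2 <- c(12, 6, 8, 3, 20, 4, 5, 9, 6, 7, 4, 5, 5, 6, 7, 7, 10, 11, 14, 3, 5, 8, 14)

datasets <- list(data1 = data1, data2 = data2)
cols <- c("green", "pink")

# One density estimate per dataset
kds <- lapply(datasets, density)

# Axis limits wide enough to show every curve in full
x_lim <- range(unlist(lapply(kds, function(k) k$x)))
y_lim <- c(0, max(unlist(lapply(kds, function(k) k$y))))

# Draw the first curve with the shared limits, then add the rest
plot(kds[[1]], col = cols[1], lwd = 2, xlim = x_lim, ylim = y_lim,
     main = "Multiple Kernel Density Plots")
for (i in seq_along(kds)[-1]) {
  lines(kds[[i]], col = cols[i], lwd = 2)
}

legend("topright", legend = names(kds), col = cols, lwd = 2)
```

Adding another dataset then only requires a new entry in datasets and a matching color in cols.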
