68 95 99 Rule in R

The Empirical Rule, often known as the 68-95-99.7 rule, states that for a normally distributed dataset:

About 68% of data values fall within one standard deviation of the mean.

About 95% of data values fall within two standard deviations of the mean.

About 99.7% of data values fall within three standard deviations of the mean.

In this lesson, we’ll show you how to use R to apply the Empirical Rule to a dataset.

Using the Empirical Rule in R

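The pnorm() function in R returns the value of the normal distribution’s cumulative distribution function (CDF), that is, the probability that a normally distributed variable is less than or equal to a given value.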

The following is the fundamental syntax for this function:

pnorm(q, mean, sd)

where:

q: the value of the normally distributed random variable at which to evaluate the CDF

mean: mean of the distribution

sd: standard deviation of the distribution
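
As a quick check of how these arguments behave, here is a minimal sketch. The values 7, 5, and 2 are illustrative assumptions and do not come from a real dataset. It shows that pnorm() defaults to the standard normal distribution (mean = 0, sd = 1) and that standardising a value by hand gives the same probability as passing mean and sd directly.

```r
# With the defaults mean = 0 and sd = 1, pnorm() describes the standard normal.
# Half of the area under the curve lies below the mean.
p_at_mean <- pnorm(0)

# Illustrative values (assumptions): q = 7, mean = 5, sd = 2.
p_direct <- pnorm(7, mean = 5, sd = 2)

# Standardising by hand, z = (q - mean) / sd, gives the same probability.
p_zscore <- pnorm((7 - 5) / 2)

print(c(p_at_mean, p_direct, p_zscore))
```

This equivalence is why the calls with no mean or sd arguments, such as pnorm(1), can be read as "one standard deviation above the mean" for any normal distribution.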

To find the area under the normal distribution curve that lies within a given number of standard deviations of the mean, we can subtract the CDF at the lower bound from the CDF at the upper bound:

# find the area under the normal curve within 1 standard deviation of the mean
pnorm(1) - pnorm(-1)
[1] 0.6826895

# find the area under the normal curve within 2 standard deviations of the mean
pnorm(2) - pnorm(-2)
[1] 0.9544997

# find the area under the normal curve within 3 standard deviations of the mean
pnorm(3) - pnorm(-3)
[1] 0.9973002

We can confirm the following from the output:

About 68% (0.6827) of data values fall within one standard deviation of the mean.

About 95% (0.9545) of data values fall within two standard deviations of the mean.

About 99.7% (0.9973) of data values fall within three standard deviations of the mean.
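
The same proportions can also be checked against simulated data. The sketch below is an illustration and not part of the original lesson: the seed and the sample size are arbitrary assumptions. It draws a large normal sample, standardises it, and counts the share of values within 1, 2, and 3 standard deviations of the sample mean.

```r
set.seed(42)                # arbitrary seed, used only for reproducibility
x <- rnorm(100000)          # simulated sample from a standard normal (assumption)

# Standardise using the sample mean and sample standard deviation.
z <- (x - mean(x)) / sd(x)

# Share of observations within k standard deviations, for k = 1, 2, 3.
within_k <- sapply(1:3, function(k) mean(abs(z) <= k))
names(within_k) <- c("1 sd", "2 sd", "3 sd")

print(round(within_k, 3))
```

With a sample this large, the three proportions should land close to 0.683, 0.954, and 0.997, although the exact figures vary from sample to sample.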

The following examples demonstrate how to apply the Empirical Rule to various datasets.

Example 1: Using R to Apply the Empirical Rule to a Dataset

Let’s say we have a dataset with a mean of 5 and a standard deviation of 2 that is normally distributed.

To find the ranges that contain 68%, 95%, and 99.7% of the data, we can use the following code. The variables are named mu and sigma so that they do not mask R’s built-in mean() and sd() functions.

# define the mean and standard deviation
mu <- 5
sigma <- 2

# find the range that contains 68% of the data
mu - sigma; mu + sigma
[1] 3
[1] 7

# find the range that contains 95% of the data
mu - 2*sigma; mu + 2*sigma
[1] 1
[1] 9

# find the range that contains 99.7% of the data
mu - 3*sigma; mu + 3*sigma
[1] -1
[1] 11

From this output, we can see:

About 68% of the data falls between 3 and 7.

About 95% of the data falls between 1 and 9.

About 99.7% of the data falls between -1 and 11.
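
The three ranges can also be produced in one step. The helper below, empirical_bounds(), is a hypothetical function written for this illustration and not something provided by R or by the original lesson. It returns the bounds for k = 1, 2, 3 standard deviations together with the exact normal coverage from pnorm().

```r
# Hypothetical helper (an assumption for illustration): bounds that lie
# k standard deviations either side of the mean, plus the normal coverage.
empirical_bounds <- function(mu, sigma, k = 1:3) {
  stopifnot(sigma > 0)
  data.frame(
    k        = k,
    lower    = mu - k * sigma,
    upper    = mu + k * sigma,
    coverage = pnorm(k) - pnorm(-k)  # area within k sd of the mean
  )
}

# Same distribution as in Example 1: mean 5, standard deviation 2.
bounds <- empirical_bounds(mu = 5, sigma = 2)
print(bounds)
```

Returning a data frame keeps each range next to its coverage, which makes it easy to reuse the helper for other means and standard deviations.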

Example 2: Determining the percent of data that falls between two values

Consider a dataset that is normally distributed and has a mean of 100 and a standard deviation of 5.

Let’s say we want to know what proportion of the data in this distribution falls between 90 and 110.

To obtain the solution, we can utilise the pnorm() function:

# find the area under the normal curve between 90 and 110
pnorm(110, mean=100, sd=5) - pnorm(90, mean=100, sd=5)
[1] 0.9544997

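In this distribution, about 95.45% of the data falls between 90 and 110. This matches the Empirical Rule, because 90 and 110 lie exactly two standard deviations below and above the mean of 100.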
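
The calculation generalises to any pair of values. The sketch below wraps it in a hypothetical helper, prop_between(), a name introduced here for illustration only. It also confirms that the result for 90 and 110 equals the area within two standard deviations on the standard normal curve.

```r
# Hypothetical helper (an assumption for illustration): proportion of a
# normal distribution that falls between two values.
prop_between <- function(lower, upper, mu, sigma) {
  stopifnot(lower <= upper, sigma > 0)
  pnorm(upper, mean = mu, sd = sigma) - pnorm(lower, mean = mu, sd = sigma)
}

# Values from Example 2: mean 100, standard deviation 5.
p_example <- prop_between(90, 110, mu = 100, sigma = 5)

# 90 and 110 are two standard deviations from the mean, so this should agree.
p_two_sd <- pnorm(2) - pnorm(-2)

print(c(p_example, p_two_sd))
```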
