Kurtosis in R: What do you understand by kurtosis?
Kurtosis means bulginess. In statistics it is a numerical measure that is traditionally described as capturing the sharpness of the peak of a distribution.
A frequency distribution curve whose peak is sharper or flatter than that of a normal curve is called a kurtic curve.
If a frequency curve is more peaked than a normal curve, it is called a leptokurtic curve; if it is less peaked than a normal curve, it is called a platykurtic curve.
In terms of kurtosis, a curve with the same peakedness as the normal curve is known as a mesokurtic curve. Peakedness is judged around the mode of the frequency distribution.
How can one know about kurtosis?
Kurtosis can often be judged simply by looking at a frequency distribution curve, but visual judgment becomes difficult when the curve is only slightly kurtic.
To overcome this difficulty of subjective judgment, kurtosis is measured mathematically as β2, the ratio of the fourth central moment to the square of the second central moment (the variance): β2 = μ4 / μ2².
If the value of β2 is more than 3, the curve is leptokurtic and if less than 3, the curve is platykurtic. For a mesokurtic curve, β2=3.
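The ratio described above can be computed by hand in a few lines of R. This is only a minimal sketch: the helper name `beta2` is made up for illustration, and it uses moments that divide by n, which is one of several conventions.

```r
# Hypothetical helper (not from any package): Pearson's beta2,
# the fourth central moment divided by the squared second central moment.
beta2 <- function(x) {
  d  <- x - mean(x)   # deviations from the mean
  m2 <- mean(d^2)     # second central moment (divides by n)
  m4 <- mean(d^4)     # fourth central moment (divides by n)
  m4 / m2^2
}

set.seed(42)                      # arbitrary seed, only for reproducibility
normal_sample  <- rnorm(100000)   # normal data: beta2 should be close to 3
uniform_sample <- runif(100000)   # uniform data: beta2 should be close to 1.8

b2_normal  <- beta2(normal_sample)
b2_uniform <- beta2(uniform_sample)
b2_normal
b2_uniform
```

A value near 3 corresponds to the mesokurtic case, while the uniform sample lands well below 3 and would be labelled platykurtic.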
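Let us take an example.
We can use the kurtosis() function from the e1071 package to compute the kurtosis. Note that this function returns the excess kurtosis (β2 − 3), so its result is compared with 0 rather than with 3.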
library(e1071)
duration<-faithful$eruptions
duration
[1] 3.6 1.8 3.3 2.3 4.5 2.9 4.7 3.6 1.9 4.3 1.8 3.9 4.2 1.8 4.7 2.2 1.8 4.8 1.6 4.2 1.8 1.8 3.4 3.1 4.5
[26] 3.6 2.0 4.1 3.8 4.4 4.3 4.5 3.4 4.0 3.8 2.0 1.9 4.8 1.8 4.8 4.3 1.9 4.6 1.8 4.5 3.3 3.8 2.1 4.6 2.0
[51] 4.8 4.7 1.8 4.8 1.7 4.9 3.7 1.7 4.6 4.3 2.2 4.5 1.8 4.8 1.8 4.4 4.2 4.7 2.1 4.7 4.0 2.0 4.5 4.0 2.0
[76] 5.1 2.0 4.6 3.9 3.6 4.1 4.3 4.1 2.6 4.1 4.9 4.0 4.5 2.2 4.0 2.2 4.3 1.9 4.8 1.8 4.3 4.7 3.8 1.9 4.9
[101] 2.5 4.4 2.1 4.5 4.0 1.9 4.7 1.8 4.8 3.7 4.7 2.3 4.9 4.4 1.7 4.6 2.3 4.6 1.8 4.4 2.6 4.1 4.2 2.0 4.6
[126] 3.8 1.9 4.5 2.3 4.6 1.9 4.2 2.8 4.3 1.8 4.4 1.9 4.9 2.0 3.7 4.2 2.2 4.5 4.8 4.3 2.0 4.6 2.0 5.1 1.8
[151] 5.0 4.0 2.4 4.6 3.6 4.0 4.5 4.1 1.8 4.0 2.2 4.2 2.0 3.8 3.5 4.6 2.4 5.0 1.9 4.6 1.9 2.1 4.6 3.3 4.2
[176] 4.3 4.5 2.4 4.0 4.2 1.9 4.6 4.2 3.8 2.0 4.4 4.1 1.8 4.4 2.2 4.8 1.8 4.8 4.1 4.0 4.2 3.5 4.4 2.2 4.7
[201] 2.1 4.3 4.1 1.9 4.6 1.8 4.4 3.8 1.9 4.5 2.4 4.7 1.9 3.8 3.4 4.2 2.4 4.8 2.0 4.2 1.9 4.3 1.8 4.5 4.0
[226] 4.1 4.1 4.3 3.9 4.5 4.1 2.4 4.2 2.2 4.4 1.9 1.8 4.3 4.0 2.3 4.2 2.4 4.9 2.9 4.6 3.8 2.1 4.4 2.1 4.3
[251] 2.2 4.4 3.6 4.5 4.2 3.8 3.9 4.4 2.0 4.3 4.8 4.5 1.8 4.2 2.0 2.2 4.8 4.1 2.1 4.4 1.8 4.5
kurtosis(duration)
[1] -1.5
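To see where this number comes from, the following sketch recomputes it without the package. It assumes the default estimator documented for e1071 (type 3: the fourth central moment divided by the fourth power of the sample standard deviation, minus 3); the variable names are illustrative only.

```r
duration <- faithful$eruptions

d  <- duration - mean(duration)
m4 <- mean(d^4)                       # fourth central moment (divides by n)

# Assumed to mirror e1071's default (type = 3): m4 / s^4 - 3,
# where s is the sample standard deviation returned by sd().
excess_by_hand <- m4 / sd(duration)^4 - 3
excess_by_hand

# Adding 3 puts the value back on the beta2 scale used in the text.
beta2_scale <- excess_by_hand + 3
beta2_scale

# Optional cross-check, only run if e1071 happens to be installed.
if (requireNamespace("e1071", quietly = TRUE)) {
  print(e1071::kurtosis(duration))            # default, type = 3
  print(e1071::kurtosis(duration, type = 1))  # m4 / m2^2 - 3
}
```

The hand-computed value is an excess kurtosis, so it is read against 0; the value on the β2 scale is read against 3.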
Let’s plot the histogram of the values.
hist(duration)
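For a visual comparison with the normal shape, one option is to draw the histogram on the density scale and overlay a normal curve with the same mean and standard deviation. This is just a sketch; the title, axis label and line style are arbitrary choices.

```r
duration <- faithful$eruptions

# Histogram on the density scale so a density curve can be overlaid.
h <- hist(duration, freq = FALSE,
          main = "Eruption duration with normal reference",
          xlab = "Duration (minutes)")

# Normal density with the sample mean and standard deviation, for reference.
curve(dnorm(x, mean = mean(duration), sd = sd(duration)),
      add = TRUE, lty = 2)
```

The dashed curve shows what a bell-shaped distribution with the same mean and spread would look like, which makes any departure in the histogram easier to see.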
Conclusion
The excess kurtosis is -1.5116, which is less than 0 (equivalently, β2 ≈ 1.49, which is less than 3). This indicates that the distribution of eruption durations is platykurtic. Another indication is that its histogram is not bell-shaped.
Please read https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4321753/
Thanks for the link.
Kurtosis does not measure peak at all. You can have infinitely peaked distributions with low kurtosis (e.g., beta(.5,1)), and you can have distributions that appear perfectly flat over 99.99% of the observable data that have infinite kurtosis (e.g., .9999U(0,1) + .0001Cauchy). Kurtosis measures tail weight only, and nothing about peakedness.
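The beta(.5,1) example from the comment above can be checked numerically. The sketch below is a rough simulation with an arbitrary seed and sample size; it estimates β2 from simulated draws and evaluates the density near zero.

```r
set.seed(1)                                  # arbitrary seed
x <- rbeta(100000, shape1 = 0.5, shape2 = 1)

d <- x - mean(x)
b2_beta <- mean(d^4) / mean(d^2)^2           # beta2 estimated from the sample
b2_beta                                      # theory gives about 2.14, below 3

# The density 0.5 * x^(-0.5) grows without bound as x approaches 0.
peak_density <- dbeta(c(1e-2, 1e-4, 1e-6), shape1 = 0.5, shape2 = 1)
peak_density
```

The density keeps rising as x approaches 0, yet the estimated β2 stays below 3, which is consistent with the comment's point.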