Weighted Standard Deviation in R With Example

by finnstats

The weighted standard deviation measures the dispersion of the values in a dataset when some values carry more weight than others.

To calculate a weighted standard deviation, use the following formula.

$$ s_w = \sqrt{\frac{\sum_{i=1}^{N} w_i (x_i - \bar{x})^2}{\frac{M-1}{M} \sum_{i=1}^{N} w_i}} $$

where:

N: The total number of observations

M: The number of non-zero weights

wi: A vector of weights

xi: A vector of data values

x̄: The weighted mean
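The definitions above can be turned into a few lines of base R. This is a minimal sketch, assuming the formula divides the weighted sum of squared deviations by ((M - 1) / M) times the sum of the weights; the function name formula_weighted_sd and the sample vectors are illustrative and do not come from any package.

```r
# Sketch of the weighted standard deviation formula (base R only).
# Assumption: denominator is ((M - 1) / M) * sum(w), M = number of non-zero weights.
formula_weighted_sd <- function(x, w) {
  stopifnot(length(x) == length(w), all(w >= 0))
  m <- sum(w != 0)                      # M: number of non-zero weights
  x_bar <- sum(w * x) / sum(w)          # weighted mean
  numerator <- sum(w * (x - x_bar)^2)   # weighted sum of squared deviations
  denominator <- ((m - 1) / m) * sum(w)
  sqrt(numerator / denominator)
}

# Illustrative values (hypothetical, not from the article)
x_demo <- c(2, 4, 4, 5)
w_demo <- c(1, 1, 1, 1)

# With equal weights the result reduces to the ordinary sample standard deviation
formula_weighted_sd(x_demo, w_demo)
sd(x_demo)
```

When every weight is equal, M equals N and the denominator becomes N - 1 times the common weight, which is why the two calls agree.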

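The wtd.var() function from the Hmisc package is the simplest way to calculate a weighted standard deviation in R: it returns the weighted variance, and its square root is the weighted standard deviation. Note that, with its default arguments, wtd.var() treats the weights as frequency weights and divides by sum(weights) - 1, so its result can differ from the formula above. It uses the following syntax.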

Let’s define the data values

x <- c(2, 10, 10, 3, ...)

Now, we can define the weights

wt <- c(1, 1, 1, 2, ...)

Let’s figure out what the weighted variance is.

weighted_var <- wtd.var(x, wt)

The weighted standard deviation can now be calculated.

weighted_sd <- sqrt(weighted_var)
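To make the role of the weights concrete, the sketch below assumes the default behaviour of wtd.var() (weights read as frequencies) and mimics it in base R. The helper freq_weighted_sd and the whole-number weights are illustrative assumptions, not part of Hmisc.

```r
# Base R sketch of a frequency-weighted standard deviation.
# Assumption: mirrors wtd.var() defaults (normwt = FALSE, method = "unbiased").
freq_weighted_sd <- function(x, w) {
  stopifnot(length(x) == length(w), sum(w) > 1)
  x_bar <- sum(w * x) / sum(w)               # weighted mean
  sqrt(sum(w * (x - x_bar)^2) / (sum(w) - 1))
}

# Hypothetical data with whole-number weights
x_demo <- c(5, 7, 9)
w_demo <- c(3, 1, 2)

# Counting each value w times gives the same answer as the ordinary sd()
freq_weighted_sd(x_demo, w_demo)
sd(rep(x_demo, times = w_demo))
```

Under this reading, a weight of 3 means "this value was observed three times", which is why the two results coincide for whole-number weights.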

The examples below demonstrate how to utilize this function in practice.

Example 1: One-Vector Weighted Standard Deviation

In R, the weighted standard deviation for a single vector can be calculated using the code below.

library(Hmisc)

Let’s define data values

x <- c(10, 11, 12, 21, 22, 30, 23, 33, 33, 12)

Now we can define the weights

wt <- c(2, 1, 1.2, 3, 2, 1, 1.5, 2, 2, 2)

Let’s calculate the weighted variance

weighted_var <- wtd.var(x, wt)

Now we can calculate the weighted standard deviation.

sqrt(weighted_var)
[1] 8.707209

The weighted standard deviation turns out to be 8.707209.
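The result can be cross-checked by hand in base R. This sketch assumes wtd.var() was called with its default arguments, in which case the variance is the weighted sum of squared deviations divided by sum(wt) - 1; the intermediate variable names are illustrative.

```r
# Cross-check of Example 1 without Hmisc (assumes wtd.var() defaults)
x  <- c(10, 11, 12, 21, 22, 30, 23, 33, 33, 12)
wt <- c(2, 1, 1.2, 3, 2, 1, 1.5, 2, 2, 2)

weighted_mean <- sum(wt * x) / sum(wt)           # 372.9 / 17.7
sum_sq_dev    <- sum(wt * (x - weighted_mean)^2) # weighted squared deviations
manual_var    <- sum_sq_dev / (sum(wt) - 1)      # divide by 17.7 - 1 = 16.7

sqrt(manual_var)   # about 8.707209, the value reported above
```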

Example 2: Weighted Standard Deviation for a Data Frame Column

In R, the weighted standard deviation for one column of a data frame may be calculated using the following code.

library(Hmisc)

Create a data frame

df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 wins=c(21, 19, 10, 10, 12, 11, 15, 12),
                 points=c(1.5, 3, 2, 3, 2, 1, 1, 2))
df

  team wins points
1    A   21    1.5
2    A   19    3.0
3    A   10    2.0
4    A   10    3.0
5    A   12    2.0
6    B   11    1.0
7    B   15    1.0
8    C   12    2.0

Let’s define weights

wt <- c(1, 2, 1.5, 3, 2, 2, 2, 1)

Calculate the weighted standard deviation of points

sqrt(wtd.var(df$points, wt))
[1] 0.8269873

The points column’s weighted standard deviation comes out to be 0.8269873.

Example 3: Weighted Standard Deviation for Data Frames with Multiple Columns

The following code demonstrates how to calculate the weighted standard deviation for multiple columns of a data frame in R using the sapply() function.

library(Hmisc)

Let’s define a data frame

df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 wins=c(21, 19, 10, 10, 12, 11, 15, 12),
                 points=c(1.5, 3, 2, 3, 2, 1, 1, 2))

Let’s define weights

wt <- c(1, 2, 1.5, 3, 2, 2, 2, 1)

Calculate the weighted standard deviation of points and wins

sapply(df[c('wins', 'points')], function(x) sqrt(wtd.var(x, wt)))
     wins    points
3.7972229 0.8269873

The weighted standard deviation for the wins column is approximately 3.797, and the weighted standard deviation for the points column is approximately 0.827.
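sapply() chooses its return type from the results, so a stricter variant is vapply(), which fails if a column does not yield exactly one number. The sketch below is one possible approach under two assumptions: it swaps wtd.var() for a base R helper (weighted_sd_base, a hypothetical name) that mirrors the default frequency-weight calculation, and it selects the numeric columns automatically rather than by name.

```r
# Hypothetical base R helper mirroring sqrt(wtd.var(x, wt)) with default arguments
weighted_sd_base <- function(x, w) {
  x_bar <- sum(w * x) / sum(w)
  sqrt(sum(w * (x - x_bar)^2) / (sum(w) - 1))
}

df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 wins=c(21, 19, 10, 10, 12, 11, 15, 12),
                 points=c(1.5, 3, 2, 3, 2, 1, 1, 2))
wt <- c(1, 2, 1.5, 3, 2, 2, 2, 1)

# Keep only numeric columns, then require exactly one numeric value per column
numeric_cols <- df[vapply(df, is.numeric, logical(1))]
result <- vapply(numeric_cols, weighted_sd_base, numeric(1), w = wt)
result
```

The result is a named numeric vector with one entry per numeric column, so the team column is skipped without having to list the other columns explicitly.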


Anonymous says:
September 21 at 8:24 pm
The weighted standard deviation given by sqrt(Hmisc::wtd.var) does not agree with the formula given on top of this page because Hmisc package applies the “frequency weights”-approach
(https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_variance).
