# Weighted Standard Deviation in R With Example

When some values in a dataset carry more weight than others, the weighted standard deviation is a useful measure of the dispersion of those values.

To calculate a weighted standard deviation, use the following formula:

$$s_w = \sqrt{\frac{\sum_{i=1}^{N} w_i \,(x_i - \bar{x})^2}{\frac{M-1}{M} \sum_{i=1}^{N} w_i}}$$

where:

N: the total number of observations

M: the number of non-zero weights

w_i: the i-th element of the weights vector

x_i: the i-th element of the vector of data values

x̄: the weighted average of the data values
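The definitions above can be sketched directly in base R. This is a minimal illustration, assuming the formula divides the weighted sum of squared deviations by (M - 1)/M times the sum of the weights, where M is the number of non-zero weights. The function name `weighted_sd_formula` and the sample values are hypothetical and not part of any package.

```r
# Hypothetical helper: weighted standard deviation from first principles,
# assuming the denominator ((M - 1) / M) * sum(w).
weighted_sd_formula <- function(x, w) {
  stopifnot(length(x) == length(w), all(w >= 0))
  m <- sum(w != 0)                   # M: number of non-zero weights
  stopifnot(m > 1)                   # at least two weighted observations needed
  xbar <- sum(w * x) / sum(w)        # weighted average
  num <- sum(w * (x - xbar)^2)       # weighted sum of squared deviations
  sqrt(num / ((m - 1) / m * sum(w)))
}

# With equal weights the result reduces to the ordinary sample standard deviation
x <- c(2, 4, 4, 5)
all.equal(weighted_sd_formula(x, rep(1, 4)), sd(x))
# [1] TRUE
```

The equal-weights check is a quick sanity test: when every weight is 1, the denominator becomes N - 1, which is the usual sample variance denominator.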

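The wtd.var() function from the Hmisc package is the simplest way to calculate a weighted standard deviation in R: compute the weighted variance, then take its square root. Note that by default wtd.var() treats the weights as frequency weights and divides by the sum of the weights minus one, so its result can differ from the formula above. The basic syntax is as follows.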

Load the Hmisc package

`library(Hmisc)`

Let’s define the data values

`x <- c(2, 10, 10, 3, ...)`

Now, we can define the weights

`wt <- c(1, 1, 1, 2, ...)`

Let’s figure out what the weighted variance is.

`weighted_var <- wtd.var(x, wt)`

The weighted standard deviation can now be calculated.

`weighted_sd <- sqrt(weighted_var)`

The examples below demonstrate how to use this function in practice.

## Example 1: One-Vector Weighted Standard Deviation

In R, the weighted standard deviation for a single vector can be calculated using the code below.

`library(Hmisc)`

Let’s define data values

`x <- c(10, 11, 12, 21, 22, 30, 23, 33, 33, 12)`

Now we can define the weights

`wt <- c(2, 1, 1.2, 3, 2, 1, 1.5, 2, 2, 2)`

Let’s calculate the weighted variance

`weighted_var <- wtd.var(x, wt)`

Now we can calculate the weighted standard deviation.

```
sqrt(weighted_var)
[1] 8.707209
```

The weighted standard deviation turns out to be 8.707209.
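As a cross-check, the value above can be reproduced in base R. The sketch below assumes wtd.var() is called with its default settings, in which case the weights are treated as frequency weights and the weighted sum of squared deviations is divided by the sum of the weights minus one. For comparison it also evaluates the alternative denominator (M - 1)/M times the sum of the weights, with M the number of non-zero weights. The variable names are illustrative.

```r
x  <- c(10, 11, 12, 21, 22, 30, 23, 33, 33, 12)
wt <- c(2, 1, 1.2, 3, 2, 1, 1.5, 2, 2, 2)

xbar <- sum(wt * x) / sum(wt)       # weighted mean
ss   <- sum(wt * (x - xbar)^2)      # weighted sum of squared deviations

# Frequency-weight version: denominator is sum(wt) - 1
manual_sd <- sqrt(ss / (sum(wt) - 1))
round(manual_sd, 6)
# [1] 8.707209

# Alternative version: denominator is ((M - 1) / M) * sum(wt)
m <- sum(wt != 0)
formula_sd <- sqrt(ss / ((m - 1) / m * sum(wt)))
round(formula_sd, 3)
# [1] 8.915
```

The two results differ only because of the denominator, so it is worth deciding which weighting convention fits the data before reporting a value.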

## Example 2: Weighted Standard Deviation for a Data Frame Column

In R, the weighted standard deviation for one column of a data frame may be calculated using the following code.

`library(Hmisc)`

Create a data frame

```
df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 wins=c(21, 19, 10, 10, 12, 11, 15, 12),
                 points=c(1.5, 3, 2, 3, 2, 1, 1, 2))
df
```
```
  team wins points
1    A   21    1.5
2    A   19    3.0
3    A   10    2.0
4    A   10    3.0
5    A   12    2.0
6    B   11    1.0
7    B   15    1.0
8    C   12    2.0
```

Let’s define weights

`wt <- c(1, 2, 1.5, 3, 2, 2, 2, 1)`

Calculate the weighted standard deviation of points

```
sqrt(wtd.var(df$points, wt))
[1] 0.8269873
```

The points column’s weighted standard deviation comes out to be 0.8269873.

## Example 3: Weighted Standard Deviation for Data Frames with Multiple Columns

The following code demonstrates how to calculate the weighted standard deviation for multiple columns of a data frame in R using the sapply() function.

`library(Hmisc)`

Let’s define a data frame

```
df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 wins=c(21, 19, 10, 10, 12, 11, 15, 12),
                 points=c(1.5, 3, 2, 3, 2, 1, 1, 2))
```

Let’s define weights

`wt <- c(1, 2, 1.5, 3, 2, 2, 2, 1)`

Calculate the weighted standard deviation of points and wins

```
sapply(df[c('wins', 'points')], function(x) sqrt(wtd.var(x, wt)))
     wins    points
3.7972229 0.8269873
```

The weighted standard deviation for the wins column is about 3.80 and the weighted standard deviation for the points column is about 0.827.
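When a data frame has many numeric columns, listing them by name gets tedious. The sketch below selects the numeric columns automatically and uses vapply(), which checks that each result is a single number. To stay self-contained it relies on a small hypothetical helper, `freq_weighted_sd()`, which mirrors the frequency-weight calculation that wtd.var() is assumed to perform by default; substituting sqrt(wtd.var(x, wt)) should give the same values.

```r
df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'C'),
                 wins=c(21, 19, 10, 10, 12, 11, 15, 12),
                 points=c(1.5, 3, 2, 3, 2, 1, 1, 2))
wt <- c(1, 2, 1.5, 3, 2, 2, 2, 1)

# Hypothetical helper (not part of Hmisc): frequency-weight standard deviation
freq_weighted_sd <- function(x, w) {
  xbar <- sum(w * x) / sum(w)
  sqrt(sum(w * (x - xbar)^2) / (sum(w) - 1))
}

# Identify the numeric columns, then apply the helper to each of them
numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
result <- vapply(df[numeric_cols], freq_weighted_sd, numeric(1), w = wt)
result
```

Here `result` is a named numeric vector with one entry per numeric column, and the non-numeric team column is skipped without having to exclude it by hand.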

### 1 Response

1. Anonymous says:

The weighted standard deviation given by sqrt(Hmisc::wtd.var()) does not agree with the formula given at the top of this page, because the Hmisc package applies the “frequency weights” approach (https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_variance).