Cumulative Sum calculation in R

Using the dplyr package in R, you can calculate the cumulative sum of a column with either of the following approaches.

Approach 1: Calculate Cumulative Sum of One Column

df %>% mutate(cum_sum = cumsum(var1))

Approach 2: Calculate Cumulative Sum by Group

df %>% group_by(var1) %>% mutate(cum_sum = cumsum(var2))

The examples below demonstrate how to apply each strategy in practice.

Example 1: Using dplyr, calculate the cumulative sum.

Suppose we have a data frame of daily sales.

Let’s make the dataset

df <- data.frame(day=c(1, 2, 3, 4, 5, 6, 7, 8),
                 sales=c(57, 42, 50, 99, 59, 51, 58, 45))

Now we can view the dataset

df
  day sales
1   1    57
2   2    42
3   3    50
4   4    99
5   5    59
6   6    51
7   7    58
8   8    45

To create a new column that holds the cumulative sum of the values in the ‘sales’ column, use the following code.

library(dplyr)

Let’s calculate the cumulative sum of sales

df %>% mutate(cum_sales = cumsum(sales))
  day sales cum_sales
1   1    57        57
2   2    42        99
3   3    50       149
4   4    99       248
5   5    59       307
6   6    51       358
7   7    58       416
8   8    45       461
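
Because cumsum() accumulates in row order, a running total is only meaningful when the rows are already in the intended order. The sketch below is an illustration rather than part of the original example: it assumes a hypothetical data frame, shuffled, that holds the first four days of the data above out of order, and it sorts by day with arrange() before accumulating.

```r
library(dplyr)

# Hypothetical input: the first four days from the example, deliberately out of order
shuffled <- data.frame(day = c(3, 1, 4, 2),
                       sales = c(50, 57, 99, 42))

# Sort by day first so the running total follows the calendar, then accumulate
ordered_totals <- shuffled %>%
  arrange(day) %>%
  mutate(cum_sales = cumsum(sales))

ordered_totals
```

After sorting, the running totals match the first four rows of the output above. Without arrange(), cumsum() would simply add the values in whatever order the rows happen to appear.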

Example 2: Using dplyr, calculate the Cumulative Sum by Group.

Suppose we have a data frame of daily sales for two stores.

Make a dataset

df <- data.frame(store=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 day=c(1, 2, 3, 4, 1, 2, 3, 4),
                 sales=c(87, 82, 80, 98, 98, 81, 88, 83))

View the dataset now

df
  store day sales
1     X   1    87
2     X   2    82
3     X   3    80
4     X   4    98
5     Y   1    98
6     Y   2    81
7     Y   3    88
8     Y   4    83

To construct a new column that holds the cumulative sum of the values in the ‘sales’ column, grouped by the ‘store’ column, we can use the following code:

library(dplyr)

Now we can calculate the cumulative sum of sales by store.

df %>% group_by(store) %>% mutate(cum_sales = cumsum(sales))
  store   day sales cum_sales
  <chr> <dbl> <dbl>     <dbl>
1 X         1    87        87
2 X         2    82       169
3 X         3    80       249
4 X         4    98       347
5 Y         1    98        98
6 Y         2    81       179
7 Y         3    88       267
8 Y         4    83       350
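
Two follow-ups are often useful after a grouped mutate(). The sketch below goes beyond the original example and uses the illustrative names by_store and base_cum: ungroup() drops the grouping so that later steps operate on the whole table again, and base R's ave() gives a cross-check of the per-store totals that does not rely on dplyr.

```r
library(dplyr)

# Same store data as in the example above
df <- data.frame(store = c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 day = c(1, 2, 3, 4, 1, 2, 3, 4),
                 sales = c(87, 82, 80, 98, 98, 81, 88, 83))

# Cumulative sum within each store, then drop the grouping
by_store <- df %>%
  group_by(store) %>%
  mutate(cum_sales = cumsum(sales)) %>%
  ungroup()

# Base R cross-check: apply cumsum() separately within each store
base_cum <- ave(df$sales, df$store, FUN = cumsum)

# TRUE when both approaches agree
all.equal(by_store$cum_sales, base_cum)
```

Whether to call ungroup() is a judgment call: leaving the result grouped is convenient if more per-store calculations follow, while ungrouping avoids surprises in later whole-table summaries.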
