Arrange Data by Month in R with example

To group data by month in R, use the floor_date() function from the lubridate package. It rounds each date down to the first day of its month, so all rows from the same month share a single grouping value.

The basic syntax for this approach is as follows.

library(tidyverse)
df %>%
    group_by(month = lubridate::floor_date(date_column, 'month')) %>%
    summarize(sum = sum(value_column))
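
To see why this grouping works, it can help to call floor_date() on its own. The following is a minimal sketch; the dates vector is made up for illustration and is not part of the example that follows.

```r
library(lubridate)

# Illustrative dates (an assumption, not taken from the example data)
dates <- as.Date(c("2022-01-05", "2022-01-31", "2022-02-18"))

# Round each date down to the first day of its month
floored <- floor_date(dates, "month")
floored
# [1] "2022-01-01" "2022-01-01" "2022-02-01"
```

Both January dates collapse to 2022-01-01, which is what lets group_by() treat them as one group.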

The example below shows how to use this function in practice.

Example: Arrange Data by Month in R

Assume we have the following R data frame, which indicates total sales of a particular item on various dates:

# Create the data frame

df <- data.frame(date=as.Date(c('1/5/2022', '1/10/2022', '2/12/2022', '2/18/2022',
                                '3/15/2022', '3/20/2022', '3/30/2022'), '%m/%d/%Y'),
                 sales=c(22, 11, 32, 14, 15, 22, 33))

# View the data frame

df
       date   sales
1 2022-01-05    22
2 2022-01-10    11
3 2022-02-12    32
4 2022-02-18    14
5 2022-03-15    15
6 2022-03-20    22
7 2022-03-30    33

To determine the total sales for each month, we can use the following code.

library(tidyverse)

# Group data by month and sum sales

df %>%
    group_by(month = lubridate::floor_date(date, 'month')) %>%
    summarize(sum_of_sales = sum(sales))
  month      sum_of_sales
  <date>            <dbl>
1 2022-01-01           33
2 2022-02-01           46
3 2022-03-01           70

We can observe the following from the output.

In January, a total of 33 sales were made.

In February, a total of 46 sales were made.

In March, a total of 70 sales were made.
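
The month column in the output is a date set to the first day of each month. If a shorter label is preferred for reporting, one option is to format it after summarizing. This is a sketch only: the month_label column name is an assumption, and dplyr and lubridate are loaded directly here rather than through tidyverse.

```r
library(dplyr)
library(lubridate)

# Same data as in the example above
df <- data.frame(
  date = as.Date(c("1/5/2022", "1/10/2022", "2/12/2022", "2/18/2022",
                   "3/15/2022", "3/20/2022", "3/30/2022"), "%m/%d/%Y"),
  sales = c(22, 11, 32, 14, 15, 22, 33)
)

monthly <- df %>%
  group_by(month = floor_date(date, "month")) %>%
  summarize(sum_of_sales = sum(sales)) %>%
  # Add a year-month text label such as "2022-01" (hypothetical column name)
  mutate(month_label = format(month, "%Y-%m"))

monthly
```

Keeping the original month column as a date preserves the correct sort order and keeps the same month in different years apart.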

We can also use another metric to aggregate the data.

For example, we can calculate the highest single-day sales within each month.

library(tidyverse)

# Group data by month and find the maximum sales

df %>%
    group_by(month = lubridate::floor_date(date, 'month')) %>%
    summarize(max_of_sales = max(sales))
  month      max_of_sales
  <date>            <dbl>
1 2022-01-01           22
2 2022-02-01           32
3 2022-03-01           33

We can observe the following from the output.

In January, the highest number of sales in a single day was 22.

In February, the highest number of sales in a single day was 32.

In March, the highest number of sales in a single day was 33.

Within the summarize() function, you can use any summary measure you choose, such as mean() or median().
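
Several measures can also be computed in a single call. The following is a sketch under the same assumptions as the example above; the output column names (total_sales, mean_sales, max_sales, n_days) are chosen here for illustration.

```r
library(dplyr)
library(lubridate)

# Same data as in the example above
df <- data.frame(
  date = as.Date(c("1/5/2022", "1/10/2022", "2/12/2022", "2/18/2022",
                   "3/15/2022", "3/20/2022", "3/30/2022"), "%m/%d/%Y"),
  sales = c(22, 11, 32, 14, 15, 22, 33)
)

monthly_stats <- df %>%
  group_by(month = floor_date(date, "month")) %>%
  summarize(
    total_sales = sum(sales),   # sum of sales in the month
    mean_sales  = mean(sales),  # average sales per recorded day
    max_sales   = max(sales),   # highest single-day sales
    n_days      = n()           # number of rows (dates) in the month
  )

monthly_stats
```

Each argument to summarize() becomes one column in the result, so the monthly totals and maximums shown earlier appear side by side.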
