Age structure diagram in R

An age structure diagram, also known as a population pyramid, is a graph that depicts a population’s age and gender distribution.

It’s a helpful chart for quickly grasping a population’s makeup as well as the current trend in population growth.

A rectangular population pyramid indicates that a population is growing at a slower rate, with older generations being replaced by new generations of nearly equal size.

A pyramid-shaped population pyramid indicates that a population is growing at a faster rate, with older generations producing larger new generations.

The two genders are displayed on the left and right sides of the chart, age is displayed on the y-axis, and the percentage or size of the population is displayed on the x-axis.

This R lesson shows you how to make a population pyramid.

Age structure diagram in R

Consider the following dataset, which describes the percentage make-up of a population by age (1 to 100 years) and gender (M = “Male,” F = “Female”).

Let’s make this example reproducible

set.seed(123)

Now we can create a data frame

df <- data.frame(age = rep(1:100, 2), gender = rep(c("M", "F"), each = 100))
head(df)
  age gender
1   1      M
2   2      M
3   3      M
4   4      M
5   5      M
6   6      M

Now we can add the population variable into the above data frame.

df$population <- 1/sqrt(df$age) * runif(200, 15000, 20000)
head(df)
  age gender population
1   1      M  18917.779
2   2      M  10876.846
3   3      M  11247.496
4   4      M   7547.882
5   5      M   8638.949
6   6      M   7941.566

Let’s convert the population variable into a percentage of the total population:

df$population <- df$population / sum(df$population) * 100
head(df)
  age gender population
1   1      M  0.5803445
2   2      M  0.3336711
3   3      M  0.3450417
4   4      M  0.2315479
5   5      M  0.2650188
6   6      M  0.2436250
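
As a quick sanity check on the conversion above, the percentages should add up to 100 across both genders. The snippet below is a minimal sketch that rebuilds the example data (assuming 100 rows per gender) and uses only base R; the names total_share and share_by_gender are introduced here for illustration.

```r
# Rebuild the example data so this snippet runs on its own.
set.seed(123)
df <- data.frame(age = rep(1:100, 2),
                 gender = rep(c("M", "F"), each = 100))
df$population <- 1 / sqrt(df$age) * runif(200, 15000, 20000)
df$population <- df$population / sum(df$population) * 100

# The shares across both genders should add up to 100.
total_share <- sum(df$population)

# Share of the whole population held by each gender (illustrative helper).
share_by_gender <- tapply(df$population, df$gender, sum)

print(total_share)
print(share_by_gender)
```

If total_share is not (numerically) 100, the percentage step was probably run twice or on a different column.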

Using the ggplot2 library, we can make a basic population pyramid for this dataset:

library(ggplot2)

Now let’s create a population pyramid

ggplot(df, aes(x = age, fill = gender,
                 y = ifelse(test = gender == "M",
                            yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(df$population) * c(-1,1)) +
  coord_flip()+ylab("")
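
The ifelse() inside aes() is what pushes the male bars to the left: male values are negated, female values stay positive, and labels = abs hides the minus signs on the axis. One possible variation, sketched below, is to store the signed values in a column first so the plotting call is easier to read. The column name signed_pop is a hypothetical helper, not part of the original dataset, and geom_col() is used as the shorthand for geom_bar(stat = "identity").

```r
# Rebuild the example data so this snippet runs on its own.
set.seed(123)
df <- data.frame(age = rep(1:100, 2),
                 gender = rep(c("M", "F"), each = 100))
df$population <- 1 / sqrt(df$age) * runif(200, 15000, 20000)
df$population <- df$population / sum(df$population) * 100

# 'signed_pop' is a hypothetical helper column: negative for males so their
# bars extend to the left, positive for females so theirs extend to the right.
df$signed_pop <- ifelse(df$gender == "M", -df$population, df$population)

# Only draw the chart if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = age, y = signed_pop, fill = gender)) +
    geom_col() +  # equivalent to geom_bar(stat = "identity")
    scale_y_continuous(labels = abs,
                       limits = max(df$population) * c(-1, 1)) +
    coord_flip() +
    ylab("")
  print(p)
}
```

The resulting chart should match the one above; only the place where the sign flip happens has moved.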

Adding Labels and Titles

Using the labs() function, we can add a title and axis labels to the population pyramid:

ggplot(df, aes(x = age, fill = gender,
                 y = ifelse(test = gender == "M",
                            yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(df$population) * c(-1,1)) +
  labs(title = "Age Structure Diagram", x = "Age", y = "Percentage of population") +
  coord_flip()

Changing the Colours

Using the scale_colour_manual() function, we can change the two colours used to represent the genders:

ggplot(df, aes(x = age, fill = gender,
                 y = ifelse(test = gender == "M",
                            yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(df$population) * c(-1,1)) +
  labs(title = "Age Structure Diagram", x = "Age", y = "Percentage of population") +
  scale_colour_manual(values = c("red", "green"),
                      aesthetics = c("colour", "fill")) +
  coord_flip()
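
When the values vector is unnamed, ggplot2 matches the colours to the groups in order, which for a character column is usually alphabetical (so "F" comes before "M"). A named vector makes the mapping explicit. The sketch below assumes the same example data and uses scale_fill_manual(), since only the fill aesthetic is mapped here; the vector name gender_colours and the specific colours are illustrative choices.

```r
# Rebuild the example data so this snippet runs on its own.
set.seed(123)
df <- data.frame(age = rep(1:100, 2),
                 gender = rep(c("M", "F"), each = 100))
df$population <- 1 / sqrt(df$age) * runif(200, 15000, 20000)
df$population <- df$population / sum(df$population) * 100

# Named vector: each colour is tied to a gender by name, not by position.
# The name and the colours are illustrative assumptions.
gender_colours <- c(M = "steelblue", F = "tomato")

# Only draw the chart if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = age, fill = gender,
                      y = ifelse(gender == "M", -population, population))) +
    geom_col() +
    scale_y_continuous(labels = abs,
                       limits = max(df$population) * c(-1, 1)) +
    labs(title = "Age Structure Diagram", x = "Age",
         y = "Percentage of population") +
    scale_fill_manual(values = gender_colours) +
    coord_flip()
  print(p)
}
```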

Pyramids of Multiple Populations

The facet_wrap() function can also be used to plot multiple population pyramids together. Assume we have demographic data for three countries: A, B, and C.

The code below shows how to make a separate population pyramid for each country:

set.seed(123)
data <- data.frame(age = rep(1:100, 6),
                   gender = rep(c("M", "F"), each = 300),
                   country = rep(c("A", "B", "C"), each = 100, times = 2))
# One random draw per row, so no values are recycled across countries
data$population <- round(1 / sqrt(data$age) * runif(nrow(data), 15000, 20000), 0)
head(data)
  age gender country population
1   1      M       A      16929
2   2      M       A      11190
3   3      M       A      11161
4   4      M       A       8049
5   5      M       A       6982
6   6      M       A       7122

Now we can create one population pyramid per country

ggplot(data, aes(x = age, fill = gender,
                          y = ifelse(test = gender == "M",
                                     yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1,1)) +
  labs(y = "Population Amount") +
  coord_flip() +
  facet_wrap(~ country) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
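
The facetted chart above plots raw counts, so countries of very different size would be hard to compare. One option, sketched below with base R only, is to convert the counts to percentages within each country, mirroring the percentage step used for the single pyramid. The column name percentage and the object country_totals are hypothetical helpers introduced for this example.

```r
# Rebuild the three-country example data so this snippet runs on its own.
set.seed(123)
data <- data.frame(age = rep(1:100, 6),
                   gender = rep(c("M", "F"), each = 300),
                   country = rep(c("A", "B", "C"), each = 100, times = 2))
# One random draw per row.
data$population <- round(1 / sqrt(data$age) * runif(nrow(data), 15000, 20000), 0)

# ave() returns, for every row, the total population of that row's country,
# so the division gives each row's share of its own country.
data$percentage <- data$population /
  ave(data$population, data$country, FUN = sum) * 100

# Each country's percentages should add up to 100.
country_totals <- tapply(data$percentage, data$country, sum)
print(country_totals)
```

To plot these shares, replace population with percentage in the ggplot() call and in the limits argument.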

Changing the Theme

Finally, we can change the chart’s theme. The following code, for example, uses theme_classic() to give the charts a more minimalist look.

ggplot(data, aes(x = age, fill = gender,
                          y = ifelse(test = gender == "M",
                                     yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1,1)) +
  labs(y = "Population Amount") +
  coord_flip() +
  facet_wrap(~ country) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))+
  scale_colour_manual(values = c("red", "green"),
                      aesthetics = c("colour", "fill"))
