Bubble Chart in R-ggplot & Plotly

Data visualization plays a crucial role in exploratory data analysis and business intelligence. While scatter plots are excellent for displaying relationships between two numeric variables, they become even more powerful when enhanced with additional dimensions.

A Bubble Chart is an advanced version of a scatter plot where the size and color of points represent additional variables. This allows analysts to visualize up to four dimensions of data simultaneously, making bubble charts ideal for data science, business analytics, financial analysis, and machine learning applications.

In this tutorial, you’ll learn how to create bubble charts in R using both ggplot2 and plotly, along with practical examples and customization techniques.

What is a Bubble Chart?

A bubble chart extends a traditional scatter plot by replacing points with circles (bubbles) whose size represents an additional variable.

A bubble chart can display:

  • X-axis: First numeric variable
  • Y-axis: Second numeric variable
  • Color: Third variable (group/category)
  • Bubble Size: Fourth numeric variable

This enables analysts to uncover patterns, clusters, trends, and relationships that may not be visible in standard charts.

Common Applications of Bubble Charts

Bubble charts are widely used in:

  • Business Intelligence dashboards
  • Marketing campaign analysis
  • Financial market analysis
  • Customer segmentation
  • Healthcare analytics
  • Sales performance reporting
  • Machine learning exploratory analysis

For example:

  • X-axis = Product Price
  • Y-axis = Sales Revenue
  • Bubble Size = Units Sold
  • Color = Product Category

This instantly provides four layers of information in a single visualization.

Sample Dataset

For this tutorial, we’ll use the built-in mtcars dataset.

data("mtcars")

head(mtcars)

We’ll focus on:

  • wt = Vehicle Weight
  • disp = Engine Displacement
  • cyl = Number of Cylinders
  • qsec = Quarter-Mile Time

Prepare the Data

Convert the cylinder variable into a categorical variable.

df <- mtcars

df$cyl <- as.factor(df$cyl)

head(df[, c("wt", "disp", "cyl", "qsec")])

Output:

                     wt disp cyl  qsec
Mazda RX4         2.620  160   6 16.46
Mazda RX4 Wag     2.875  160   6 17.02
Datsun 710        2.320  108   4 18.61
Hornet 4 Drive    3.215  258   6 19.44
Hornet Sportabout 3.440  360   8 17.02
Valiant           3.460  225   6 20.22
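As a quick sanity check, the sketch below confirms that the conversion worked. It uses only base R and rebuilds df so it can run on its own. The comment about colors describes ggplot2's default handling of factors versus numeric columns.

```r
# Rebuild the prepared data so this snippet runs on its own
df <- mtcars
df$cyl <- as.factor(df$cyl)

# ggplot2 treats a factor as discrete (one color per level);
# a numeric column would get a continuous color gradient instead
is.factor(df$cyl)   # TRUE
levels(df$cyl)      # "4" "6" "8"
table(df$cyl)       # number of cars in each cylinder group
```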

Method 1: Create a Bubble Chart Using ggplot2

The ggplot2 package is one of the most widely used visualization packages in R.

Load Required Package

library(ggplot2)

Basic Bubble Chart

ggplot(df,
       aes(x = wt,
           y = disp,
           color = cyl,
           size = qsec)) +
  geom_point(alpha = 0.6)

Interpretation

  • Vehicle weight is shown on the x-axis.
  • Engine displacement appears on the y-axis.
  • Different cylinder groups receive different colors.
  • Bubble size represents quarter-mile time (qsec), so larger bubbles indicate slower cars.

Customized Bubble Chart

For publication-quality visualizations, customize colors and bubble sizes.

library(ggplot2)

ggplot(df,
       aes(x = wt,
           y = disp,
           color = cyl,
           size = qsec)) +
  geom_point(alpha = 0.7) +
  scale_color_manual(
    values = c(
      "#AA4371",
      "#E7B800",
      "#FC4E07"
    )
  ) +
  scale_size(
    range = c(3, 15)
  ) +
  labs(
    title = "Bubble Chart of Vehicle Characteristics",
    subtitle = "Engine Displacement vs Vehicle Weight",
    x = "Weight (1000 lbs)",
    y = "Displacement (cu. in.)",
    color = "Cylinders",
    size = "Quarter Mile Time"
  ) +
  theme_bw() +
  theme(
    legend.position = "bottom",
    plot.title = element_text(face = "bold")
  )
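If the chart is meant for a report or paper, it usually needs to be written to a file. The following is a minimal sketch using ggsave(). The temporary file path, the 7 x 5 inch size, and the 300 dpi resolution are illustrative assumptions, not requirements.

```r
library(ggplot2)

# Rebuild the prepared data so this snippet runs on its own
df <- mtcars
df$cyl <- as.factor(df$cyl)

p <- ggplot(df, aes(x = wt, y = disp, color = cyl, size = qsec)) +
  geom_point(alpha = 0.7) +
  scale_size(range = c(3, 15)) +
  theme_bw()

# Write to a temporary file here; use a real path in practice
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = p, width = 7, height = 5, dpi = 300)
```

Passing the plot explicitly through the plot argument avoids depending on whichever chart happened to be displayed last.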

Why Use Alpha Transparency?

alpha = 0.7

Transparency helps reduce overplotting when bubbles overlap.

Values range from:

  • 0 = Completely transparent
  • 1 = Completely opaque
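To judge which value suits a given dataset, it can help to draw the same chart at several alpha levels. The sketch below does this with lapply(); the three values tried are arbitrary choices for illustration.

```r
library(ggplot2)

# Rebuild the prepared data so this snippet runs on its own
df <- mtcars
df$cyl <- as.factor(df$cyl)

# Candidate transparency levels (illustrative values)
alpha_values <- c(0.3, 0.7, 1)

# One chart per alpha level; alpha is set outside aes(),
# so every bubble in a chart gets the same transparency
plots <- lapply(alpha_values, function(a) {
  ggplot(df, aes(x = wt, y = disp, color = cyl, size = qsec)) +
    geom_point(alpha = a) +
    labs(title = paste("alpha =", a))
})

# Display the first chart; print the others the same way
print(plots[[1]])
```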

Method 2: Create an Interactive Bubble Chart Using Plotly

Interactive charts allow users to:

  • Hover over points
  • Zoom in and out
  • Filter information
  • Explore data dynamically

Load Plotly

library(plotly)

Interactive Bubble Plot

plot_ly(
  data = df,
  x = ~wt,
  y = ~disp,
  color = ~cyl,
  size = ~qsec,
  text = rownames(df),
  sizes = c(10, 50),
  type = "scatter",
  mode = "markers",
  marker = list(
    opacity = 0.7,
    sizemode = "diameter"
  )
)

This generates a fully interactive bubble chart that can be embedded into dashboards and Shiny applications.
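By default the hover box mixes the coordinates with the text value. As a sketch of one way to control it, the example below builds a custom label per car and sets hoverinfo = "text" so that only this label appears. The hover column name and the label layout are illustrative choices.

```r
library(plotly)

# Rebuild the prepared data so this snippet runs on its own
df <- mtcars
df$cyl <- as.factor(df$cyl)

# One hover label per car; <br> starts a new line in the tooltip
df$hover <- paste0(
  rownames(df),
  "<br>Weight: ", df$wt, " (1000 lbs)",
  "<br>Displacement: ", df$disp,
  "<br>Quarter-mile time: ", df$qsec, " s"
)

p <- plot_ly(
  data = df,
  x = ~wt,
  y = ~disp,
  color = ~cyl,
  size = ~qsec,
  text = ~hover,
  hoverinfo = "text",   # show only the custom label on hover
  sizes = c(10, 50),
  type = "scatter",
  mode = "markers",
  marker = list(opacity = 0.7, sizemode = "diameter")
)

p
```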

Enhanced Interactive Bubble Chart

Add titles and labels for better presentation.

library(plotly)

plot_ly(
  data = df,
  x = ~wt,
  y = ~disp,
  color = ~cyl,
  size = ~qsec,
  text = rownames(df),
  sizes = c(10, 50),
  type = "scatter",
  mode = "markers"
) %>%
  layout(
    title = "Interactive Bubble Chart",
    xaxis = list(
      title = "Vehicle Weight"
    ),
    yaxis = list(
      title = "Engine Displacement"
    )
  )
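If a ggplot2 bubble chart already exists, plotly's ggplotly() function can convert it instead of rebuilding it with plot_ly(). This is a minimal sketch; the converted chart may not match the static one exactly, and legends in particular can look different.

```r
library(ggplot2)
library(plotly)

# Rebuild the prepared data so this snippet runs on its own
df <- mtcars
df$cyl <- as.factor(df$cyl)

# Start from the static ggplot2 bubble chart
p_static <- ggplot(df, aes(x = wt, y = disp, color = cyl, size = qsec)) +
  geom_point(alpha = 0.7)

# ggplotly() translates the ggplot object into an interactive plotly widget
p_interactive <- ggplotly(p_static)

p_interactive
```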

Bubble Chart vs Scatter Plot

Feature               Scatter Plot   Bubble Chart
X Variable            Yes            Yes
Y Variable            Yes            Yes
Color Variable        Optional       Optional
Size Variable         No             Yes
Dimensions Displayed  2–3            3–4
Information Density   Moderate       High

Because bubble size encodes an extra variable, a bubble chart can show one more dimension than a scatter plot of the same data.

Best Practices for Bubble Charts

Use Meaningful Bubble Sizes

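Avoid variables with extremely large ranges, because a few very large bubbles can hide everything else.

If necessary, rescale the size variable before plotting. Note that scale() returns a one-column matrix, so wrap it in as.numeric() to store a plain numeric vector.

# Standardize qsec to z-scores (mean 0, sd 1) as a plain numeric vector
df$qsec_scaled <- as.numeric(scale(df$qsec))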
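Another option worth considering is ggplot2's scale_size_area(), which maps a value of 0 to a bubble of area 0 so that bubble area stays proportional to the underlying value. The sketch below applies it to the same chart; max_size = 12 is an arbitrary choice.

```r
library(ggplot2)

# Rebuild the prepared data so this snippet runs on its own
df <- mtcars
df$cyl <- as.factor(df$cyl)

# scale_size_area() makes bubble area proportional to qsec;
# max_size sets the size of the largest bubble
p <- ggplot(df, aes(x = wt, y = disp, color = cyl, size = qsec)) +
  geom_point(alpha = 0.6) +
  scale_size_area(max_size = 12)

p
```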

Avoid Too Many Categories

Using too many colors can make interpretation difficult.

Limit categorical groups when possible.
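One way to limit groups is to merge rare categories into a single "Other" level before plotting. The sketch below does this in base R for the carb column of mtcars. The choice of carb, the threshold of 5 cars, and the carb_group column name are assumptions made for illustration.

```r
library(ggplot2)

df <- mtcars

# Count how many cars fall in each carburetor category
carb_counts <- table(df$carb)

# Keep categories with at least 5 cars; merge the rest into "Other"
common <- names(carb_counts)[carb_counts >= 5]
carb_chr <- as.character(df$carb)
df$carb_group <- factor(ifelse(carb_chr %in% common, carb_chr, "Other"))

# Six original categories are reduced to four color groups
levels(df$carb_group)   # "1" "2" "4" "Other"

ggplot(df, aes(x = wt, y = disp, color = carb_group, size = qsec)) +
  geom_point(alpha = 0.6)
```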

Add Transparency

alpha = 0.5

This improves readability when bubbles overlap.

Include Clear Legends

Always label:

  • Color groups
  • Bubble sizes
  • Axes

to ensure the chart is easy to interpret.

Real-World Use Cases

Marketing Analytics

  • X-axis = Advertising Spend
  • Y-axis = Revenue
  • Size = Number of Customers
  • Color = Marketing Channel

Stock Market Analysis

  • X-axis = Market Capitalization
  • Y-axis = Annual Return
  • Size = Trading Volume
  • Color = Industry Sector

SaaS Business Metrics

  • X-axis = Monthly Active Users
  • Y-axis = Revenue
  • Size = Customer Lifetime Value
  • Color = Subscription Plan

Healthcare Analytics

  • X-axis = Age
  • Y-axis = Healthcare Cost
  • Size = Number of Visits
  • Color = Risk Category

Conclusion

Bubble charts are a compact way to display multiple dimensions of data in a single chart, which makes them a useful tool for exploratory analysis in R.

Using ggplot2, you can create elegant static bubble charts suitable for reports and publications. With plotly, you can build interactive visualizations for dashboards, web applications, and business intelligence platforms.

For most data science projects, a basic call such as:

ggplot(df,
       aes(x = wt,
           y = disp,
           color = cyl,
           size = qsec)) +
  geom_point()

provides an excellent starting point for exploring relationships among multiple variables.

Mastering bubble charts will help you uncover deeper insights and communicate complex data.
