Bubble Chart in R: ggplot2 & Plotly
Data visualization plays a crucial role in exploratory data analysis and business intelligence. While scatter plots are excellent for displaying relationships between two numeric variables, they become even more powerful when enhanced with additional dimensions.
A Bubble Chart is an advanced version of a scatter plot where the size and color of points represent additional variables. This allows analysts to visualize up to four dimensions of data simultaneously, making bubble charts ideal for data science, business analytics, financial analysis, and machine learning applications.
In this tutorial, you’ll learn how to create bubble charts in R using both ggplot2 and plotly, along with practical examples and customization techniques.
What is a Bubble Chart?
A bubble chart extends a traditional scatter plot by replacing points with circles (bubbles) whose size represents an additional variable.
A bubble chart can display:
- X-axis: First numeric variable
- Y-axis: Second numeric variable
- Color: Third variable (group/category)
- Bubble Size: Fourth numeric variable
This enables analysts to uncover patterns, clusters, trends, and relationships that may not be visible in standard charts.
Common Applications of Bubble Charts
Bubble charts are widely used in:
- Business Intelligence dashboards
- Marketing campaign analysis
- Financial market analysis
- Customer segmentation
- Healthcare analytics
- Sales performance reporting
- Machine learning exploratory analysis
For example:
- X-axis = Product Price
- Y-axis = Sales Revenue
- Bubble Size = Units Sold
- Color = Product Category
This instantly provides four layers of information in a single visualization.
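The product example above can be sketched in a few lines of ggplot2. The `products` data frame below is hypothetical and invented purely for illustration (revenue is simply price times units); only the mapping of variables to aesthetics matters here.

```r
library(ggplot2)

# Hypothetical product data, invented for illustration only
products <- data.frame(
  price    = c(12, 25, 40, 55, 80, 120),
  units    = c(4000, 2500, 1300, 1300, 500, 300),
  category = factor(c("Accessories", "Accessories", "Apparel",
                      "Apparel", "Electronics", "Electronics"))
)
products$revenue <- products$price * products$units

# x = price, y = revenue, size = units sold, color = category
p_products <- ggplot(products,
                     aes(x = price, y = revenue,
                         size = units, color = category)) +
  geom_point(alpha = 0.7) +
  scale_size(range = c(3, 15)) +
  labs(x = "Product Price", y = "Sales Revenue",
       size = "Units Sold", color = "Product Category")

p_products
```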
Sample Dataset
For this tutorial, we’ll use the built-in mtcars dataset.
data("mtcars")
head(mtcars)
We’ll focus on:
- wt = Vehicle Weight
- disp = Engine Displacement
- cyl = Number of Cylinders
- qsec = Quarter-Mile Time
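Before mapping these variables to aesthetics, it can help to check their ranges and how many cars fall into each cylinder group. A quick sketch using base R only; the helper name `vars` is arbitrary.

```r
data("mtcars")

# Numeric variables that will drive the x-axis, y-axis, and bubble size
vars <- mtcars[, c("wt", "disp", "qsec")]

# Minimum and maximum of each numeric variable
sapply(vars, range)

# Number of cars per cylinder count (this becomes the color grouping)
cyl_counts <- table(mtcars$cyl)
cyl_counts
```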
Prepare the Data
Convert the cylinder variable into a categorical variable.
df <- mtcars
df$cyl <- as.factor(df$cyl)
head(df[, c("wt", "disp", "cyl", "qsec")])
Output:
                     wt disp cyl  qsec
Mazda RX4         2.620  160   6 16.46
Mazda RX4 Wag     2.875  160   6 17.02
Datsun 710        2.320  108   4 18.61
Hornet 4 Drive    3.215  258   6 19.44
Hornet Sportabout 3.440  360   8 17.02
Valiant           3.460  225   6 20.22
Method 1: Create a Bubble Chart Using ggplot2
The ggplot2 package is one of the most widely used visualization frameworks in R.
Load Required Package
library(ggplot2)
Basic Bubble Chart
ggplot(df,
       aes(x = wt,
           y = disp,
           color = cyl,
           size = qsec)) +
  geom_point(alpha = 0.6)
Interpretation
- Vehicle weight is shown on the x-axis.
- Engine displacement appears on the y-axis.
- Different cylinder groups receive different colors.
- Bubble size represents quarter-mile time, so larger bubbles indicate slower cars.
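One detail worth knowing when reading bubble sizes: ggplot2's size scales map values to point area rather than radius. The default scale_size() stretches the observed data range across the available sizes, while scale_size_area() maps a value of zero to zero area. A minimal sketch, assuming the same mtcars-based df as above; the object name `p_area` and the `max_size` value are arbitrary choices.

```r
library(ggplot2)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# scale_size_area() keeps bubble area proportional to the value itself,
# with a value of 0 drawn at zero area
p_area <- ggplot(df,
                 aes(x = wt, y = disp, color = cyl, size = qsec)) +
  geom_point(alpha = 0.6) +
  scale_size_area(max_size = 10)

p_area
```

Because qsec only spans roughly 14.5 to 22.9 seconds, the bubbles in this version look more alike than under the default scale. That is the more literal encoding, but it is a weaker visual contrast.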
Customized Bubble Chart
For publication-quality visualizations, customize colors and bubble sizes.
library(ggplot2)
ggplot(df,
       aes(x = wt,
           y = disp,
           color = cyl,
           size = qsec)) +
  geom_point(alpha = 0.7) +
  scale_color_manual(
    values = c(
      "#AA4371",
      "#E7B800",
      "#FC4E07"
    )
  ) +
  scale_size(
    range = c(3, 15)
  ) +
  labs(
    title = "Bubble Chart of Vehicle Characteristics",
    subtitle = "Engine Displacement vs Vehicle Weight",
    x = "Weight (1000 lbs)",
    y = "Displacement",
    color = "Cylinders",
    size = "Quarter Mile Time"
  ) +
  theme_bw() +
  theme(
    legend.position = "bottom",
    plot.title = element_text(face = "bold")
  )
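To use a chart like this in a report, it usually needs to be written to a file at a known size and resolution. A minimal sketch using ggplot2's ggsave(); the file name, dimensions, and the use of a temporary directory are assumptions made for illustration, and the plot is a shortened version of the one above.

```r
library(ggplot2)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# Shortened version of the customized chart
p <- ggplot(df,
            aes(x = wt, y = disp, color = cyl, size = qsec)) +
  geom_point(alpha = 0.7) +
  scale_size(range = c(3, 15)) +
  theme_bw() +
  theme(legend.position = "bottom")

# Write an 8 x 6 inch PNG at 300 dpi (output path is illustrative)
out_file <- file.path(tempdir(), "bubble_chart.png")
ggsave(out_file, plot = p, width = 8, height = 6, dpi = 300)
```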
Why Use Alpha Transparency?
alpha = 0.7
Transparency helps reduce overplotting when bubbles overlap.
Values range from:
- 0 = Completely transparent
- 1 = Completely opaque
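Transparency can be combined with an outline so that overlapping bubbles remain distinguishable. The sketch below is one possible approach, not the only one: it uses point shape 21, which has a separate fill and border, and therefore maps cyl to `fill` instead of `color`. The border color and stroke width are arbitrary.

```r
library(ggplot2)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# Shape 21 is a filled circle with a border: fill carries the group,
# while the fixed dark border marks each bubble's edge
p_outline <- ggplot(df,
                    aes(x = wt, y = disp, fill = cyl, size = qsec)) +
  geom_point(shape = 21, color = "black", stroke = 0.4, alpha = 0.7) +
  scale_size(range = c(3, 15)) +
  labs(fill = "Cylinders", size = "Quarter Mile Time")

p_outline
```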
Method 2: Create an Interactive Bubble Chart Using Plotly
Interactive charts allow users to:
- Hover over points
- Zoom in and out
- Show or hide groups by clicking legend entries
- Explore data dynamically
Load Plotly
library(plotly)
Interactive Bubble Plot
plot_ly(
  data = df,
  x = ~wt,
  y = ~disp,
  color = ~cyl,
  size = ~qsec,
  text = rownames(df),
  sizes = c(10, 50),
  type = "scatter",
  mode = "markers",
  marker = list(
    opacity = 0.7,
    sizemode = "diameter"
  )
)
This generates a fully interactive bubble chart that can be embedded into dashboards and Shiny applications.
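The hover label is often the first thing worth customizing. A minimal sketch, assuming the same mtcars-based df: it builds a label column (the name `hover` is arbitrary) and sets `hoverinfo = "text"` so that only this label is shown on hover. The units in the label follow the mtcars documentation.

```r
library(plotly)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# One label per car; <br> starts a new line inside the tooltip
df$hover <- paste0(
  rownames(df),
  "<br>Weight: ", df$wt, " (1000 lbs)",
  "<br>Displacement: ", df$disp, " cu. in.",
  "<br>Quarter-mile time: ", df$qsec, " s"
)

fig_hover <- plot_ly(
  data = df,
  x = ~wt,
  y = ~disp,
  color = ~cyl,
  size = ~qsec,
  sizes = c(10, 50),
  text = ~hover,
  hoverinfo = "text",   # show only the custom label
  type = "scatter",
  mode = "markers",
  marker = list(opacity = 0.7, sizemode = "diameter")
)

fig_hover
```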
Enhanced Interactive Bubble Chart
Add titles and labels for better presentation.
library(plotly)
plot_ly(
  data = df,
  x = ~wt,
  y = ~disp,
  color = ~cyl,
  size = ~qsec,
  text = rownames(df),
  sizes = c(10, 50),
  type = "scatter",
  mode = "markers"
) %>%
  layout(
    title = "Interactive Bubble Chart",
    xaxis = list(
      title = "Vehicle Weight"
    ),
    yaxis = list(
      title = "Engine Displacement"
    )
  )
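If a ggplot2 version of the chart already exists, plotly's ggplotly() function can convert it into an interactive figure instead of rebuilding it with plot_ly(). A minimal sketch, assuming the same mtcars-based df; the object names are arbitrary, and the converted legend and bubble sizes may not match the static plot exactly.

```r
library(ggplot2)
library(plotly)

df <- mtcars
df$cyl <- as.factor(df$cyl)

# Static bubble chart built with ggplot2
p_static <- ggplot(df,
                   aes(x = wt, y = disp, color = cyl, size = qsec)) +
  geom_point(alpha = 0.7) +
  labs(x = "Vehicle Weight", y = "Engine Displacement")

# Convert the ggplot object into an interactive plotly figure
fig_converted <- ggplotly(p_static)

fig_converted
```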
Bubble Chart vs Scatter Plot
| Feature | Scatter Plot | Bubble Chart |
|---|---|---|
| X Variable | Yes | Yes |
| Y Variable | Yes | Yes |
| Color Variable | Optional | Optional |
| Size Variable | No | Yes |
| Dimensions Displayed | 2–3 | 3–4 |
| Information Density | Moderate | High |
By mapping an extra variable to bubble size, a bubble chart encodes one more dimension than a comparable scatter plot.
Best Practices for Bubble Charts
Use Meaningful Bubble Sizes
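Avoid size variables with extremely large ranges, because a few very large bubbles can dwarf the rest.
If necessary, transform the size variable before plotting. A log or square-root transform compresses a skewed range. Standardizing with scale() only recenters and rescales the values linearly, and it returns a matrix, so wrap it in as.numeric() before storing it as a column.
df$qsec_scaled <- as.numeric(scale(df$qsec))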
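To see why the choice of transform matters, the sketch below compares where values land within the size range after different transforms. The `revenue` vector is hypothetical and deliberately skewed, and `rescale01()` is a small helper defined here, not a library function.

```r
# Hypothetical, heavily skewed size variable (e.g. revenue in dollars)
revenue <- c(1200, 3500, 9800, 42000, 150000, 2400000)

# Position of each value between the minimum (0) and maximum (1),
# which is roughly how a size scale spreads values across bubble sizes
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

raw_pos <- rescale01(revenue)                      # most values sit near 0
std_pos <- rescale01(as.numeric(scale(revenue)))   # identical to raw_pos
log_pos <- rescale01(log10(revenue))               # spread far more evenly

round(raw_pos, 3)
round(log_pos, 3)

# Standardizing is a linear transform, so relative positions do not change
isTRUE(all.equal(raw_pos, std_pos))
```

In this example, only the log transform changes how the bubbles compare to one another; the standardized values produce the same relative sizes as the raw data.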
Avoid Too Many Categories
Using too many colors can make interpretation difficult.
Limit categorical groups when possible.
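One way to limit groups is to keep the most frequent categories and lump the rest into a single "Other" level. A minimal base R sketch; using the mtcars `carb` column (six distinct values) and keeping the top three are illustrative choices, and `carb_grp` is a made-up column name.

```r
df <- mtcars

# Count cars per carburetor value, most frequent first
counts <- sort(table(df$carb), decreasing = TRUE)

# Keep the three most frequent values; everything else becomes "Other"
keep <- names(counts)[1:3]
carb_chr <- as.character(df$carb)
df$carb_grp <- ifelse(carb_chr %in% keep, carb_chr, "Other")
df$carb_grp <- factor(df$carb_grp, levels = c(sort(keep), "Other"))

# Six original groups reduced to four
table(df$carb_grp)
```

The new `carb_grp` column can then be mapped to `color` in place of the original variable.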
Add Transparency
alpha = 0.5
This improves readability when bubbles overlap.
Include Clear Legends
Always label:
- Color groups
- Bubble sizes
- Axes
to ensure the chart is easy to interpret.
Real-World Use Cases
Marketing Analytics
- X-axis = Advertising Spend
- Y-axis = Revenue
- Size = Number of Customers
- Color = Marketing Channel
Stock Market Analysis
- X-axis = Market Capitalization
- Y-axis = Annual Return
- Size = Trading Volume
- Color = Industry Sector
SaaS Business Metrics
- X-axis = Monthly Active Users
- Y-axis = Revenue
- Size = Customer Lifetime Value
- Color = Subscription Plan
Healthcare Analytics
- X-axis = Age
- Y-axis = Healthcare Cost
- Size = Number of Visits
- Color = Risk Category
Conclusion
Bubble charts are a compact and effective visualization technique in R because they allow analysts to display up to four dimensions of data in a single chart.
Using ggplot2, you can create elegant static bubble charts suitable for reports and publications. With plotly, you can build interactive visualizations for dashboards, web applications, and business intelligence platforms.
For most data science projects, a basic call such as:
ggplot(df,
       aes(x = wt,
           y = disp,
           color = cyl,
           size = qsec)) +
  geom_point()
provides an excellent starting point for exploring relationships among multiple variables.
Mastering bubble charts will help you uncover deeper insights and communicate complex data clearly.