R Percentage by Group Calculation
In R data analysis, one of the most common tasks is calculating percentages within groups. Whether you’re working with sales data, customer segments, survey responses, sports statistics, or business analytics, group-wise percentages show how much each observation contributes to its group’s total.
In this tutorial, you’ll learn how to calculate percentages by group in R using the dplyr package, along with practical examples and best practices for data analysis.
Why Calculate Percentages by Group?
Raw numbers often don’t tell the full story. Percentages help you understand the relative contribution of each observation within a category.
For example:
- What percentage of total sales comes from each product?
- What percentage of customers belong to each region?
- What percentage of points were scored by each player on a team?
- What percentage of website traffic comes from each marketing channel?
Calculating percentages by group makes it easier to compare observations across different categories.
Example Dataset
Suppose we have a dataset showing the number of points scored by basketball players on two teams.
Create the Data Frame
df <- data.frame(
team = c('A', 'A', 'A', 'A', 'A',
'B', 'B', 'B', 'B', 'B'),
points = c(112, 229, 234, 104, 100,
111, 77, 136, 134, 122)
)
df
Output:
team points
1 A 112
2 A 229
3 A 234
4 A 104
5 A 100
6 B 111
7 B 77
8 B 136
9 B 134
10 B 122
Calculate Percentage by Group Using dplyr
The easiest way to calculate percentages within groups is by using the group_by() and mutate() functions from the dplyr package.
Load dplyr
library(dplyr)
Calculate Team-Wise Percentages
df_pct <- df %>%
group_by(team) %>%
mutate(percent = points / sum(points))
df_pct
Output:
# A tibble: 10 × 3
# Groups: team [2]
team points percent
<chr> <dbl> <dbl>
1 A 112 0.144
2 A 229 0.294
3 A 234 0.300
4 A 104 0.134
5 A 100 0.128
6 B 111 0.191
7 B 77 0.133
8 B 136 0.234
9 B 134 0.231
10 B 122 0.210
The percent column represents each player’s contribution to their team’s total points.
Understanding the Calculation
For Team A:
sum(df$points[df$team == "A"])
Output:
779
The first player scored:
112 / 779
Output:
0.1438
This means the player contributed approximately:
14.38%
of Team A’s total points.
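A quick sanity check on this calculation, sketched here in base R, is that the proportions within each team add up to 1. The names team_totals, prop, and share_sums are illustrative and do not appear elsewhere in this tutorial.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(
  team   = rep(c("A", "B"), each = 5),
  points = c(112, 229, 234, 104, 100,
             111, 77, 136, 134, 122)
)

# Total points per team (a named vector: A, B)
team_totals <- tapply(df$points, df$team, sum)

# Each player's share of their own team's total
prop <- df$points / as.vector(team_totals[as.character(df$team)])

# Within each team, the shares should sum to 1
share_sums <- tapply(prop, df$team, sum)
share_sums
```

If a group's shares do not sum to 1, the grouping or the denominator is usually the first place to look.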
Display Percentages as Percent Values
To make the output more readable, multiply the proportion by 100 and round to two decimal places.
df_pct <- df %>%
group_by(team) %>%
mutate(percent = round(points / sum(points) * 100, 2))
df_pct
Output:
team points percent
1 A 112 14.38
2 A 229 29.40
3 A 234 30.04
4 A 104 13.35
5 A 100 12.84
6 B 111 19.14
7 B 77 13.28
8 B 136 23.45
9 B 134 23.10
10 B 122 21.03
Now the percentages are easier to interpret.
Using scales Package for Percentage Formatting
The scales package can format percentages automatically.
library(scales)
df_pct <- df %>%
group_by(team) %>%
mutate(percent = percent(points / sum(points), accuracy = 0.1))
df_pct
Output:
14.4%
29.4%
30.0%
...
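This is especially useful for reports and dashboards. Note that the formatted column is a character vector, so keep a numeric version if you still need to sort or do arithmetic on the values.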
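One possible pattern, assuming you want both a value to compute with and a label to display, is to store the numeric proportion and the formatted string in separate columns. The column names prop and prop_label and the helper fmt_pct are made up for this sketch; label_percent() is the scales function that builds the formatter.

```r
library(dplyr)
library(scales)

# Recreate the example data so this block runs on its own
df <- data.frame(
  team   = rep(c("A", "B"), each = 5),
  points = c(112, 229, 234, 104, 100,
             111, 77, 136, 134, 122)
)

# label_percent() returns a formatting function;
# accuracy = 0.1 rounds to one decimal place
fmt_pct <- label_percent(accuracy = 0.1)

df_labeled <- df %>%
  group_by(team) %>%
  mutate(
    prop       = points / sum(points),  # numeric: safe for sorting and arithmetic
    prop_label = fmt_pct(prop)          # character: for display only
  ) %>%
  ungroup()

df_labeled
```

Sorting or filtering should use prop, since strings such as "9.5%" and "14.4%" do not sort numerically.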
Calculate Percentage by Multiple Groups
Suppose your dataset includes teams and seasons.
df2 <- data.frame(
season = c(2024,2024,2024,2024,2025,2025,2025,2025),
team = c("A","A","B","B","A","A","B","B"),
points = c(100,150,120,180,130,170,140,190)
)
Calculate each row’s share of its team’s total within each season:
df2 %>%
group_by(season, team) %>%
mutate(percent = points / sum(points) * 100)
This approach is common in business analytics and time-series reporting.
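A related question is what share of each season's points belongs to each team. One way to sketch that, assuming the same df2 layout, is to total the points per season and team first and then regroup by season only. The names team_share and team_points are illustrative.

```r
library(dplyr)

# Recreate the example data so this block runs on its own
df2 <- data.frame(
  season = c(2024, 2024, 2024, 2024, 2025, 2025, 2025, 2025),
  team   = c("A", "A", "B", "B", "A", "A", "B", "B"),
  points = c(100, 150, 120, 180, 130, 170, 140, 190)
)

team_share <- df2 %>%
  group_by(season, team) %>%
  # Collapse to one row per season and team
  summarise(team_points = sum(points), .groups = "drop") %>%
  # Regroup by season so the denominator is the season total
  group_by(season) %>%
  mutate(percent = team_points / sum(team_points) * 100) %>%
  ungroup()

team_share
```

The grouping variables decide the denominator: grouping by season and team divides by the team's season total, while grouping by season alone divides by the total for the whole season.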
Alternative Method Using Base R
If you prefer base R, use ave().
df$percent <- with(
df,
points / ave(points, team, FUN = sum)
)
df
This produces the same proportions as the first dplyr example, without requiring dplyr.
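As a variation on the same idea, ave() can also apply prop.table() directly, since prop.table() on a plain vector divides each element by the vector's sum. The column name percent_alt is only for comparison in this sketch.

```r
# Recreate the example data so this block runs on its own
df <- data.frame(
  team   = rep(c("A", "B"), each = 5),
  points = c(112, 229, 234, 104, 100,
             111, 77, 136, 134, 122)
)

# Approach from the text: divide by the group sum
df$percent <- with(df, points / ave(points, team, FUN = sum))

# Variation: let prop.table() compute x / sum(x) within each group
df$percent_alt <- with(df, ave(points, team, FUN = prop.table))

# Both columns should hold the same proportions
all.equal(df$percent, df$percent_alt)
```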
Real-World Example: Sales Analysis
Suppose a company tracks sales by region, with one row per store.
sales <- data.frame(
region = c("North","North","North",
"South","South","South"),
sales = c(50000,40000,30000,
60000,45000,35000)
)
Calculate each store’s contribution to regional sales:
sales %>%
group_by(region) %>%
mutate(percent_sales = sales / sum(sales) * 100)
This helps managers identify top-performing locations within each region.
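To pick out the top location in each region, one option is slice_max() after computing the percentages. The store column below is a hypothetical identifier added for this sketch; the original data has no store names.

```r
library(dplyr)

sales <- data.frame(
  store  = c("N1", "N2", "N3", "S1", "S2", "S3"),  # hypothetical store IDs
  region = c("North", "North", "North",
             "South", "South", "South"),
  sales  = c(50000, 40000, 30000,
             60000, 45000, 35000)
)

top_stores <- sales %>%
  group_by(region) %>%
  # Share of regional sales for each store
  mutate(percent_sales = sales / sum(sales) * 100) %>%
  # Keep the store with the largest share in each region
  slice_max(percent_sales, n = 1) %>%
  ungroup()

top_stores
```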
Common Mistakes When Calculating Group Percentages
Forgetting group_by()
Incorrect:
df %>%
mutate(percent = points / sum(points))
This calculates percentages based on the entire dataset rather than within each team.
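The difference is easy to see by running both versions side by side. In this sketch, overall and by_team are illustrative names.

```r
library(dplyr)

# Recreate the example data so this block runs on its own
df <- data.frame(
  team   = rep(c("A", "B"), each = 5),
  points = c(112, 229, 234, 104, 100,
             111, 77, 136, 134, 122)
)

# Without group_by(): the denominator is the total across both teams (1359)
overall <- df %>%
  mutate(percent = points / sum(points))

# With group_by(): the denominator is each team's own total (779 or 580)
by_team <- df %>%
  group_by(team) %>%
  mutate(percent = points / sum(points)) %>%
  ungroup()

overall$percent[1]  # 112 / 1359
by_team$percent[1]  # 112 / 779
```

The ungrouped proportions sum to 1 over the whole table, while the grouped proportions sum to 1 within each team.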
Not Converting to Percentage
points / sum(points)
Returns proportions rather than percentages.
Multiply by 100 if percentage values are required.
Missing Values
When data contains missing values:
df %>%
group_by(team) %>%
mutate(percent = points / sum(points, na.rm = TRUE))
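Without na.rm = TRUE, sum() returns NA for any group that contains a missing value, so every percentage in that group becomes NA. With na.rm = TRUE, only the rows whose points are missing get NA.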
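A small made-up dataset shows the effect. The data frame df_na and its values are invented for illustration only.

```r
library(dplyr)

# Hypothetical data with one missing score on team A
df_na <- data.frame(
  team   = c("A", "A", "A", "B", "B"),
  points = c(10, NA, 30, 20, 20)
)

# Default sum(): the NA propagates to every row of team A
without_rm <- df_na %>%
  group_by(team) %>%
  mutate(percent = points / sum(points)) %>%
  ungroup()

# na.rm = TRUE: the team A total is 40, and only the missing row stays NA
with_rm <- df_na %>%
  group_by(team) %>%
  mutate(percent = points / sum(points, na.rm = TRUE)) %>%
  ungroup()

without_rm$percent
with_rm$percent
```

Whether dropping missing values from the denominator is appropriate depends on the analysis, since the remaining shares are then relative to the observed total only.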
Applications of Group Percentage Calculations
Group-wise percentages are widely used in:
Business Intelligence
- Revenue contribution by product
- Market share analysis
- Customer segmentation
Sports Analytics
- Player performance analysis
- Team contribution metrics
- Scoring distribution
Marketing Analytics
- Channel attribution
- Campaign performance
- Lead source analysis
Survey Research
- Response distributions
- Demographic analysis
- Opinion polling
Financial Analysis
- Portfolio allocation
- Expense categorization
- Budget reporting
Conclusion
Calculating percentages by group is a fundamental data manipulation task in R. By combining group_by() and mutate() from the dplyr package, you can quickly determine how much each observation contributes to its group’s total.
Whether you’re analyzing sports statistics, sales performance, customer behavior, or financial data, group-wise percentages provide valuable context that raw numbers alone cannot reveal.
Using these techniques will help you create more meaningful reports, dashboards, and statistical analyses in R.

