Filtering for Unique Values in R Using dplyr
Using the dplyr package in R, you can filter for unique values in a data frame with the following methods.
Method 1: Filter for unique values in one column.
df %>% distinct(var1)
Method 2: Filter for unique values in multiple columns.
df %>% distinct(var1, var2)
Method 3: Filter for unique values in all columns.
df %>% distinct()
The following examples show how to use each method in practice with the data frame below.
# create a data frame (rebounds stored as numbers rather than quoted strings)
df <- data.frame(team = c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds = c(8, 6, 5, 4, 3, 8, 9, 5),
                 points = c(107, 207, 208, 211, 213, 215, 219, 313))
Now we can view the data frame
df
  team rebounds points
1    X        8    107
2    X        6    207
3    X        5    208
4    X        4    211
5    Y        3    213
6    Y        8    215
7    Y        9    219
8    Y        5    313
Example 1: Filter for Unique Values in One Column
To filter for unique values in just the team column, we can use the following code.
library(dplyr)
# select only the unique values in the team column
df %>% distinct(team)
  team
1    X
2    Y
Note that only the team column is returned, with one row per unique value; the rebounds and points columns are dropped.
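If the other columns are needed alongside the unique values, dplyr's distinct() accepts a .keep_all argument. The sketch below recreates the example data frame (with rebounds stored as numbers) and keeps the first row found for each team; the name first_per_team is chosen here for illustration only.

```r
library(dplyr)

# example data frame from above (rebounds stored as numbers here)
df <- data.frame(team = c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds = c(8, 6, 5, 4, 3, 8, 9, 5),
                 points = c(107, 207, 208, 211, 213, 215, 219, 313))

# keep the first row encountered for each unique team,
# retaining the rebounds and points columns as well
first_per_team <- df %>% distinct(team, .keep_all = TRUE)
first_per_team
```

This should return two rows, one per team, each carrying the rebounds and points values from that team's first row in the data frame.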
Example 2: Filter for Unique Values in Multiple Columns
To filter for unique values in the team and points columns, we can use the following code:
library(dplyr)
# select unique values in the team and points columns
df %>% distinct(team, points)
  team points
1    X    107
2    X    207
3    X    208
4    X    211
5    Y    213
6    Y    215
7    Y    219
8    Y    313
Note that only the team and points columns are returned, with one row per unique combination of the two. Because every points value in this data frame is different, all eight combinations are unique and all eight rows remain.
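To see distinct() actually collapse rows on two columns, it helps to use data with a repeated pair. The following is a minimal sketch; df_pairs is a hypothetical data frame made up for this illustration and is not part of the example above.

```r
library(dplyr)

# hypothetical data: the combination (X, 107) appears twice
df_pairs <- data.frame(team = c('X', 'X', 'Y'),
                       points = c(107, 107, 213))

# one row is kept per unique team/points combination
unique_pairs <- df_pairs %>% distinct(team, points)
unique_pairs
```

Here the repeated (X, 107) pair should appear only once, leaving two rows.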
Example 3: Filter for Unique Values in All Columns
To filter for unique values across all columns in the data frame, we can use the following code.
library(dplyr)
# select unique rows across all columns
df %>% distinct()
  team rebounds points
1    X        8    107
2    X        6    207
3    X        5    208
4    X        4    211
5    Y        3    213
6    Y        8    215
7    Y        9    219
8    Y        5    313
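Note that distinct() with no arguments returns the unique rows, that is, the unique combinations of values across all three columns. Because no row in this data frame is an exact duplicate of another, all eight rows are returned.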
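The effect is easier to see when the data contains an exact duplicate row. As a sketch, the code below recreates the example data frame (with rebounds stored as numbers), appends a copy of its first row, and then removes it again with distinct(); df_dup and unique_rows are names introduced here for illustration.

```r
library(dplyr)

# example data frame from above (rebounds stored as numbers here)
df <- data.frame(team = c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds = c(8, 6, 5, 4, 3, 8, 9, 5),
                 points = c(107, 207, 208, 211, 213, 215, 219, 313))

# append a copy of the first row so one exact duplicate exists
df_dup <- rbind(df, df[1, ])
nrow(df_dup)

# distinct() with no arguments drops rows that match another row in every column
unique_rows <- df_dup %>% distinct()
nrow(unique_rows)
```

The row count should go from nine back to eight, since only the appended copy matches an existing row in every column.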