How to Count Distinct Values in R

You can count the number of distinct values in an R data frame with the n_distinct() function from dplyr in three ways: in a single column, in every column, or by group.

The following examples show how to apply each approach to the data frame below.

First, create a data frame:

df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(106, 106, 108, 110, 209, 209, 122, 212),
                 assists=c(203, 206, 204, 202, 24, 25, 125, 119))
df
   team points assists
1    A    106     203
2    A    106     206
3    A    108     204
4    A    110     202
5    B    209      24
6    B    209      25
7    B    122     125
8    B    212     119

Approach 1: Count Distinct Values in One Column

The following code demonstrates how to count the number of distinct values in the ‘team’ column using n_distinct().

library(dplyr)

# count the number of distinct values in the 'team' column
n_distinct(df$team)
[1] 2

The ‘team’ column contains two distinct values.
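As a rough cross-check, the same count can be obtained in base R, and n_distinct() can be told to ignore missing values. The sketch below rebuilds the example data frame so it runs on its own; the vector x is a hypothetical example added only to show the na.rm argument and is not part of the original data.

```r
library(dplyr)

# same example data as in this tutorial
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(106, 106, 108, 110, 209, 209, 122, 212),
                 assists=c(203, 206, 204, 202, 24, 25, 125, 119))

# base R equivalent: drop duplicates, then count what is left
base_count <- length(unique(df$team))

# hypothetical vector with a missing value (illustration only)
x <- c('A', 'B', NA, 'B')
with_na    <- n_distinct(x)                # NA is counted as its own value
without_na <- n_distinct(x, na.rm = TRUE)  # NA is ignored
```

Here base_count matches the n_distinct() result of 2, while with_na is 3 and without_na is 2, so it is worth deciding up front whether NA should count as a value.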

Approach 2: Count Distinct Values in All Columns

The following code demonstrates how to count the number of distinct values in each column of the data frame using the sapply() and n_distinct() functions.

# count the number of distinct values in each column
sapply(df, n_distinct)
    team  points assists
      2       6       8

We can observe the following from the output:

The ‘team’ column has 2 different values.

In the ‘points’ column, there are 6 different values.

The ‘assists’ column has 8 different values.
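The per-column counts can also be produced without leaving dplyr. The following is a minimal sketch, assuming dplyr 1.0.0 or later (where across() is available); it returns a one-row data frame instead of the named vector that sapply() gives.

```r
library(dplyr)

# same example data as in this tutorial
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(106, 106, 108, 110, 209, 209, 122, 212),
                 assists=c(203, 206, 204, 202, 24, 25, 125, 119))

# apply n_distinct() to every column and collect the results in one row
col_counts <- df %>%
  summarise(across(everything(), n_distinct))
```

The result has one column per original column (2 for team, 6 for points, 8 for assists), which can be convenient when the counts feed into further dplyr steps.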

Approach 3: Count Distinct Values by Group

The following code demonstrates how to count the number of distinct values by group using the group_by(), summarize(), and n_distinct() functions.

# count the number of distinct 'points' values by 'team'
df %>%
  group_by(team) %>%
  summarize(distinct_points = n_distinct(points))
   team  distinct_points
  <chr>           <int>
1 A                   3
2 B                   3

We can observe the following from the output:

For team A, there are three different point values.

For team B, there are three different point values.
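The grouped count extends to several columns at once. The sketch below is one possible way to do this, again assuming dplyr 1.0.0 or later for across(); the choice of the points and assists columns is only for illustration.

```r
library(dplyr)

# same example data as in this tutorial
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(106, 106, 108, 110, 209, 209, 122, 212),
                 assists=c(203, 206, 204, 202, 24, 25, 125, 119))

# count distinct 'points' and 'assists' values within each team
by_team <- df %>%
  group_by(team) %>%
  summarise(across(c(points, assists), n_distinct))
```

For this data, each team has 3 distinct points values and 4 distinct assists values.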
