Select the First Row by Group in R

When working with the dplyr package in R, you will often want to select the first row in each group. The simple syntax shown below does this.

Suppose we have the following dataset in R. Let's create it first.

df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

Now we can view the data frame:

df
   team points
1    P1     56
2    P1     94
3    P1     17
4    P1     57
5    P2     55
6    P2     15
7    P2     37
8    P2     44
9    P3     55
10   P3     32

To select the first row of each group in R, use the dplyr package as demonstrated in the code below.

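Please note that we arrange the data frame by the points variable before filtering. Because arrange() ignores grouping by default, the whole data frame is sorted by points, so the first row of each group is the one with the lowest points value, and the result is ordered by points rather than by team.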

library(dplyr)
df %>%
  group_by(team) %>%
  arrange(points) %>%
  filter(row_number()==1)
  team  points
  <chr>  <dbl>
1 P2        15
2 P1        17
3 P3        32
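The result above depends on the sort step. If what you want is the first row of each group in the data frame's existing order, the sort can be dropped. The following is a minimal sketch, assuming dplyr 1.0.0 or later (where slice_head() is available); the name first_rows is just an illustrative choice.

```r
library(dplyr)

df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

# slice_head(n = 1) keeps the first row of each group
# in the order the rows already appear in df (no sorting)
first_rows <- df %>%
  group_by(team) %>%
  slice_head(n = 1) %>%
  ungroup()

first_rows
```

Here the P1 row is the one with 56 points, because that row comes first in the original data, not because it is the smallest or largest value.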

By default, arrange() sorts the data in ascending order. However, we can easily sort the values in descending order instead by wrapping the variable in desc().

df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number()==1)
  team  points
  <chr>  <dbl>
1 P1        94
2 P2        55
3 P3        55
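When the goal is simply the lowest or highest value per group, the sort-then-filter pattern can probably be shortened. The sketch below assumes dplyr 1.0.0 or later and uses slice_min() and slice_max(); the names lowest and highest are illustrative only.

```r
library(dplyr)

df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

# Row with the smallest points value in each team;
# with_ties = FALSE keeps a single row per group even if values tie
lowest <- df %>%
  group_by(team) %>%
  slice_min(points, n = 1, with_ties = FALSE) %>%
  ungroup()

# Row with the largest points value in each team
highest <- df %>%
  group_by(team) %>%
  slice_max(points, n = 1, with_ties = FALSE) %>%
  ungroup()

lowest
highest
```

Note that with_ties defaults to TRUE, in which case a group with tied values would return more than one row.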

This code is easily adapted to select the nth row of each group: just change the condition to row_number() == n, where n is the row position you want.

For instance, you can use the following syntax to select the second row of each group:

df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number()==2)
  team  points
  <chr>  <dbl>
1 P1        57
2 P2        44
3 P3        32
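One thing to keep in mind with the nth-row pattern is that a group with fewer than n rows contributes nothing to the result. The sketch below reuses the same data and asks for the third row; the name third is illustrative only.

```r
library(dplyr)

df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

# Third-highest points value per team.
# P3 has only two rows, so no row satisfies row_number() == 3 for it
third <- df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number() == 3) %>%
  ungroup()

third
```

Only P1 and P2 appear in the output; P3 is dropped silently rather than returned as a missing value.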

Alternatively, you can use the syntax shown below to select the last row of each group. Here n() returns the number of rows in the current group.

df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number()==n())
  team  points
  <chr>  <dbl>
1 P3        32
2 P1        17
3 P2        15
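The same last-row selection can likely be written with slice_tail(). This is a minimal sketch, assuming dplyr 1.0.0 or later; the name last_rows is illustrative only.

```r
library(dplyr)

df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

# After sorting by descending points, the last row of each group
# is the row with that group's lowest points value
last_rows <- df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  slice_tail(n = 1) %>%
  ungroup()

last_rows
```

The selected rows match the filter(row_number()==n()) version, although the order in which the groups are printed may differ.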
