How to Remove Duplicates in R with Example

by finnstats

When working with data frames, one of the most common tasks is removing duplicate rows.

R offers several functions for this, including distinct, unique, and duplicated.

This tutorial describes how to remove duplicated rows from a data frame in R using the distinct function from dplyr and the duplicated and unique functions from base R.

Remove Duplicates in R

Let’s load the dplyr library and create a sample data frame.


library(dplyr)

data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

data

   Column1 Column2
1       P1       5
2       P1       5
3       P2       3
4       P3       5
5       P1       2
6       P1       3
7       P3       4
8       P4       7
9       P2      10
10      P4      14

Approach 1: Remove duplicated rows

Let’s use the distinct function from the dplyr library. With no columns specified, it compares entire rows and keeps the first occurrence of each.

distinct(data)

  Column1 Column2
1      P1       5
2      P2       3
3      P3       5
4      P1       2
5      P1       3
6      P3       4
7      P4       7
8      P2      10
9      P4      14
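As a quick sanity check, it can help to compare row counts before and after. The following is a minimal sketch that rebuilds the same example data; the name deduped is just a label chosen for this illustration.

```r
library(dplyr)

# Same example data as in this tutorial
data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

# deduped is a name chosen for this sketch
deduped <- distinct(data)

nrow(data)                  # 10 rows before
nrow(deduped)               # 9 rows after
nrow(data) - nrow(deduped)  # 1 duplicate row dropped
```

Only one row disappears because rows 1 and 2 are the only pair that match on both columns.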

Approach 2: Remove Duplicates in Column

If we want to remove duplicates based on a single column, we can pass that column name to the distinct function.

Let’s remove duplicate values from Column2.


distinct(data, Column2)
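Note that this call returns only the column you named. A small sketch on the same example data shows what comes back; only_col2 is a name made up for this illustration.

```r
library(dplyr)

# Same example data as in this tutorial
data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

# only_col2 is a name chosen for this sketch
only_col2 <- distinct(data, Column2)

names(only_col2)   # "Column2" (Column1 is dropped)
only_col2$Column2  # 5 3 2 4 7 10 14
```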

Suppose you want to remove duplicate values from Column2 but retain the corresponding values in Column1. In that case, set .keep_all = TRUE:

distinct(data, Column2, .keep_all = TRUE)

  Column1 Column2
1      P1       5
2      P2       3
3      P1       2
4      P3       4
5      P4       7
6      P2      10
7      P4      14
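With .keep_all = TRUE, the row that survives for each value is the first one encountered. The sketch below applies the same idea to Column1 instead, purely as an illustration; first_per_id is a hypothetical name, not something from the original example.

```r
library(dplyr)

# Same example data as in this tutorial
data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

# Keep the first row seen for each value of Column1
# (first_per_id is a name chosen for this sketch)
first_per_id <- distinct(data, Column1, .keep_all = TRUE)

first_per_id$Column1  # "P1" "P2" "P3" "P4"
first_per_id$Column2  # 5 3 5 7
```

If a different row should win (for example the largest Column2 per Column1), one option is to sort the data frame first so the preferred row comes first.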

Approach 3: Duplicated function

The duplicated function from base R is also handy for removing repeated rows from a data frame. It returns a logical vector that is TRUE for every row that repeats an earlier row.


duplicated(data)

 [1] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
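Because the result is a logical vector, it can also be used to count or locate duplicates before removing anything. Here is a minimal base R sketch on the same example data.

```r
# Same example data as in this tutorial (base R only)
data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

sum(duplicated(data))     # 1  (number of duplicated rows)
which(duplicated(data))   # 2  (position of the duplicated row)
anyDuplicated(data)       # 2  (index of the first duplicate, 0 if there are none)

# Show the duplicated row itself instead of removing it
data[duplicated(data), ]
```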

Let’s remove the duplicated rows by keeping only the rows that are not flagged as duplicates.

data[!duplicated(data), ]

   Column1 Column2
1       P1       5
3       P2       3
4       P3       5
5       P1       2
6       P1       3
7       P3       4
8       P4       7
9       P2      10
10      P4      14
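The same pattern can be adapted in a couple of ways. The sketch below, using only base R, shows removing duplicates based on a single column and keeping the last occurrence instead of the first via the fromLast argument. The names by_col2 and keep_last are chosen for this illustration.

```r
# Same example data as in this tutorial (base R only)
data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

# Deduplicate on Column2 only; the original row names are kept
by_col2 <- data[!duplicated(data$Column2), ]
rownames(by_col2)    # "1" "3" "5" "7" "8" "9" "10"

# Keep the last occurrence of each duplicated row instead of the first
keep_last <- data[!duplicated(data, fromLast = TRUE), ]
rownames(keep_last)  # "2" "3" "4" "5" "6" "7" "8" "9" "10"
```

Unlike distinct, this approach keeps the original row names, which is why the numbering in the output has gaps.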

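Approach 4: Unique Function

The unique function from base R returns the data frame with duplicate rows removed, keeping the first occurrence of each.

unique(data)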

   Column1 Column2
1       P1       5
3       P2       3
4       P3       5
5       P1       2
6       P1       3
7       P3       4
8       P4       7
9       P2      10
10      P4      14
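For whole-row duplicates, unique and the duplicated approach should give the same result. A short base R sketch to confirm this on the example data follows; by_unique and by_duplicated are names chosen for this illustration.

```r
# Same example data as in this tutorial (base R only)
data <- data.frame(
  Column1 = c('P1', 'P1', 'P2', 'P3', 'P1', 'P1', 'P3', 'P4', 'P2', 'P4'),
  Column2 = c(5, 5, 3, 5, 2, 3, 4, 7, 10, 14)
)

by_unique     <- unique(data)
by_duplicated <- data[!duplicated(data), ]

identical(by_unique, by_duplicated)  # TRUE
nrow(by_unique)                      # 9
```

Which one to use is largely a matter of taste: unique is shorter, while duplicated gives you the logical vector to inspect or reuse.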
