Remove duplicates in a dataframe

This snippet requires the dplyr package (part of the tidyverse).

To remove duplicate rows from a dataframe, use distinct(). A row is a duplicate when every one of its values matches those of an earlier row; distinct() keeps one copy of each such row.

library(dplyr)

# With the pipe operator
df <- df %>%
  distinct()

# Without the pipe operator
df <- distinct(df)
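
As a quick illustration, here is a minimal sketch that applies distinct() to a small made-up data frame. The contents of df and its column names (id, name) are assumptions for demonstration only.

```r
library(dplyr)

# Hypothetical example data: rows 1 and 3 are identical
df <- data.frame(
  id   = c(1, 2, 1, 3),
  name = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Keep one copy of each fully identical row
df_unique <- distinct(df)

nrow(df)         # 4 rows before
nrow(df_unique)  # 3 rows after
```

Only rows that match on every column are collapsed; rows that share an id but differ in name would both be kept.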


Subset distinct/unique rows — distinct
Select only unique/distinct rows from a data frame. This is similar to unique.data.frame() but considerably faster.
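
To make the comparison with base R concrete, the sketch below runs both unique() and distinct() on a small made-up data frame and checks that they return the same values. It also shows distinct() restricted to a single column through its .keep_all argument. The data and the column names (id, value) are hypothetical.

```r
library(dplyr)

# Hypothetical example data: rows 1 and 2 are identical
df <- data.frame(
  id    = c(1, 1, 2, 2),
  value = c("x", "x", "y", "z"),
  stringsAsFactors = FALSE
)

# Base R: unique() on a data frame dispatches to unique.data.frame()
base_result  <- unique(df)
dplyr_result <- distinct(df)

# Compare contents column by column (row names may differ between the two)
identical(base_result$id, dplyr_result$id)
identical(base_result$value, dplyr_result$value)

# Deduplicate on the id column only, keeping the first row seen for each id
by_id <- distinct(df, id, .keep_all = TRUE)
by_id
```

Without .keep_all = TRUE, distinct(df, id) would return only the id column.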

