The distinct() function in the dplyr package removes duplicate rows, judged either on selected columns or on all columns.
Data:
dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))
Remove rows where specified columns have been duplicated:
library(dplyr)
dat %>% distinct(m, .keep_all = TRUE)
m n
1 1 A
2 2 B
Remove rows which are complete duplicates of other rows:
dat %>% distinct()
m n
1 1 A
2 2 B
3 1 C
4 2 D
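Two variations on distinct() are worth knowing. This is a minimal sketch that rebuilds the same data so it runs on its own; the names only_m and by_m_n are just illustrative. Without .keep_all = TRUE, only the columns named in the call are returned, and when several columns are named, uniqueness is judged on their combination.

```r
library(dplyr)

# Same example data as above, rebuilt so this block is self-contained
dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

# Without .keep_all, the result contains only the column(s) named here
only_m <- dat %>% distinct(m)

# Naming several columns de-duplicates on the combination of their values
by_m_n <- dat %>% distinct(m, n)
```

Here only_m has a single column m with two rows, while by_m_n keeps the four distinct (m, n) pairs.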
In base R, the general approach to duplicate row removal uses duplicated():
m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)
duplicated(df)  # TRUE where a row repeats an earlier row
[1] FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE
df[duplicated(df), ]  # only the repeated rows
m n
2 A 1
6 B 1
8 C 2
df[!duplicated(df), ]  # first occurrence of each row, duplicates dropped
m n
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
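The same base R approach can de-duplicate on a subset of columns, much like distinct(m, .keep_all = TRUE). A minimal sketch, rebuilding df so the block runs on its own; first_per_m and last_per_m are illustrative names, not part of the original answer:

```r
# Same example data as above, rebuilt so this block is self-contained
m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

# Judge duplicates on column m only, keeping the first row for each value
first_per_m <- df[!duplicated(df$m), ]

# fromLast = TRUE scans from the end, so the last row for each value is kept
last_per_m <- df[!duplicated(df$m, fromLast = TRUE), ]

# unique() on a data frame gives the same result as df[!duplicated(df), ]
same_as_unique <- identical(unique(df), df[!duplicated(df), ])
```

To judge duplicates on several columns at once, pass a data frame of just those columns, for example duplicated(df[c("m", "n")]).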