remove duplicate rows in r

vinita · Answer 1 · 2019-08-01T08:00:26+0000

The distinct() function in the dplyr package removes duplicate rows, either across all columns or based only on the columns you name.

Data:

dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

Remove rows where specified columns have been duplicated:

library(dplyr)
dat %>% distinct(m, .keep_all = TRUE)
m n
1 1 A
2 2 B

Remove rows which are complete duplicates of other rows:

dat %>% distinct()
m n
1 1 A
2 2 B
3 1 C
4 2 D
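
As a small sketch of how the arguments to distinct() change the result, the snippet below rebuilds the same example data and compares two calls. The names only_m and by_m_and_n are just illustrative, and it assumes dplyr is installed.

```r
library(dplyr)

# Same example data as above.
dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

# Without .keep_all = TRUE, only the columns named in distinct() are returned.
only_m <- dat %>% distinct(m)

# Naming several columns de-duplicates on their combination.
by_m_and_n <- dat %>% distinct(m, n)
```

Here only_m should contain just the column m with one row per distinct value, while by_m_and_n keeps every distinct (m, n) pair.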

In base R, duplicated() flags every row that repeats an earlier row, so you can use it to inspect the duplicates or to drop them:

m <- c(rep("A", 3), rep("B", 3), rep("C",2))
n <- c(1,1,2,4,1,1,2,2)
df <- data.frame(m, n)
duplicated(df)
[1] FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE
df[duplicated(df), ]
m n
2 A 1
6 B 1
8 C 2
df[!duplicated(df), ]
m n
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
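
A few base R variations on the same idea, as a rough sketch rather than the only way to do it. The variable names dedup_all, first_per_m and last_kept are hypothetical; the data is the same df as above.

```r
m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

# unique() on a data frame drops fully duplicated rows,
# giving the same result as df[!duplicated(df), ].
dedup_all <- unique(df)

# De-duplicate on a single column: keep the first row for each value of m.
first_per_m <- df[!duplicated(df$m), ]

# Keep the last occurrence of each duplicated row instead of the first.
last_kept <- df[!duplicated(df, fromLast = TRUE), ]
```

With this data, first_per_m keeps rows 1, 4 and 7, and last_kept keeps rows 2, 3, 4, 6 and 8.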
