
There is a similar question for PHP, but I'm working with R and am unable to translate the solution to my problem.

I have a data frame with 10 rows and 50 columns in which some rows are completely identical. If I call unique() on it, I get one row per, let's say, "type", but what I actually want is only those rows that appear exactly once. Does anyone know how I can achieve this?

I can have a look at clusters and heatmaps to sort it out manually, but I have bigger data frames than the one mentioned above (with up to 100 rows) where this gets a bit tricky.

1 Answer


To drop every row that has a duplicate, so that only rows occurring exactly once remain, you can use the following expression:

df[!(duplicated(df) | duplicated(df, fromLast = TRUE)), ]

For example:

Date <- as.Date(c('2006-08-30', '2006-08-23', '2006-09-06', '2006-08-23', '2006-09-13', '2006-08-23', '2006-09-20'))
ID <- c("x1", "x1", "X2", "x1", "X3", "x1", "x1")
TransNo <- c("123", "124", "125", "124", "126", "124", "127")
df <- data.frame(ID, Date, TransNo)

  ID       Date TransNo
1 x1 2006-08-30     123
2 x1 2006-08-23     124
3 X2 2006-09-06     125
4 x1 2006-08-23     124
5 X3 2006-09-13     126
6 x1 2006-08-23     124
7 x1 2006-09-20     127

To get rows that occur only once:

df[!(duplicated(df) | duplicated(df, fromLast = TRUE)), ]

  ID       Date TransNo
1 x1 2006-08-30     123
3 X2 2006-09-06     125
5 X3 2006-09-13     126
7 x1 2006-09-20     127
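
To see why both duplicated() calls are needed, it may help to look at the intermediate logical vectors. The sketch below rebuilds the same sample data and splits the expression into steps; the names dup_forward, dup_backward, in_dup_group and the helper keep_singletons() are hypothetical names chosen for illustration, not part of base R.

```r
# Same sample data as in the example above.
df <- data.frame(
  ID = c("x1", "x1", "X2", "x1", "X3", "x1", "x1"),
  Date = as.Date(c("2006-08-30", "2006-08-23", "2006-09-06", "2006-08-23",
                   "2006-09-13", "2006-08-23", "2006-09-20")),
  TransNo = c("123", "124", "125", "124", "126", "124", "127")
)

# Scanning top to bottom: flags every repeat except the first occurrence.
dup_forward <- duplicated(df)

# Scanning bottom to top: flags every repeat except the last occurrence.
dup_backward <- duplicated(df, fromLast = TRUE)

# A row belongs to a duplicated group if either scan flags it.
in_dup_group <- dup_forward | dup_backward

# Hypothetical helper (not a base R function) wrapping the one-liner.
# drop = FALSE keeps the result a data frame even with a single column.
keep_singletons <- function(x) {
  x[!(duplicated(x) | duplicated(x, fromLast = TRUE)), , drop = FALSE]
}

result <- keep_singletons(df)
print(result)
```

Using duplicated(df) alone would still keep the first copy of each repeated row (row 2 here), which is the same result unique(df) gives. Combining it with the fromLast = TRUE scan is what marks that first copy as well.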
