
I have the following 2 data.frames:

a1 <- data.frame(a = 1:5, b=letters[1:5])

a2 <- data.frame(a = 1:3, b=letters[1:3])

I want to find the rows that a1 has and a2 doesn't.

Is there a built-in function for this type of operation?

(P.S. I did write a solution for it; I am simply curious whether someone has already written something more polished.)

Here is my solution:

a1 <- data.frame(a = 1:5, b=letters[1:5])

a2 <- data.frame(a = 1:3, b=letters[1:3])

rows.in.a1.that.are.not.in.a2  <- function(a1,a2)

{

    a1.vec <- apply(a1, 1, paste, collapse = "")

    a2.vec <- apply(a2, 1, paste, collapse = "")

    a1.without.a2.rows <- a1[!a1.vec %in% a2.vec,]

    return(a1.without.a2.rows)

}

rows.in.a1.that.are.not.in.a2(a1,a2)

1 Answer


To find rows in the first object that are not present in the second, you can use the compare function from the compare package. It compares two objects and, if they are not the same, attempts to transform them to see whether they are the same after being transformed.

The compare function is flexible in terms of what kind of comparisons are allowed (e.g. changing order of elements of each vector, changing the order and names of variables, shortening variables, changing the case of strings).

Arguments:

model

The “correct” object.

comparison

The object to be compared with the model.

allowAll

Allow any sort of transformation

In your case:

install.packages("compare")

library("compare")

a1 <- data.frame(a = 1:5, b = letters[1:5])

a2 <- data.frame(a = 1:3, b = letters[1:3])

To get the rows of a1 that match a2 (comp$tM is the transformed model):

comp <- compare(a1,a2,allowAll=TRUE)

comp$tM

  a b

1 1 a

2 2 b

3 3 c

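To find missing rows (note that this takes a setdiff column by column, so it relies on every column yielding the same number of unmatched values, as happens here):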

missing <- data.frame(lapply(1:ncol(a1), function(x) setdiff(a1[, x], comp$tM[, x])))

colnames(missing) <- colnames(a1)

missing

  a b

1 4 d

2 5 e
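
If you would rather compare whole rows than individual columns, here is a minimal base R sketch of that idea. It is not part of the compare package; the helper names row_key and rows_not_in are made up for illustration, and the "\r" separator assumes that character never appears in your data.

```r
a1 <- data.frame(a = 1:5, b = letters[1:5])
a2 <- data.frame(a = 1:3, b = letters[1:3])

# Hypothetical helper: build one string key per row.
# Assumption: "\r" never occurs in the data, so two different rows
# cannot collapse to the same key.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

# Hypothetical helper: keep the rows of x whose key is absent from y.
# y is indexed by names(x) so both keys use the same column order.
rows_not_in <- function(x, y) {
  x[!row_key(x) %in% row_key(y[names(x)]), , drop = FALSE]
}

missing_rows <- rows_not_in(a1, a2)
missing_rows
```

Because this subsets a1 directly, the result keeps the original row names of a1 instead of renumbering them from 1.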
