Can mutate be used when the mutation is conditional (that is, when it depends on the values in certain columns)?

This example helps to show what I mean.

df <- structure(list(a = c(1, 3, 4, 6, 3, 2, 5, 1), b = c(1, 3, 4, 
2, 6, 7, 2, 6), c = c(6, 3, 6, 5, 3, 6, 5, 3), d = c(6, 2, 4, 
5, 3, 7, 2, 6), e = c(1, 2, 4, 5, 6, 7, 6, 3), f = c(2, 3, 4, 
2, 2, 7, 5, 2)), .Names = c("a", "b", "c", "d", "e", "f"), row.names = c(NA, 
8L), class = "data.frame")

  a b c d e f
1 1 1 6 6 1 2
2 3 3 3 2 2 3
3 4 4 6 4 4 4
4 6 2 5 5 5 2
5 3 6 3 3 6 2
6 2 7 6 7 7 7
7 5 2 5 2 6 5
8 1 6 3 6 3 2

I was hoping to solve this with the dplyr package by creating a new column g. I know the code below will not work as written, but I think it makes the purpose clear:

library(dplyr)

df <- mutate(df,
         if (a == 2 | a == 5 | a == 7 | (a == 1 & b == 4)){g = 2},
         if (a == 0 | a == 1 | a == 4 | a == 3 |  c == 4) {g = 3})

The result I am looking for in this particular example is:

  a b c d e f  g
1 1 1 6 6 1 2  3
2 3 3 3 2 2 3  3
3 4 4 6 4 4 4  3
4 6 2 5 5 5 2 NA
5 3 6 3 3 6 2 NA
6 2 7 6 7 7 7  2
7 5 2 5 2 6 5  2
8 1 6 3 6 3 2  3

Does anyone have an idea about how to do this in dplyr? This data frame is just an example, the data frames I am dealing with are much larger. Because of its speed I tried to use dplyr, but perhaps there are other, better ways to handle this problem?

1 Answer

You can use the case_when function from the dplyr package in the mutate function to get the desired output.

In your case:

library(dplyr)

df <- structure(list(a = c(1, 3, 4, 6, 3, 2, 5, 1), 
               b = c(1, 3, 4, 2, 6, 7, 2, 6), 
               c = c(6, 3, 6, 5, 3, 6, 5, 3), 
               d = c(6, 2, 4, 5, 3, 7, 2, 6), 
               e = c(1, 2, 4, 5, 6, 7, 6, 3), 
               f = c(2, 3, 4, 2, 2, 7, 5, 2)),
          .Names = c("a", "b", "c", "d", "e", "f"), 
          row.names = c(NA, 8L), class = "data.frame")

# Conditions are checked in order; the first one that is TRUE decides g
df %>% mutate(g = case_when(
  a == 2 | a == 5 | a == 7 | (a == 1 & b == 4) ~ 2,
  a == 0 | a == 1 | a == 4 | a == 3 | c == 4 ~ 3,
  TRUE ~ NA_real_
))
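If the long chains of == comparisons get hard to read, the same two rules can probably be written with %in%. The following is only a sketch of that idea, not a different method: it rebuilds just the columns a, b and c with data.frame(), and the name result is my own choice for illustration.

```r
library(dplyr)

# Only the columns used in the conditions, copied from the question's data
df <- data.frame(a = c(1, 3, 4, 6, 3, 2, 5, 1),
                 b = c(1, 3, 4, 2, 6, 7, 2, 6),
                 c = c(6, 3, 6, 5, 3, 6, 5, 3))

# `result` is a hypothetical name; the rules are the same as in the answer,
# with the repeated a == ... tests collapsed into %in%
result <- df %>%
  mutate(g = case_when(
    a %in% c(2, 5, 7) | (a == 1 & b == 4) ~ 2,
    a %in% c(0, 1, 3, 4) | c == 4         ~ 3,
    TRUE                                  ~ NA_real_
  ))

print(result$g)
```

With this data, g comes out as 3, 3, 3, NA, 3, 2, 2, 3. Note that row 5 has a == 3, so it matches the second rule and gets 3 rather than the NA shown in the question's expected table. If NA really is wanted there, the second condition would need to be narrowed.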

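NA has to be written as NA_real_ because case_when requires the values on the right-hand side of every formula to be of the same type: a bare NA is logical, while 2 and 3 are numeric. Recent dplyr releases (1.1.0 and later) relax this and also accept a plain NA.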
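To make the type issue concrete, here is a small sketch on a made-up vector x (not part of the question's data). It assumes nothing beyond base R's typeof() and dplyr's case_when(); the variable names are only for illustration.

```r
library(dplyr)

x <- c(1, 5, 9)  # hypothetical example values

# A bare NA is logical, while NA_real_ matches the numeric results 2 and 3
na_plain_type <- typeof(NA)        # "logical"
na_real_type  <- typeof(NA_real_)  # "double"

# Typed missing value as the fallback: this form works across dplyr versions
res_typed <- case_when(
  x < 3 ~ 2,
  x < 7 ~ 3,
  TRUE  ~ NA_real_
)

# Leaving out the fallback entirely: rows that match no condition become NA
res_default <- case_when(
  x < 3 ~ 2,
  x < 7 ~ 3
)
```

Both calls give 2, 3, NA as a numeric vector, so the explicit TRUE ~ NA_real_ line is mainly there to make the intent visible. For a character result the matching typed value would be NA_character_.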
