I have two data frames and want to recode the first using values from the second. The first data frame (df1) holds the responses to a survey, and the second (df2) is the data dictionary for df1.

The data looks like this:

df1 <- data.frame(a = c(1, 2, 3),
                  b = c(4, 5, 6),
                  c = c(7, 8, 9))

df2 <- data.frame(columnIndicator = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
                  df1_value = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                  new_value = c("a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3"))

I know I can recode df1 manually to get the expected output, like this:

df1 <- within(df1, {
  # New values follow the new_value column of df2
  a[a == 1] <- "a1"
  a[a == 2] <- "a2"
  a[a == 3] <- "a3"
  b[b == 4] <- "b1"
  b[b == 5] <- "b2"
  b[b == 6] <- "b3"
  c[c == 7] <- "c1"
  c[c == 8] <- "c2"
  c[c == 9] <- "c3"
})

But my real data set has 42 columns that need to be recoded, so this method is time-consuming. Is there another way in R to recode the values in df1 using the values in df2?

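I think you can do this fairly easily in R with the tidyr and dplyr packages: reshape df1 to long format, then join it to the data dictionary.

library(dplyr)
library(tidyr)  # gather() comes from tidyr, not dplyr

# Reshape df1 to long format: one row per (column name, value) pair
df3 <- df1 %>% gather(key = "key", value = "value")

# Look up each (column name, value) pair in the data dictionary
df3 %>% inner_join(df2, by = c("key" = "columnIndicator", "value" = "df1_value"))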

And the output is:

  key value new_value
1   a     1        a1
2   a     2        a2
3   a     3        a3
4   b     4        b1
5   b     5        b2
6   b     6        b3
7   c     7        c1
8   c     8        c2
9   c     9        c3
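
Note that the join returns the data in long format, not in the original shape of df1. If you need df1 itself recoded column by column, here is a minimal base R sketch that looks up each column in the dictionary with match(). The helper name recode_column and the result name df1_recoded are hypothetical, chosen only for illustration, and the sketch assumes each (columnIndicator, df1_value) pair appears at most once in df2.

```r
# Sample data from the question
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))
df2 <- data.frame(
  columnIndicator = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  df1_value = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  new_value = c("a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3"),
  stringsAsFactors = FALSE  # keep the labels as character vectors
)

# Hypothetical helper: recode one column using its rows of the dictionary.
# Values with no dictionary entry become NA.
recode_column <- function(values, column_name, dictionary) {
  rows <- dictionary[dictionary$columnIndicator == column_name, ]
  rows$new_value[match(values, rows$df1_value)]
}

# Recode only the columns that the dictionary describes
df1_recoded <- df1
for (col in intersect(names(df1), unique(df2$columnIndicator))) {
  df1_recoded[[col]] <- recode_column(df1[[col]], col, df2)
}

df1_recoded
```

Because the loop runs over the column names found in both df1 and df2, the same code should carry over to the 42-column case without changes, as long as the dictionary covers those columns.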