
I have an R data frame containing a factor that I want to "expand" so that for each factor level, there is an associated column in a new data frame, which contains a 1/0 indicator. E.g., suppose I have:

df.original <- data.frame(eggs = c("foo", "foo", "bar", "bar"), ham = c(1, 2, 3, 4))

I want:

df.desired <- data.frame(foo = c(1, 1, 0, 0), bar = c(0, 0, 1, 1), ham = c(1, 2, 3, 4))

Because certain analyses (e.g., principal component analysis) require a completely numeric data frame, I thought this feature might be built in. Writing a function to do this shouldn't be too hard, but I can foresee some challenges with column names, and if something exists already, I'd rather use that.

1 Answer


To get the desired output, you can use the model.matrix function from the stats package. It creates a design (or model) matrix by expanding factors into a set of dummy variables (which ones depends on the contrasts) and expanding interactions in the same way.

In your case:

model.matrix( ~ eggs - 1, data=df.original )

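The "- 1" removes the intercept, so every level of eggs gets its own indicator column. To assemble the full data frame, combine the result with the remaining column:

df.desired <- data.frame(model.matrix(~ eggs - 1, data = df.original), ham = df.original$ham)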

> head(df.desired)
  eggsbar eggsfoo ham
1       0       1   1
2       0       1   2
3       1       0   3
4       1       0   4
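
The question also raised column names: model.matrix prefixes each level with the variable name (eggsbar, eggsfoo). If plain level names are preferred, one possible approach is to strip the prefix afterwards. The following is a minimal sketch that assumes the df.original from the question; the intermediate name dummies is only illustrative.

```r
df.original <- data.frame(eggs = c("foo", "foo", "bar", "bar"), ham = c(1, 2, 3, 4))

# One indicator column per level of eggs; "- 1" drops the intercept so
# that no level is absorbed into a baseline.
dummies <- model.matrix(~ eggs - 1, data = df.original)

# Strip the "eggs" prefix that model.matrix prepends to each level name.
colnames(dummies) <- sub("^eggs", "", colnames(dummies))

# Combine the indicators with the remaining column of the original data frame.
df.desired <- data.frame(dummies, ham = df.original$ham)
df.desired
```

Note that the indicator columns follow the factor's level order (alphabetical by default, so bar comes before foo), not the order in which the values first appear.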
