
Say I have a pandas column named Type that contains three categories (type1, type2 and type3), and I create dummy variables for it as follows:

type_dummies = pd.get_dummies(df["Type"], prefix="type")

Then, after dropping the original column and joining the dummies to the main DataFrame, the resulting df looks something like this:

df.drop(['Type'], axis=1, inplace=True)

df = df.join(type_dummies)


type_type1    type_type2    type_type3

   1              0             0

   0              1             0

   0              0             1
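The steps above can be sketched as one self-contained snippet. The sample data and the extra Value column are assumptions made only for illustration, and dtype=int is passed so that the dummies print as 0/1 (recent pandas versions return booleans by default).

```python
import pandas as pd

# Hypothetical sample data; "Value" is only here so the frame has a second column.
df = pd.DataFrame({"Value": [10, 20, 30], "Type": ["type1", "type2", "type3"]})

# One indicator column per category that actually occurs in the data.
type_dummies = pd.get_dummies(df["Type"], prefix="type", dtype=int)

# Replace the original column with its dummy columns.
df = df.drop(columns=["Type"]).join(type_dummies)

print(list(df.columns))
```

Only three dummy columns are produced here, because get_dummies() sees only the three categories present in the column.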

But what if my training set contains another category, type4, in the Type column? How can I use get_dummies() to generate as many dummy variables as I want? That is, in this case I want to generate 4 dummy variables even though only 3 categories appear in the column.

1 Answer


You can convert the column to the category data type and list every expected category explicitly, as shown in the code below:

df["Type"] = df["Type"].astype(pd.CategoricalDtype(categories=['type1','type2','type3','type4']))



0  type1

1  type2

2  type3

pd.get_dummies(df["Type"], prefix="type")


      type_type1    type_type2   type_type3  type_type4

0         1             0            0          0

1         0             1            0          0

2         0             0            1          0
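For reference, here is a minimal runnable sketch of this approach. It assumes a reasonably recent pandas, where the category list is supplied through pd.CategoricalDtype; the sample frame and the all_types name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with only three of the four possible categories.
df = pd.DataFrame({"Type": ["type1", "type2", "type3"]})

# Declare every category that may occur, including ones absent from this frame.
all_types = ["type1", "type2", "type3", "type4"]
df["Type"] = df["Type"].astype(pd.CategoricalDtype(categories=all_types))

# get_dummies() emits one column per declared category, observed or not.
# dtype=int gives 0/1 values instead of the boolean default of recent versions.
dummies = pd.get_dummies(df["Type"], prefix="type", dtype=int)

print(dummies)
```

Applying the same category list to both the training and the test data should give both sets the same dummy columns in the same order.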

