
Say I have a pandas column like the one below:

Type

type1

type2

type3

Now I create dummy variables for it as follows:

type_dummies = pd.get_dummies(df["Type"], prefix="type")

After joining it with the main DataFrame, the resulting df looks something like this:

df.drop(['Type'], axis=1, inplace=True)

df = df.join(type_dummies)


type_type1    type_type2    type_type3

   1              0             0

   0              1             0

   0              0             1

But what if my training set contains another category, type4, in the Type column? How can I use get_dummies() to generate as many dummy columns as I need? That is, in this case I want 4 dummy variables even though only 3 categories appear in the column.

1 Answer


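You can convert the column to the category data type and declare every category explicitly, including the ones that do not appear in the data, as in the code below: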

df["Type"] = df["Type"].astype(pd.CategoricalDtype(categories=['type1','type2','type3','type4']))



0  type1

1  type2

2  type3

pd.get_dummies(df["Type"], prefix="type")


      type_type1    type_type2   type_type3  type_type4

0         1             0            0          0

1         0             1            0          0

2         0             0            1          0
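
For reference, here is a minimal end-to-end sketch of the approach above, assuming a reasonably recent pandas. The sample DataFrame and the names ALL_TYPES and type_dtype are made up for illustration and are not from the question. Newer pandas releases return boolean dummy columns by default, so dtype=int is passed to get the 0/1 values shown in the tables.

```python
import pandas as pd

# Full list of categories the model should know about, including
# 'type4', which does not occur in this particular sample.
# (ALL_TYPES and type_dtype are illustrative names.)
ALL_TYPES = ["type1", "type2", "type3", "type4"]
type_dtype = pd.CategoricalDtype(categories=ALL_TYPES)

# Hypothetical sample data mirroring the question.
df = pd.DataFrame({"Type": ["type1", "type2", "type3"]})

# Cast to the categorical dtype so pandas knows about the unseen category.
df["Type"] = df["Type"].astype(type_dtype)

# One dummy column is created per declared category, used or not.
# dtype=int requests 0/1 columns instead of booleans.
type_dummies = pd.get_dummies(df["Type"], prefix="type", dtype=int)

# Replace the original column with its dummy columns.
df = df.drop(columns=["Type"]).join(type_dummies)
print(df)
```

Applying the same type_dtype to both the training and the test DataFrame before calling get_dummies() should give both the same set of dummy columns, regardless of which categories each one happens to contain.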
