0 votes
2 views
in Data Science by (17.6k points)

I need to obtain the type for each column to properly preprocess it.

Currently I do this via the following method:

import pandas as pd

# input is of type List[List[Any]],
# but has one type (int, float, str, bool) per column
df = pd.DataFrame(input, columns=key_labels)
column_types = dict(df.dtypes)
matrix = df.values

Since I only use pandas to obtain the per-column dtypes and use numpy for everything else, I want to cut pandas from my project.

In summary: Is there a way to obtain (specific) dtypes per column from numpy?

Or: Is there a fast way to recompute the dtype of an ndarray (after splicing the matrix)?

1 Answer

0 votes
by (36.8k points)

If you want to change the data type of a particular column you can use pandas:

In the example below, I am using the customer-churn dataset.

import pandas as pd

df_data = pd.read_csv("Customer-Churn.csv")

I am checking the data type of a particular column:

In [18]: df_data['PaymentMethod'].dtype

Out[18]: dtype('O')

Changing the column dtype to bool:

In [19]: df_data['PaymentMethod'] = df_data['PaymentMethod'].astype(bool)
    ...: df_data['PaymentMethod'].dtype

Out[19]: dtype('bool')
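One caveat about casting an object (string) column to bool: the cast follows Python truthiness, so every non-empty string becomes True. The sketch below uses a small hypothetical stand-in for the column, since the contents of Customer-Churn.csv are not shown here; the values and the name `is_mailed` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the 'PaymentMethod' column (object dtype, like dtype('O') above)
df_demo = pd.DataFrame(
    {"PaymentMethod": pd.Series(["Electronic check", "Mailed check", ""], dtype=object)}
)

# astype(bool) applies bool() to each value: non-empty strings -> True, "" -> False
as_bool = df_demo["PaymentMethod"].astype(bool)

# If a meaningful flag is wanted, an explicit comparison is usually clearer
is_mailed = df_demo["PaymentMethod"] == "Mailed check"
```

So the cast changes the dtype, but the resulting values only say whether the string was empty, not which payment method it was.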

I hope this will help you.
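Since the question asks for a way to do this without pandas, here is a minimal NumPy-only sketch. It assumes the input really is a list of rows with one consistent Python type per column, as described in the question; `rows` is a hypothetical stand-in for the question's `input`, and the sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: one type (int, float, str, bool) per column
rows = [[1, 2.5, "a", True], [2, 3.5, "b", False]]
key_labels = ["id", "score", "name", "flag"]

# Transpose rows into columns and let NumPy infer a dtype for each column
column_types = {
    label: np.array(col).dtype
    for label, col in zip(key_labels, zip(*rows))
}

# The full matrix has to be dtype=object to hold mixed column types
matrix = np.array(rows, dtype=object)

# Recompute a specific dtype for one column after slicing the object matrix
recomputed = np.array(matrix[:, 1].tolist()).dtype
```

The exact integer width NumPy picks is platform-dependent, and a column that mixes types (for example int and str) will be promoted to a common dtype rather than raising an error, so this only gives clean results when each column is homogeneous.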


...