in Data Science by (17.6k points)

I would like to create views or dataframes from an existing dataframe based on column selections.

For example, I would like to create a dataframe df2 from an existing dataframe df that holds all of its columns except two. I tried the following, but it didn't work:

import numpy as np
import pandas as pd

# Create a dataframe with columns A, B, C and D
df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))

# Try to create a second dataframe df2 from df with all columns except 'B' and 'D'
my_cols = set(df.columns)
my_cols.remove('B').remove('D')

# This returns an error ("unhashable type: set")
df2 = df[my_cols]

What am I doing wrong? More generally, what mechanisms does pandas provide for selecting or excluding arbitrary sets of columns from a dataframe?

1 Answer

by (41.4k points)

You can either select the columns you need or drop the ones you don't.

# Drop by position (columns 1 and 2 are 'B' and 'C'); returns a new dataframe
df1 = df.drop(df.columns[[1, 2]], axis=1)

# Drop by name
df1 = df.drop(['B', 'C'], axis=1)

# Or select the ones you want
df1 = df[['A', 'D']]
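
As for what goes wrong in the question's code: set.remove() modifies the set in place and returns None, so chaining a second .remove() onto it fails before the indexing line is ever reached. Indexing a dataframe with a set is also a problem in itself, so it is safer to pass a list. Below is a rough sketch of a few ways to exclude 'B' and 'D' specifically; the names excluded, keep and df2 to df5 are only illustrative, and the drop(columns=...) keyword assumes a reasonably recent pandas version.

```python
import numpy as np
import pandas as pd

# Same shape as the question's frame: 100 rows, columns A, B, C, D
df = pd.DataFrame(np.random.randn(100, 4), columns=list("ABCD"))

excluded = ["B", "D"]  # columns to leave out

# 1. Build a list (not a set) of the columns to keep, preserving their order
keep = [col for col in df.columns if col not in excluded]
df2 = df[keep]

# 2. Drop by name; this returns a new dataframe and leaves df unchanged
df3 = df.drop(columns=excluded)

# 3. Index.difference returns the remaining labels (in sorted order)
df4 = df[df.columns.difference(excluded)]

# 4. Boolean mask over the column labels
df5 = df.loc[:, ~df.columns.isin(excluded)]
```

All four leave df untouched and give a dataframe with only columns 'A' and 'C'. Note that option 3 sorts the labels, so options 1, 2 and 4 are the better fit if the original column order matters.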
