
I have a DataFrame, my_df, with column names 'a', 'b', 'c' ... 'z'.


Index(['a', 'b', 'c', ... 'y', 'z'], dtype='object', name=0)

I have a function that determines which columns should be displayed. For example:

start = con_start()
stop = con_stop()
# Compare the column labels on both sides, and print the combined mask
print((my_df.columns >= start) & (my_df.columns <= stop))

My result is:

[False False ... False False False False  True  True  True  True False False]

My goal is to display the DataFrame with only the columns that satisfy my condition. If start = 'a' and stop = 'b', I want to get:

                                       a              b
index1       index2
New York     New York           0.000000       0.000000
California   Los Angeles   207066.666667  214466.666667
Illinois     Chicago       138400.000000  143633.333333
Pennsylvania Philadelphia   53000.000000   53633.333333
Arizona      Phoenix       111833.333333  114366.666667

1 Answer


There are two ways to do this, and each relies on an assumption about the columns:

1. Using iloc with positional slicing.

This assumes that my_df.columns.is_unique evaluates to True and that the columns are already in order.

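# Translate the start and stop labels into integer positions
start = my_df.columns.get_loc(con_start())
stop = my_df.columns.get_loc(con_stop())
# iloc slices exclude the end position, hence the + 1
my_df.iloc[:, start:stop + 1]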
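
The positional approach can be tried end to end with the sketch below. It is a minimal illustration only: the small frame is made-up data, and con_start() and con_stop() are hypothetical stand-ins, since the question does not show the real functions.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's functions (the real ones are not shown).
def con_start():
    return "b"

def con_stop():
    return "d"

# Made-up data with unique column labels that are already in order.
my_df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("abcde"))

# get_loc returns a single integer position only when the label is unique.
start = my_df.columns.get_loc(con_start())
stop = my_df.columns.get_loc(con_stop())

# iloc slices exclude the end position, so add 1 to keep the stop column.
iloc_result = my_df.iloc[:, start:stop + 1]
print(iloc_result.columns.tolist())  # ['b', 'c', 'd']
```

If a label appears more than once, get_loc returns a slice or a boolean mask instead of an integer, which is why the uniqueness assumption matters here.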


2. Using loc with a boolean mask.

This assumes only that the column labels can be compared with start and stop; the columns do not need to be in order.

start = con_start()
stop = con_stop()
c = my_df.columns.values
# True for every column label that falls between start and stop, inclusive
m = (start <= c) & (stop >= c)
my_df.loc[:, m]
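
The boolean-mask approach can be sketched the same way. Again this is only an illustration: the data is made up, con_start() and con_stop() are hypothetical stand-ins, and the mask is built by comparing the column Index directly rather than its .values array, which selects the same columns.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's functions (the real ones are not shown).
def con_start():
    return "b"

def con_stop():
    return "d"

# Made-up data; the columns are deliberately out of order.
my_df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("ecadb"))

start = con_start()
stop = con_stop()

# Element-wise comparison of the labels gives one boolean per column.
cols = my_df.columns
mask = (cols >= start) & (cols <= stop)

# loc keeps the columns where the mask is True, in their existing order.
loc_result = my_df.loc[:, mask]
print(loc_result.columns.tolist())  # ['c', 'd', 'b']
```

Because the mask tests each label on its own, this version keeps working when the columns are unsorted, at the cost of returning them in their existing order rather than as a contiguous range.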

