
I have a DataFrame, my_df, whose columns are named 'a', 'b', 'c', ..., 'z':


Index(['a', 'b', 'c', ... 'y', 'z'],
      dtype='object', name=0)

I have functions that determine which columns should be displayed. For example:

start = con_start()

stop = con_stop()

print((my_df.columns >= start) & (my_df.columns <= stop))

My result is:

[False False ... False False False False  True  True
  True  True False False]

My goal is to display the DataFrame with only the columns that satisfy my condition. If start = 'a' and stop = 'b', I want to get:

                                      a              b
index1       index2
New York     New York          0.000000       0.000000
California   Los Angeles  207066.666667  214466.666667
Illinois     Chicago      138400.000000  143633.333333
Pennsylvania Philadelphia  53000.000000   53633.333333
Arizona      Phoenix      111833.333333  114366.666667

1 Answer


To make this robust, it helps to state the assumptions behind each approach:

1. Using iloc with positional slicing.

This assumes my_df.columns.is_unique evaluates to True, so that get_loc returns a single integer position, and that the columns are already in order.

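# Translate the boundary labels into integer positions
start = my_df.columns.get_loc(con_start())
stop = my_df.columns.get_loc(con_stop())

# iloc slices exclude the end position, so add 1 to keep the stop column
my_df.iloc[:, start:stop + 1]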
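As a rough, self-contained illustration of the iloc approach: the small DataFrame and the bodies of con_start() and con_stop() below are made-up stand-ins, since the question does not show the real ones.

```python
import pandas as pd

# Hypothetical stand-ins for the question's con_start() / con_stop()
def con_start():
    return "b"

def con_stop():
    return "d"

# Toy data with unique, already ordered column labels (an assumption)
my_df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
    columns=list("abcde"),
)

# get_loc only returns a plain integer when the labels are unique
assert my_df.columns.is_unique

start = my_df.columns.get_loc(con_start())
stop = my_df.columns.get_loc(con_stop())

# Positional slices exclude the end, hence stop + 1
subset = my_df.iloc[:, start:stop + 1]
print(list(subset.columns))  # ['b', 'c', 'd']
```

If the columns were not in order, the slice would return whatever happens to sit between the two positions, which is why the ordering assumption matters here.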


2. Using loc with a boolean mask.

This assumes the column labels can be compared with start and stop (for example, they are all strings). The columns do not need to be sorted.

start = con_start()
stop = con_stop()

# Build a boolean mask over the column labels
c = my_df.columns.values
m = (start <= c) & (stop >= c)

# Keep only the columns where the mask is True
my_df.loc[:, m]
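A minimal sketch of the boolean-mask approach on toy data. The column order is deliberately shuffled to show that the mask does not depend on it; the DataFrame and the start and stop values are invented for the example, and to_numpy() is used here in place of .values to get a plain array of labels.

```python
import pandas as pd

# Toy data with columns deliberately out of order (an assumption)
my_df = pd.DataFrame([[1, 2, 3, 4]], columns=["d", "a", "c", "b"])

# Hypothetical boundary labels; in the question they come from
# con_start() and con_stop()
start, stop = "b", "c"

# Compare every label against the boundaries element by element
c = my_df.columns.to_numpy()
m = (c >= start) & (c <= stop)

# The mask keeps matching columns in their original order
subset = my_df.loc[:, m]
print(m.tolist())            # [False, False, True, True]
print(list(subset.columns))  # ['c', 'b']
```

Because each label is tested independently, this version also tolerates duplicate column names, at the cost of requiring that the labels be mutually comparable.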
