Python pandas -> select columns by a condition on column names

Question

asked Jul 11, 2019 in Data Science by sourav (17.6k points)

I have a DataFrame with column names 'a', 'b', 'c' ... 'z'.

print(my_df.columns)
Index(['a', 'b', 'c', ... 'y', 'z'],
dtype='object', name=0)

I have functions that determine which columns should be displayed. For example:

start = con_start()
stop = con_stop()
print((my_df.columns >= start) & (my_df.columns <= stop))

My result is:

[False False ... False False False False True True
True True False False]

My goal is to display the DataFrame with only the columns that satisfy my condition. If start = 'a' and stop = 'b', I want to get:

0                                      a              b
index1       index2
New York     New York           0.000000       0.000000
California   Los Angeles   207066.666667  214466.666667
Illinois     Chicago       138400.000000  143633.333333
Pennsylvania Philadelphia   53000.000000   53633.333333
Arizona      Phoenix       111833.333333  114366.666667

1 Answer

Shlok Pandey · Answer 1 · 2019-07-20T07:22:14+0000

There are two ways to do this, and each relies on an assumption about the columns:

1. Using iloc with positional slicing.

This assumes my_df.columns.is_unique evaluates to True and that the columns are already in order.

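# get_loc returns an integer position when column labels are unique
start = my_df.columns.get_loc(con_start())
stop = my_df.columns.get_loc(con_stop())
# iloc excludes the end position, so add 1 to include the stop column
my_df.iloc[:, start:stop + 1]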
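
As a rough, self-contained sketch of the positional approach: the con_start() and con_stop() functions below are hypothetical stand-ins for the asker's functions, and the frame and its values are made up purely for illustration.

```python
import pandas as pd


# Hypothetical stand-ins for the asker's con_start() / con_stop()
def con_start():
    return "b"


def con_stop():
    return "d"


# Illustrative data only; columns are unique and already sorted
df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
    columns=list("abcde"),
)

# The approach depends on unique labels: with duplicates,
# get_loc may return a slice or boolean mask instead of an int
assert df.columns.is_unique

start = df.columns.get_loc(con_start())  # integer position of "b"
stop = df.columns.get_loc(con_stop())    # integer position of "d"

# iloc slicing excludes the end position, hence stop + 1
selected = df.iloc[:, start:stop + 1]
print(selected.columns.tolist())  # ['b', 'c', 'd']
```

Note that this selects by position, so if the columns were not sorted it would return whatever happens to sit between the two labels.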

2. Using loc with a boolean mask.

This assumes the column labels can be compared with start and stop (for example, all of them are strings). The columns do not need to be sorted.

start = con_start()
stop = con_stop()
c = my_df.columns.values
# True for every column label that falls between start and stop, inclusive
m = (start <= c) & (stop >= c)
my_df.loc[:, m]
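
A minimal sketch of the mask approach on deliberately unsorted columns, again with hypothetical con_start() / con_stop() stand-ins and made-up data. It compares the column Index directly rather than its .values array; both give an element-wise boolean result.

```python
import pandas as pd


# Hypothetical stand-ins for the asker's con_start() / con_stop()
def con_start():
    return "b"


def con_stop():
    return "d"


# Illustrative data only; columns are intentionally out of order
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=list("daecb"))

cols = df.columns
# Element-wise comparison: keep labels with "b" <= label <= "d"
mask = (cols >= con_start()) & (cols <= con_stop())

# loc with a boolean mask keeps matching columns in their original order
selected = df.loc[:, mask]
print(selected.columns.tolist())  # ['d', 'c', 'b']
```

Because the mask tests each label on its own, the result does not depend on column order, unlike the iloc version.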

