Despite there being at least two good tutorials on how to index a DataFrame in Python's pandas library, I still can't work out an elegant way of SELECTing on more than one column.

>>> d = pd.DataFrame({'x':[1, 2, 3, 4, 5], 'y':[4, 5, 6, 7, 8]})
>>> d
   x  y
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8

>>> d[d['x']>2] # This works fine
   x  y
2  3  6
3  4  7
4  5  8

>>> d[d['x']>2 & d['y']>7] # I had expected this to work, but it doesn't
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I have found (what I think is) a rather inelegant way of doing it, like this:

>>> d[d['x']>2][d['y']>7]

But it's not pretty, and it scores fairly low for readability (I think).

Is there a better, more Python-tastic way?

1 Answer


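It is just an operator precedence issue. In Python, & binds more tightly than comparison operators such as >, so d['x']>2 & d['y']>7 is parsed as d['x'] > (2 & d['y']) > 7. That chained comparison asks Python for the truth value of a whole Series, which is what raises the ValueError.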
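The precedence rule can be seen without pandas at all. The sketch below uses plain integers (the values 3 and 8 are picked only for illustration and are not from the question) to show that the unparenthesized expression is grouped differently from what was intended:

```python
x, y = 3, 8  # illustrative values, not taken from the question

# & binds tighter than >, so this is parsed as x > (2 & y) > 7
unparenthesized = x > 2 & y > 7
explicit_parse = x > (2 & y) > 7

# With parentheses, each comparison is evaluated before & combines them
intended = (x > 2) & (y > 7)

print(unparenthesized, explicit_parse, intended)
```

The first two results agree with each other and differ from the third, which is the grouping the question was after.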

You have to add parentheses around each condition to make your multi-condition test work:

d[(d['x']>2) & (d['y']>7)]
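If the parenthesized one-liner still feels hard to read, here is a small sketch of two equivalent spellings, using the same DataFrame as in the question. The names x_mask and y_mask are only illustrative; DataFrame.loc and DataFrame.query are standard pandas methods.

```python
import pandas as pd

d = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [4, 5, 6, 7, 8]})

# Parenthesize each comparison so & combines two boolean Series
both = d[(d['x'] > 2) & (d['y'] > 7)]

# Naming the masks first can read better when there are several conditions
x_mask = d['x'] > 2
y_mask = d['y'] > 7
named = d.loc[x_mask & y_mask]

# query() takes the condition as a string, where the keyword `and` is allowed
queried = d.query('x > 2 and y > 7')

print(both)
```

All three select the same rows in one step, unlike d[d['x']>2][d['y']>7], which filters twice and applies the second mask to an already filtered frame.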
