Despite there being at least two good tutorials on how to index a DataFrame in Python's pandas library, I still can't work out an elegant way of SELECTing on more than one column.
>>> import pandas as pd
>>> d = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [4, 5, 6, 7, 8]})
>>> d
   x  y
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8
>>> d[d['x']>2] # This works fine
   x  y
2  3  6
3  4  7
4  5  8
>>> d[d['x']>2 & d['y']>7] # I had expected this to work, but it doesn't
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I have found (what I think is) a rather inelegant way of doing it, like this:
>>> d[d['x']>2][d['y']>7]
But it's not pretty, and it scores fairly low for readability (I think).
Is there a better, more Python-tastic way?
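One possible approach, offered as a sketch rather than a definitive answer: in Python, `&` binds more tightly than `>`, so the unparenthesized expression above is read as `d['x'] > (2 & d['y']) > 7`, a chained comparison that needs the truth value of a whole column. Wrapping each comparison in parentheses avoids that. The names `both` and `both_query` are made up for illustration, and `DataFrame.query` is shown only as an alternative spelling of the same filter.

```python
import pandas as pd

d = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [4, 5, 6, 7, 8]})

# Parenthesize each comparison so that & combines two boolean Series
# instead of being applied to 2 and d['y'] first.
both = d[(d['x'] > 2) & (d['y'] > 7)]

# The same filter written as a query string (illustrative alternative).
both_query = d.query('x > 2 and y > 7')

print(both)
```

With the sample data, only the row where x is 5 and y is 8 satisfies both conditions. Unlike `d[d['x']>2][d['y']>7]`, this builds a single boolean mask and indexes once.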