How to take column-slices of dataframe in pandas

Question

asked Oct 10, 2019 in Python by Sammy (47.6k points)

I load some machine learning data from a CSV file. The first 2 columns are observations and the remaining columns are features.

Currently, I do the following:

data = pandas.read_csv('mydata.csv')

which gives something like:

data = pandas.DataFrame(np.random.rand(10,5), columns = list('abcde'))

I'd like to slice this dataframe in two data frames: one containing the columns a and b and one containing the columns c, d and e.

It is not possible to write something like

observations = data[:'c']
features = data['c':]

I'm not sure what the best method is. Do I need a pd.Panel?

By the way, I find dataframe indexing pretty inconsistent: data['a'] is permitted, but data[0] is not. On the other side, data['a':] is not permitted but data[0:] is. Is there a practical reason for this? This is really confusing if columns are indexed by Int, given that data[0] != data[0:1]

1 Answer

Vishal · Answer 1 · 2019-10-10T13:32:44+0000

To take column-slices of dataframe in pandas .loc uses label based indexing to select both rows and columns. The labels being the values of the index or the columns. Slicing with .loc includes the last element.

How to take column-slices of dataframe in pandas

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources