0 votes
in Data Science by (50.2k points)

I've been very confused about how axes are defined in pandas, and whether they refer to a DataFrame's rows or columns. Consider the code below:

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]], columns=["col1", "col2", "col3", "col4"])
>>> df
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3

So if we call df.mean(axis=1), we'll get a mean across the rows:

>>> df.mean(axis=1)
0    1.0
1    2.0
2    3.0
dtype: float64

However, if we call df.drop(name, axis=1), we drop a column, not a row:

>>> df.drop("col4", axis=1)
   col1  col2  col3
0     1     1     1
1     2     2     2
2     3     3     3

Can someone help me understand what is meant by an "axis" in pandas/numpy/scipy?

As a side note, DataFrame.mean might be defined wrong. The documentation for DataFrame.mean says that axis=1 is supposed to mean a mean over the columns, not the rows...

1 Answer

0 votes
by (108k points)

Just remember it as 0=down and 1=across.

This means:

  • Use axis=0 to apply a method down each column, or to the row labels (the index).

  • Use axis=1 to apply a method across each row, or to the column labels.

So, referring to your question, df.mean(axis=1) seems to be correctly defined. It takes the average of the entries horizontally across the columns, that is, along each individual row, and returns one value per row. On the other hand, df.mean(axis=0) operates vertically, down the rows, and returns one value per column.
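One way to reconcile the two readings of "over the columns" is to note which labels survive the reduction. The following is a minimal sketch using the DataFrame from the question; the variable names col_means and row_means are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=0: move down the rows, leaving one value per column.
col_means = df.mean(axis=0)

# axis=1: move across the columns, leaving one value per row.
row_means = df.mean(axis=1)

# The result is labelled by whichever axis was NOT collapsed.
print(col_means.index.tolist())  # column names
print(row_means.index.tolist())  # row index labels
```

So axis=1 does iterate "over the columns", but it does so within each row, which is why the result has one entry per row.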

Similarly, df.drop(name, axis=1) acts on the column labels, because they run across the horizontal axis. Specifying axis=0 would make the method act on the row labels instead.
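As a rough illustration of the label lookup, the sketch below drops one column and one row from the question's DataFrame. It also uses the index= and columns= keywords of DataFrame.drop, which are assumed to be available (recent pandas versions accept them) and avoid the axis number altogether.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=1: "col4" is looked up among the column labels.
without_col = df.drop("col4", axis=1)

# axis=0: 0 is looked up among the row labels (the index).
without_row = df.drop(0, axis=0)

# Keyword forms that name the target labels directly.
same_col = df.drop(columns="col4")
same_row = df.drop(index=0)

print(without_col.shape)  # one column fewer
print(without_row.shape)  # one row fewer
```

If the axis numbers are hard to remember, the keyword forms state the intent explicitly.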
