
Here is my code to generate a dataframe:

import numpy as np
import pandas as pd

dff = pd.DataFrame(np.random.randn(1, 2), columns=list('AB'))

Then I got the dataframe:

+------------+----------+---------+
|            | A        | B       |
+------------+----------+---------+
| 0          | 0.626386 | 1.52325 |
+------------+----------+---------+

When I run the command:

dff.mean(axis=1)

I get:

0 1.074821

dtype: float64

According to the pandas reference, axis=1 stands for columns, so I expected the result of the command to be:

A 0.626386

B 1.523255

dtype: float64

So here is my question: what does axis in pandas mean?

by (106k points)

The axis argument specifies the axis along which the means are computed; the default is axis=0. This matches numpy.mean when axis is given explicitly (in numpy.mean the default is axis=None, which computes the mean over the flattened array): axis=0 runs along the rows (the index, in pandas terms) and axis=1 runs along the columns. So dff.mean(axis=1) moves across columns A and B within each row and returns one value per row, which is why you got a single value for row 0. For added clarity, you can write axis='index' instead of axis=0, or axis='columns' instead of axis=1.

+------------+----------+---------+
|            | A        | B       |
+------------+----------+---------+
| 0          | 0.626386 | 1.52325 | ----axis=1---->
+------------+----------+---------+
             |          |
             |  axis=0  |
             ↓          ↓
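To make the two directions concrete, here is a small sketch. It uses the values from the question as fixed numbers instead of np.random.randn (an assumption made only so the result is reproducible), and the names col_means and row_means are just illustrative.

```python
import pandas as pd

# Fixed values taken from the question, so the output does not change between runs.
dff = pd.DataFrame([[0.626386, 1.523255]], columns=list("AB"))

col_means = dff.mean(axis=0)  # runs down the rows: one mean per column (A, B)
row_means = dff.mean(axis=1)  # runs across the columns: one mean per row (index 0)

# The string aliases select the same axes as the integers.
assert col_means.equals(dff.mean(axis="index"))
assert row_means.equals(dff.mean(axis="columns"))

print(col_means)  # indexed by A and B
print(row_means)  # indexed by 0
```

With only one row, axis=0 just returns the original values per column, which is the output the question expected from axis=1.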

by (108k points)

Let me explain this in layman's terms:

• Axis 0 will work on all the ROWS in each COLUMN
• Axis 1 will work on all the COLUMNS in each ROW

So a mean on axis 0 will be the average of all the rows in each column, and a mean on axis 1 will be the average of all the columns in each row.
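The rule is easier to see on a frame with more than one row. The sketch below uses a made-up 3x2 frame (hypothetical values, chosen so the means can be checked by hand) and also compares against numpy on the underlying array.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 frame; the values are only for illustration.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})

per_column = df.mean(axis=0)  # averages the 3 rows within each column -> 2 values
per_row = df.mean(axis=1)     # averages the 2 columns within each row -> 3 values

# numpy uses the same axis convention on the underlying array.
arr = df.to_numpy()
assert np.allclose(per_column.to_numpy(), arr.mean(axis=0))
assert np.allclose(per_row.to_numpy(), arr.mean(axis=1))

print(per_column.tolist())  # [2.0, 20.0]
print(per_row.tolist())     # [5.5, 11.0, 16.5]
```

The length of each result tells you which axis was collapsed: axis=0 leaves one value per column, axis=1 leaves one value per row.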
