Target

I have a Pandas data frame, as shown below, with multiple columns and would like to get the total of one column, MyColumn.

Data Frame - df:

[image: data frame df]

My attempt:

I have attempted to get the sum of the column using groupby and .sum():

Total = df.groupby['MyColumn'].sum()

print Total

This causes the following error:

TypeError: 'instancemethod' object has no attribute '__getitem__'

Expected Output

I expected the output to be as follows:

319

Or alternatively, I would like df to be edited with a new row entitled TOTAL containing the total:

[image: df with an added TOTAL row]

1 Answer


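First, call sum directly on the column. groupby is a method, so indexing it with square brackets is what raises the TypeError in the question: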

Total = df['MyColumn'].sum()

print (Total)

319
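As a self-contained check of the above, here is a minimal sketch. The data frame is an assumption: it is rebuilt from the printed output further down in this answer, since the original image is not available.

```python
import pandas as pd

# Assumed data: rebuilt from the printed output shown in this answer.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})

# df.groupby is a method, so subscripting it with [] raises TypeError.
try:
    df.groupby["MyColumn"]
    raised = False
except TypeError:
    raised = True

# Selecting the column first and then summing gives the total.
total = df["MyColumn"].sum()
print(raised, total)  # True 319
```

The exact wording of the TypeError differs between Python 2 and Python 3, but the cause is the same in both.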

To add the total as a new row, assign a Series with loc. The index of the Series must contain the name of the column being summed, so that only that column is filled:

df.loc['Total'] = pd.Series(df['MyColumn'].sum(), index = ['MyColumn'])

print (df)

         X  MyColumn      Y      Z
0        A      84.0   13.0   69.0
1        B      76.0   77.0  127.0
2        C      28.0   69.0   16.0
3        D      28.0   28.0   31.0
4        E      19.0   20.0   85.0
5        F      84.0  193.0   70.0
Total  NaN     319.0    NaN    NaN
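If you would rather not modify df in place, one possible alternative (not part of the original answer) is to build a one-row frame and join it with pd.concat. This is only a sketch; the data and the names total_row and with_total are assumptions made for illustration.

```python
import pandas as pd

# Assumed data: rebuilt from the printed output shown in this answer.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})

# One-row frame that holds only the summed column, labelled 'Total'.
total_row = pd.DataFrame({"MyColumn": [df["MyColumn"].sum()]}, index=["Total"])

# concat returns a new frame; columns missing from total_row become NaN.
with_total = pd.concat([df, total_row])
print(with_total)
```

Here df keeps its six original rows and with_total carries the extra Total row.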

A Series is needed because assigning a scalar fills every column of the new row with the same value:

df.loc['Total'] = df['MyColumn'].sum()

print (df)

         X  MyColumn      Y      Z
0        A        84   13.0   69.0
1        B        76   77.0  127.0
2        C        28   69.0   16.0
3        D        28   28.0   31.0
4        E        19   20.0   85.0
5        F        84  193.0   70.0
Total  319       319  319.0  319.0
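The two assignments can be compared side by side on copies of the same frame. This is a small sketch; the data is an assumption rebuilt from the printed output, and scalar_df and series_df are names chosen for illustration.

```python
import pandas as pd

# Assumed data: rebuilt from the printed output shown in this answer.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})
total = df["MyColumn"].sum()

# Scalar: the value is broadcast to every column, including the text column X.
scalar_df = df.copy()
scalar_df.loc["Total"] = total

# Series: values are aligned on column names, so only MyColumn is filled.
series_df = df.copy()
series_df.loc["Total"] = pd.Series(total, index=["MyColumn"])

print(scalar_df.loc["Total"].tolist())
print(series_df.loc["Total"].isna().tolist())
```

The second print shows which cells of the Total row are left as NaN when a Series is used.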
