I have the following CSV data:

+----------+-------------+-------+---------+
| Category | Part Number | Units |  Cost   |
+----------+-------------+-------+---------+
| Axel     |          78 |   587 | $159.95 |
| Rim      |          48 |   234 | $38.75  |
| Nut      |          39 |  1234 | $0.15   |
| Axel     |          79 |    67 | $110.95 |
+----------+-------------+-------+---------+

And the following code:

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
df = pd.read_csv('stock.csv', engine="python")

# Sum of values by category
df.groupby('Category').sum()['Units']
df.groupby('Category').sum()['Cost']

When I run the second to last line, I get the following output:

df.groupby('Category').sum()['Units']
Out[4]:
Category
Axel     654
Nut     1234
Rim      234
Name: Units, dtype: int64

When I run the last line, I get the following error:

KeyError: 'Cost'

I'm not sure if there is a simple way to sum the data without converting the data type to an integer and then converting it back.


1 Answer


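Here, .sum() ignores all non-numeric columns. Every Cost value starts with "$", so pandas reads that column as text; it is left out of the summed result, and looking up ['Cost'] raises the KeyError. So you need to convert Cost to numbers first.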

See the code below:

df["Cost"] = df["Cost"].str[1:].astype(float)
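To see the whole fix end to end, here is a minimal sketch. It assumes stock.csv is a plain comma-separated file with the four columns shown in the question; the inline csv_text string and the cost_by_category name are stand-ins made up for this example, not part of the original code. Selecting the column before calling .sum() also avoids depending on how the installed pandas version treats non-numeric columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for stock.csv (assumed to be plain comma-separated text)
csv_text = """Category,Part Number,Units,Cost
Axel,78,587,$159.95
Rim,48,234,$38.75
Nut,39,1234,$0.15
Axel,79,67,$110.95
"""

df = pd.read_csv(io.StringIO(csv_text))

# Before the conversion, Cost is a text column, not a numeric one
print(df.dtypes)

# Drop the leading "$" and convert the rest of each value to float
df["Cost"] = df["Cost"].str[1:].astype(float)

# Pick the column first, then aggregate, so only Cost is summed
cost_by_category = df.groupby("Category")["Cost"].sum()
units_by_category = df.groupby("Category")["Units"].sum()

print(cost_by_category)
print(units_by_category)
```

With the four rows above, Axel should come out to about 270.90, Nut to 0.15, and Rim to 38.75. Note that .str[1:] simply drops the first character, so it only works if every value really begins with "$" and has no thousands separators.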
