ValueError: Must pass DataFrame with boolean values only

Question

asked Jun 29, 2019 in Data Science by Sammy (47.6k points)
edited Jul 3, 2019 by Sammy

In this data file, the United States is broken up into four regions using the "REGION" column.

Create a query that finds the counties that belong to regions 1 or 2, whose name starts with 'Washington', and whose POPESTIMATE2015 was greater than their POPESTIMATE 2014.

This function should return a 5x2 DataFrame with the columns = ['STNAME', 'CITY NAME'] and the same index ID as the census_df (sorted ascending by index).

You'll find a description of my data in the following picture:-

enter image description here

def answer_eight():
counties=census_df[census_df['SUMLEV']==50]
regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
return grew[grew['STNAME'],grew['COUNTY']]
outcome = answer_eight()
assert outcome.shape == (5,2)
assert list (outcome.columns)== ['STNAME','CTYNAME']
print(tabulate(outcome, headers=["index"]+list(outcome.columns),tablefmt="orgtbl"))

ERROR

ValueError Traceback (most recent call last)
<ipython-input-77-546e58ae1c85> in <module>()
6 return grew[grew['STNAME'],grew['COUNTY']]
7
----> 8 outcome = answer_eight()
9 assert outcome.shape == (5,2)
10 assert list (outcome.columns)== ['STNAME','CTYNAME']
<ipython-input-77-546e58ae1c85> in answer_eight()
1 def answer_eight():
2 counties=census_df[census_df['SUMLEV']==50]
----> 3 regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
4 washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
5 grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
1991 return self._getitem_array(key)
1992 elif isinstance(key, DataFrame):
-> 1993 return self._getitem_frame(key)
1994 elif is_mi_columns:
1995 return self._getitem_multilevel(key)
/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_frame(self, key)
2066 def _getitem_frame(self, key):
2067 if key.values.size and not com.is_bool_dtype(key.values):
-> 2068 raise ValueError('Must pass DataFrame with boolean values only')
2069 return self.where(key)
2070
ValueError: Must pass DataFrame with boolean values only

I am clueless. Where am I going wrong?

Thanks

1 Answer

Shlok Pandey · Answer 1 · 2019-07-02T08:10:40+0000

You should pass only the condition when you compare a series or column in df with a scalar to produce a boolean mask. Do not pass the df multiple times.

def answer_eight(): counties=census_df[census_df['SUMLEV']==50]
regions = counties[(counties['REGION']==1]) | (counties['REGION']==2])]
washingtons = regions[regions['COUNTY'].str.startswith("Washington")]
grew = washingtons[washingtons['POPESTIMATE2015']>washingtons['POPESTIMATES2014']] return grew[['STNAME','COUNTY']]

If you wish to learn more about Data Science, visit the Data Science tutorial and Data Science courses by Intellipaat.

ValueError: Must pass DataFrame with boolean values only

1 Answer

Related questions

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources