
I have this data and I want to cross-tabulate GDP level (above average vs. below average) against level of alcohol consumption (above average vs. below average), and then find the correlation between the two.

data

I'm trying this, but it does not give me what I want:

pd.crosstab(df['GDP'],df['Recorded_Consupmtion'], margins=True)

1 Answer


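Crossing the raw columns gives one row and one column per distinct value. Instead, first label each row as above or below the mean of its column, then cross-tabulate the labels:

import numpy as np
import pandas as pd

# Values exactly equal to the mean are labelled 'Above Average'
df['GDP_Avg'] = np.where(df.GDP < df.GDP.mean(),'Below Average','Above Average')

df['RC_Avg'] = np.where(df.Recorded_Consupmtion < df.Recorded_Consupmtion.mean(),'Below Average','Above Average')

pd.crosstab(df['GDP_Avg'],df['RC_Avg'], margins=True)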

Output:

RC_Avg         Above Average  Below Average  All
GDP_Avg                                         
Above Average              5              0    5
Below Average              1              3    4
All                        6              3    9
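
The question also asks for the correlation, which the cross-tabulation alone does not give. Here is a minimal sketch of one way to get it. The six rows of data are made up purely for illustration (they are not the asker's dataset), and using the phi coefficient for the two above/below labels is my assumption about which "correlation" is wanted. `Series.corr` computes the Pearson correlation by default, and the phi coefficient of a 2x2 table is the Pearson correlation of the two 0/1 indicator columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only; replace with your own DataFrame.
df = pd.DataFrame({
    "GDP": [1000, 2000, 3000, 4000, 5000, 6000],
    "Recorded_Consupmtion": [1.0, 5.5, 2.0, 6.0, 3.5, 8.0],
})

# Label each row relative to the mean of its column (same idea as above).
df["GDP_Avg"] = np.where(df["GDP"] < df["GDP"].mean(),
                         "Below Average", "Above Average")
df["RC_Avg"] = np.where(
    df["Recorded_Consupmtion"] < df["Recorded_Consupmtion"].mean(),
    "Below Average", "Above Average",
)

# 2x2 cross-tabulation with row and column totals.
table = pd.crosstab(df["GDP_Avg"], df["RC_Avg"], margins=True)

# Pearson correlation between the raw numeric columns.
pearson_r = df["GDP"].corr(df["Recorded_Consupmtion"])

# Phi coefficient: Pearson correlation of the two 0/1 "above average" flags.
above_gdp = (df["GDP_Avg"] == "Above Average").astype(int)
above_rc = (df["RC_Avg"] == "Above Average").astype(int)
phi = above_gdp.corr(above_rc)

print(table)
print("Pearson r (raw values):", pearson_r)
print("Phi (above/below labels):", phi)
```

The correlation on the raw values uses all the information in the columns, while phi only measures how strongly the above/below labels go together, so the two numbers will generally differ.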

