0 votes
in Python by (19.9k points)

I cannot figure out how to reshape a huge DataFrame whose column names combine a variable name with a latitude and a longitude, like this:

            Var1_(lat1, len1) Var2_(lat1, len1)
date1 date2
d1    d5                   v1                v5
d2    d6                   v2                v6
d3    d7                   v3                v7
d4    d8                   v4                v8

and reshape this as

                      Var1 Var2
date1 date2 lat  len
d1    d5    lat1 len1   v1   v5
d2    d6    lat1 len1   v2   v6
d3    d7    lat1 len1   v3   v7
d4    d8    lat1 len1   v4   v8

to have those variables indexed by the lat and len values too.

Of course this is a small example, but I'm looking for something that works for more variables (the name always comes before the '_') and more latitudes and longitudes (always between parentheses and separated by a comma).

1 Answer

0 votes
by (25.1k points)

First remove the parentheses from the column names, then use Series.str.split on _ or ", " with expand=True to turn the columns into a MultiIndex. After that you can reshape with DataFrame.stack and finally set the index level names with DataFrame.rename_axis:

df.columns = df.columns.str.replace(r'\(|\)', '', regex=True).str.split('_|, ', expand=True)

df = df.stack(level=[1,2]).rename_axis(('date1','date2','lat','len'))
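
To try the two lines above end to end, here is a minimal sketch. The sample frame is an assumption: it is rebuilt from the small table in the question, with date1 and date2 as a two-level row index and placeholder strings as values. The name `out` is also just illustrative.

```python
import pandas as pd

# Assumed sample data, reconstructed from the question's small example.
index = pd.MultiIndex.from_arrays(
    [["d1", "d2", "d3", "d4"], ["d5", "d6", "d7", "d8"]],
    names=["date1", "date2"],
)
df = pd.DataFrame(
    {
        "Var1_(lat1, len1)": ["v1", "v2", "v3", "v4"],
        "Var2_(lat1, len1)": ["v5", "v6", "v7", "v8"],
    },
    index=index,
)

# Drop the parentheses, then split each name on "_" or ", ".
# expand=True turns the flat column Index into a 3-level MultiIndex:
# (variable, lat, len).
df.columns = (
    df.columns.str.replace(r"\(|\)", "", regex=True)
    .str.split("_|, ", expand=True)
)

# Move the lat and len column levels (1 and 2) into the row index,
# then name all four index levels.
out = df.stack(level=[1, 2]).rename_axis(["date1", "date2", "lat", "len"])

print(out)
```

Passing regex=True to str.replace matters on recent pandas versions, where the pattern is otherwise treated as a literal string and the parentheses would stay in the names. Depending on the pandas version, stack may also emit a FutureWarning about its new implementation; the reshaped result for this example is the same either way.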
