
I'm fairly new to pandas, so bear with me...

I have a huge CSV containing many tables, each with many rows. I would like to split each DataFrame into two if it contains more than 10 rows.

If it does, I would like the first DataFrame to contain the first 10 rows and the second DataFrame to contain the rest.

Is there a convenient function for this? I've looked around but found nothing useful...

Something like split_dataframe(df, 2(if > 10))?

1 Answer


This returns the two split DataFrames if the condition is met; otherwise it returns the original DataFrame and None (which you would then need to handle separately). Note that it assumes each df only needs to be split once, and that it is OK for the second part to be longer than 10 rows (which happens whenever the original has more than 20 rows).

df_new1, df_new2 = (df.iloc[:10], df.iloc[10:]) if len(df) > 10 else (df, None)
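Wrapped in a small helper, the same idea might look like the sketch below. The name split_dataframe comes from the question; the max_rows parameter and the sample data are assumptions made for illustration only.

```python
import pandas as pd


def split_dataframe(df, max_rows=10):
    """Return (first part, rest) if df has more than max_rows rows, else (df, None)."""
    if len(df) > max_rows:
        # iloc slices by position, so this works whatever the index labels are
        return df.iloc[:max_rows], df.iloc[max_rows:]
    return df, None


# Hypothetical sample data: 25 rows, so the second part ends up with 15 rows
df = pd.DataFrame({"value": range(25)})
first, rest = split_dataframe(df)
print(len(first), len(rest))

# A short DataFrame comes back unchanged, with None as the second element
small = pd.DataFrame({"value": range(4)})
first_small, rest_small = split_dataframe(small)
print(len(first_small), rest_small)
```

The parentheses around each tuple in the one-liner matter: without them, Python applies the conditional expression to the middle element only and builds a three-element tuple, which cannot be unpacked into two names.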

You can also use df.head(10) and df.tail(len(df) - 10) to get the front and back parts. Several indexing approaches give the same result: a plain slice on the first dimension, df[:10], is equivalent to df.iloc[:10, :]. df.iloc indexes by position and df.loc by label. df.ix worked in a comparable way in older pandas, but it was deprecated in version 0.20 and removed in 1.0, so avoid it in new code.
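As a quick check that these approaches agree, here is a minimal sketch. The 15-row sample DataFrame and its letter index are made up for the example.

```python
import pandas as pd

# Hypothetical sample data: 15 rows with a non-numeric index
df = pd.DataFrame({"value": range(15)}, index=list("abcdefghijklmno"))

# Three ways to take the first 10 rows
front_head = df.head(10)
front_slice = df[:10]
front_iloc = df.iloc[:10]

# Three ways to take everything after the first 10 rows
back_tail = df.tail(len(df) - 10)
back_slice = df[10:]
back_iloc = df.iloc[10:]

fronts_match = front_head.equals(front_slice) and front_slice.equals(front_iloc)
backs_match = back_tail.equals(back_slice) and back_slice.equals(back_iloc)
print(fronts_match, backs_match)
```

Of the three, df.iloc is the most explicit about slicing by position, which is why it is often preferred when the index labels are not simple integers.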

