2 views

I am trying to find a more pandorable way to get all rows of a DataFrame past a certain value in the a certain column (the `Quarter` column in this case).

I want to slice a DataFrame of GDP statistics to get all rows past the first quarter of 2000 (`2000q1`). Currently, I'm doing this by getting the index number of the value in the `GDP_df["Quarter"]` column that equals `2000q1` (see below). This seems way too convoluted and there must be an easier, simpler, more idiomatic way to achieve this. Any ideas?

Current Method:

def get_GDP_df():

"gdplev.xls",

names=["Quarter", "GDP in 2009 dollars"],

parse_cols = "E,G", skiprows = 7)

year_2000 = GDP_df.index[GDP_df["Quarter"] == '2000q1'].tolist()[0]

GDP_df["Growth"] = (GDP_df["GDP in 2009 dollars"]

.pct_change()

.apply(lambda x: f"{round((x * 100), 2)}%"))

GDP_df = GDP_df[year_2000:]

return GDP_df

Output:

Also, after the DataFrame has been sliced, the indices now start at 212. Is there a method to renumber the indices so they start at 0 or 1?

by (41.4k points)

Use the following code for more simplicity:

year_2000 = (GDP_df["Quarter"] == '2000q1').idxmax()

GDP_df["Growth"] = (GDP_df["GDP in 2009 dollars"]

.pct_change()

.mul(100)

.round(2)

.apply(lambda x: f"{x}%"))

return GDP_df.loc[year_2000:]