Slice all rows of a DataFrame past a certain value in a column

Question

asked Jul 10, 2019 in Data Science by sourav (17.6k points)

I am trying to find a more pandorable way to get all rows of a DataFrame past a certain value in the a certain column (the Quarter column in this case).

I want to slice a DataFrame of GDP statistics to get all rows past the first quarter of 2000 (2000q1). Currently, I'm doing this by getting the index number of the value in the GDP_df["Quarter"] column that equals 2000q1 (see below). This seems way too convoluted and there must be an easier, simpler, more idiomatic way to achieve this. Any ideas?

Current Method:

def get_GDP_df():
GDP_df = pd.read_excel(
"gdplev.xls",
names=["Quarter", "GDP in 2009 dollars"],
parse_cols = "E,G", skiprows = 7)
year_2000 = GDP_df.index[GDP_df["Quarter"] == '2000q1'].tolist()[0]
GDP_df["Growth"] = (GDP_df["GDP in 2009 dollars"]
.pct_change()
.apply(lambda x: f"{round((x * 100), 2)}%"))
GDP_df = GDP_df[year_2000:]
return GDP_df

Output:

Also, after the DataFrame has been sliced, the indices now start at 212. Is there a method to renumber the indices so they start at 0 or 1?

1 Answer

Shlok Pandey · Answer 1 · 2019-07-15T05:50:52+0000

Use the following code for more simplicity:

year_2000 = (GDP_df["Quarter"] == '2000q1').idxmax()
GDP_df["Growth"] = (GDP_df["GDP in 2009 dollars"]
.pct_change()
.mul(100)
.round(2)
.apply(lambda x: f"{x}%"))
return GDP_df.loc[year_2000:]

Slice all rows of a DataFrame past a certain value in a column

1 Answer

Related questions

Browse Categories