in Python by (14.5k points)

I have the following DataFrame, df:

pct    id
0.3    631
0.2    115
0.1    312
0.2    581
0.01   574
0.09   586

I first want to sort df by pct in descending order:

df.sort_values(by=['pct'], ascending=False, inplace=True)

Then I want to add up pct down the sorted rows until the running total reaches 0.8, and count how many rows that takes (the top 4 rows in this case). What is the best way to do this? Should I use pd.eval or df.query?

1 Answer

by (72.4k points)

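Here is Python code that returns the leading rows whose cumulative sum of pct stays within a threshold. It assumes df is already sorted by pct in descending order; le (<=) keeps the row at which the running total reaches the threshold:

threshold = 0.8
df1 = df[df['pct'].cumsum().le(threshold)]
print(df1)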

output:

   pct   id
0  0.3  631
1  0.2  115
2  0.2  581
3  0.1  312
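
To get the row count the question asks for, take the length of the filtered frame. Neither pd.eval nor df.query is needed, since a boolean mask on the cumulative sum is enough. The following is a minimal, self-contained sketch rather than a definitive implementation. The names ranked, running, top, n_rows and the tolerance eps are illustrative assumptions, not part of the original answer. The tolerance is there because floating-point running totals rarely land exactly on the threshold (here 0.7 + 0.1 evaluates to 0.7999999999999999). The sketch also assumes pct is non-negative, so the running total never decreases.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "pct": [0.3, 0.2, 0.1, 0.2, 0.01, 0.09],
    "id": [631, 115, 312, 581, 574, 586],
})

threshold = 0.8
eps = 1e-9  # assumed tolerance for floating-point rounding; adjust as needed

# Sort descending with a stable sort so tied pct values keep their
# original relative order, and renumber the index from 0.
ranked = df.sort_values(
    "pct", ascending=False, kind="mergesort", ignore_index=True
)

# Running total of pct down the sorted rows.
running = ranked["pct"].cumsum()

# Keep the rows whose running total has not passed the threshold.
top = ranked[running <= threshold + eps]
n_rows = len(top)

print(top)
print(n_rows)
```

With the sample data, n_rows is 4, matching the "top 4 rows" expected in the question. If the threshold should be exclusive, compare with < threshold - eps instead.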
