How can I keep each element in the Customer_ID only the Col2_ID and the Qta with a maximum value of the Qta column and discard all these rest?
I am stuck here:
df1 = df.groupby(["Customer_ID", "Col2_ID"]).Qta.sum()
Customer_ID Col2_ID Qta
0 536544 600
536546 1
536550 1
536553 3
536555 1
...
18283 579673 134
580872 142
18287 554065 488
570715 990
573167 108
After grouping, I have to each customer multiple (Col2_ID, Qta), but to each Customer, I only want specific (Col2_ID, Qta) with the maximum value.
For example, instead of one output given by my program, I want my output to be:
Customer_ID Col2_ID Qta
0 536544 600
...
18283 580872 142
18287 570715 990