
I have a Series that looks like the following:

   col
0  B
1  B
2  A
3  A
4  A
5  B

It's a time series, therefore the index is ordered by time.

For each row, I'd like to count how many times the value has appeared consecutively, i.e.:

Output:

   col count
0  B   1
1  B   2
2  A   1 # Value does not match the previous row => reset counter to 1
3  A   2
4  A   3
5  B   1 # Value does not match the previous row => reset counter to 1

I can't figure out how to "write" that information as a new column in the DataFrame, for each row (as above). Using rolling_apply does not work well.

1 Answer


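Group by runs of consecutive equal values rather than by the value itself. A plain df.groupby('col').cumcount() counts every earlier occurrence of a value across the whole frame, so the final B in the example would continue from the first two B rows instead of resetting. Build a run identifier that increments whenever the value changes, then apply cumcount within each run:

df['count'] = df.groupby((df['col'] != df['col'].shift()).cumsum()).cumcount()

This starts the counts at 0. If you want the counts to begin at 1, as in the desired output, add 1:

df['count'] = df.groupby((df['col'] != df['col'].shift()).cumsum()).cumcount() + 1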
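For a version that can be run end to end, here is a minimal sketch on the sample data from the question. It assumes the column is named col, as in the question; the intermediate names changed and run_id are illustrative only. The counter resets because rows are grouped on a run identifier built with shift and cumsum, not on the value itself.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"col": ["B", "B", "A", "A", "A", "B"]})

# True wherever the value differs from the previous row
# (the first row compares against NaN, so it is always True)
changed = df["col"] != df["col"].shift()

# Running total of the changes gives each consecutive run its own id
run_id = changed.cumsum()

# Number the rows inside each run, starting at 1
df["count"] = df.groupby(run_id).cumcount() + 1

print(df)
```

On the sample data the count column comes out as 1, 2, 1, 2, 3, 1, which matches the desired output in the question.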
