in Data Science by (50.2k points)

I have a Series that looks like the following:

   col
0  B
1  B
2  A
3  A
4  A
5  B

It's a time series, therefore the index is ordered by time.

For each row, I'd like to count how many times the value has appeared consecutively, i.e.:

Output:

   col count
0  B   1
1  B   2
2  A   1 # Value does not match the previous row => reset counter to 1
3  A   2
4  A   3
5  B   1 # Value does not match the previous row => reset counter to 1

I can't figure out how to "write" that information as a new column in the DataFrame, for each row (as above). Using rolling_apply does not work well.

1 Answer

by (108k points)

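Grouping by 'col' alone counts every earlier occurrence of a value, not just the consecutive ones, so the last row in the example would get 3 instead of 1. Instead, group by a run identifier that increases each time the value changes:

df['count'] = df.groupby((df['col'] != df['col'].shift()).cumsum()).cumcount()

cumcount() starts at 0, so add 1 if you want the counts to begin at 1, as in the desired output:

df['count'] = df.groupby((df['col'] != df['col'].shift()).cumsum()).cumcount() + 1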
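
For reference, here is a minimal, self-contained sketch that reproduces the sample data from the question and builds the consecutive counter step by step. The intermediate names starts_new_run and run_id are only illustrative, and the sketch assumes that a new run begins whenever a value differs from the row directly before it.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"col": ["B", "B", "A", "A", "A", "B"]})

# True wherever the value differs from the previous row.
# The first row is compared against NaN, so it also counts as a run start.
starts_new_run = df["col"] != df["col"].shift()

# A running total of run starts gives each consecutive block its own id.
run_id = starts_new_run.cumsum()

# Position inside each run; cumcount() is zero-based, so add 1.
df["count"] = df.groupby(run_id).cumcount() + 1

print(df)
```

On the sample data this should give the counts 1, 2, 1, 2, 3, 1, matching the desired output in the question.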
