
I want to merge several strings in a dataframe based on a groupby in Pandas.

This is my code so far:

import pandas as pd
from io import StringIO

data = StringIO("""
"name1","hej","2014-11-01"
"name1","du","2014-11-02"
"name1","aj","2014-12-01"
"name1","oj","2014-12-02"
"name2","fin","2014-11-01"
"name2","katt","2014-11-02"
"name2","mycket","2014-12-01"
"name2","lite","2014-12-01"
""")

# load string as stream into dataframe
df = pd.read_csv(data, header=0, names=["name", "text", "date"], parse_dates=[2])

# add a column with the month
df["month"] = df["date"].dt.month

I want the end result to look like this:

[image of the desired output]

I don't get how I can use groupby and apply some sort of concatenation of the strings in the column "text". Any help appreciated!

1 Answer


You can group by "name" and "month", apply ','.join to the "text" column of each group, and then call reset_index to turn the group keys back into regular columns:

In [38]: df.groupby(['name', 'month'])['text'].apply(','.join).reset_index()
Out[38]:
    name  month         text
0  name1     11           du
1  name1     12        aj,oj
2  name2     11     fin,katt
3  name2     12  mycket,lite
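Notice that the first group above contains only "du": the "hej" row is missing. That comes from the question's read_csv call, not from the groupby. When header=0 is passed together with names, pandas treats the first non-blank line as a header and replaces it, so the first data row is dropped. Here is a minimal, self-contained sketch that keeps every row, assuming the sample really has no header line. Using header=None, selecting the date column by name, and using agg instead of apply are my own choices here, not part of the original answer:

```python
import pandas as pd
from io import StringIO

# Same sample rows as in the question.
data = StringIO('''"name1","hej","2014-11-01"
"name1","du","2014-11-02"
"name1","aj","2014-12-01"
"name1","oj","2014-12-02"
"name2","fin","2014-11-01"
"name2","katt","2014-11-02"
"name2","mycket","2014-12-01"
"name2","lite","2014-12-01"
''')

# header=None: there is no header line, so every line is read as data.
df = pd.read_csv(
    data,
    header=None,
    names=["name", "text", "date"],
    parse_dates=["date"],
)

# Month number taken from the parsed datetime column.
df["month"] = df["date"].dt.month

# Join the strings of each (name, month) group with commas,
# then move the group keys back into ordinary columns.
result = df.groupby(["name", "month"])["text"].agg(",".join).reset_index()
print(result)
```

With this change the name1 / month 11 row should read "hej,du" instead of just "du". The remaining rows are the same as in the output above.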
