in Data Science by (50.2k points)

I want to merge several strings in a dataframe based on a groupby in Pandas.

This is my code so far:

import pandas as pd
from io import StringIO

data = StringIO("""
"name1","hej","2014-11-01"
"name1","du","2014-11-02"
"name1","aj","2014-12-01"
"name1","oj","2014-12-02"
"name2","fin","2014-11-01"
"name2","katt","2014-11-02"
"name2","mycket","2014-12-01"
"name2","lite","2014-12-01"
""")

# load string as stream into dataframe
# note: with header=0 and names=..., pandas treats the first non-blank line
# as a header row and replaces it, so the "hej" row is not loaded
# (header=None would keep it)
df = pd.read_csv(data, header=0, names=["name", "text", "date"], parse_dates=[2])

# add a column with the month
df["month"] = df["date"].dt.month

I want the end result to look like this:

[image of the desired result]

I don't get how I can use groupby and apply some sort of concatenation of the strings in the column "text". Any help appreciated!

1 Answer

by (108k points)

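You can group by name and month, apply ','.join to the "text" column of each group, and then call reset_index to turn the group keys back into ordinary columns. Refer to the following code: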

In [38]:
df.groupby(['name','month'])['text'].apply(','.join).reset_index()

Out[38]:
    name  month         text
0  name1     11           du
1  name1     12        aj,oj
2  name2     11     fin,katt
3  name2     12  mycket,lite
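
Note that "hej" is missing from the first row of the output above: the question's read_csv call passes header=0 together with names, so pandas consumes the first data line as a header. Below is a minimal, self-contained sketch of the same groupby-and-join idea that makes this visible. The function name join_text_by_month and its header parameter are hypothetical, introduced only for illustration, and agg is used here as one possible alternative to apply.

```python
from io import StringIO

import pandas as pd

# Same sample rows as in the question, without a header line.
CSV_TEXT = """\
"name1","hej","2014-11-01"
"name1","du","2014-11-02"
"name1","aj","2014-12-01"
"name1","oj","2014-12-02"
"name2","fin","2014-11-01"
"name2","katt","2014-11-02"
"name2","mycket","2014-12-01"
"name2","lite","2014-12-01"
"""


def join_text_by_month(csv_text, header=None):
    """Hypothetical helper: join the 'text' values per (name, month).

    header=None tells pandas the data has no header row, so every line
    is kept; header=0 reproduces the question's call, where the first
    line is treated as a header and replaced by `names`.
    """
    df = pd.read_csv(
        StringIO(csv_text),
        header=header,
        names=["name", "text", "date"],
        parse_dates=["date"],
    )
    # Derive the month from the parsed datetime column.
    df["month"] = df["date"].dt.month
    # Concatenate the strings of each group with a comma, then turn the
    # (name, month) index levels back into ordinary columns.
    return df.groupby(["name", "month"])["text"].agg(",".join).reset_index()


if __name__ == "__main__":
    print(join_text_by_month(CSV_TEXT))            # keeps the "hej" row
    print(join_text_by_month(CSV_TEXT, header=0))  # drops the "hej" row
```

If the real file does have a header line, keeping header=0 is the right choice; header=None is only appropriate for headerless data like the sample above.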
