Here I have some sheet with column 1:

Question

asked Jul 27, 2019 in Data Science by sourav (17.6k points)

Im new in pandas so would like to ask some help according to excel file. Here I have some sheet with column 1:

Index Column1
1 PF7293
2 NodeB Name=SN5208, LogicRNCID=106
3 KL5083
4 Label=DL7765A3U-2, CellID=28643, LogicRNCID=201

and I wanna create another column2 that should have some word from column1 and look like that:

Index Column2
1 PF7293
2 SN5208
3 KL5083
4 DL7765

in excel we used MID. I would like to do the same using pandas. Thank you!

Question 2

New sheet looks like that:

Column1 Column2
KL7110 BTS works
KS5007 BSS works
KL5066 Planned works
KL5147 Planned works
KL5066 Unplanned work
KL5077 Power work
KL5077 Power work
AN9045 MW work

I wanna delete same value from Column 2 for one value in column1. For example here is 2 KL5077 in column1 with same value in column2 I would Like to delete one of them. And second problem here is 2 KL5066 in Column1 with different value in Column2 and in this case I would like to put values in Column2 together like "Planned work/Unplanned work". Hope I ve explained well))

1 Answer

Shlok Pandey · Answer 1 · 2019-08-01T06:43:42+0000

We will use Series.str.extract.For each subject string in the Series, extract groups from the first match of regular expression

df['Column2'] = df['Column1'].str.extract(r'([A-Z]{2}\d{4})')

Where the regex pattern here can be though of as "2 uppercase letters" followed by "4 digits"

Output:

Index Column1 Column2
0 1 PF7293 PF7293
1 2 NodeB Name=SN5208, LogicRNCID=106 SN5208
2 3 KL5083 KL5083
3 4 Label=DL7765A3U-2, CellID=28643, LogicRNCID=201 DL7765

Answer for Question 2:

1) To drop the duplicate rows use:

df.drop_duplicates(subset=['Column1', 'Column2'], inplace=True)

2) To join the multiple 'Column2' values use:

df_new = df.groupby('Column1')['Column2'].apply('/'.join).reset_index()

Output:

Column1 Column2
0 AN9045 MW work
1 KL5066 Planned works/Unplanned work
2 KL5077 Power work
3 KL5147 Planned works
4 KL7110 BTS works
5 KS5007 BSS works

Here I have some sheet with column 1:

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources