My data looks like this:
04/07/16, 12:51 AM - User1: Hi
04/07/16, 8:19 PM - User2: Here’s a link for you
https://www.abcd.com/folder/1SyuIUCa10tM37lT0F8Y3D
04/07/16, 8:29 PM - User2: Thanks
Using the code below, I am able to read each message as a separate entry, one per line:
data = []
for line in open('/content/drive/My Drive/sample.txt'):
    items = line.rstrip('\r\n').split('\t')   # strip new-line characters and split on column delimiter
    items = [item.strip() for item in items]  # strip extra whitespace off data items
    data.append(items)
However, I do not want to split a message where a newline character is followed by a link. For example, the "Here’s a link for you" line and the URL below it are one single message, but they end up as two entries because of the newline character between them.
Is there a way to avoid splitting when a newline character is followed by http?
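To make the desired behaviour concrete, here is a minimal sketch of one possible approach. It assumes that a continuation line always begins with http:// or https://, and that it should be appended to the previous message separated by a single space. The names `merge_link_lines` and `sample` are hypothetical and only used for illustration; the in-memory list stands in for the lines of the real file.

```python
def merge_link_lines(lines):
    """Join any line that starts with a URL onto the message before it.

    Assumption: a line beginning with http:// or https:// is never a
    message of its own, only the tail of the previous message.
    """
    messages = []
    for raw in lines:
        line = raw.rstrip("\r\n").strip()  # drop newline characters and outer whitespace
        if not line:
            continue  # ignore blank lines
        if line.startswith(("http://", "https://")) and messages:
            messages[-1] += " " + line  # continuation: glue onto the previous message
        else:
            messages.append(line)  # ordinary line: start a new message
    return messages


# Hypothetical in-memory copy of the sample data shown above.
sample = [
    "04/07/16, 12:51 AM - User1: Hi\n",
    "04/07/16, 8:19 PM - User2: Here’s a link for you\n",
    "https://www.abcd.com/folder/1SyuIUCa10tM37lT0F8Y3D\n",
    "04/07/16, 8:29 PM - User2: Thanks\n",
]

messages = merge_link_lines(sample)
for message in messages:
    print(message)
```

With the real file, the same function could be called on the open file object instead of `sample`, since iterating over a file yields its lines. If continuation lines can be something other than links, checking whether a line starts with a timestamp might be a more robust test, but that goes beyond what is asked here.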