Python - RegEx for splitting text into sentences.

Question

asked Jul 29, 2019 in Python by Rajesh Malhotra (19.9k points)

I want to build a list of sentences from a string and then print them out, without using NLTK. The split should happen on a period at the end of a sentence, but not on decimals, abbreviations, titles in a name, or a .com inside the sentence. This is my attempt at a regex, which doesn't work:

import re
text = """\
Mr. Smith bought cheapsite.com for 1.5 million dollars, i.e. he paid a lot for it. Did he mind? Adam Jones Jr. thinks he didn't. In any case, this isn't true... Well, with a probability of .9 it isn't.
"""
sentences = re.split(r' *[\.\?!][\'"\)\]]* *', text)
for stuff in sentences:
    print(stuff)

Example of what the output should look like:

Mr. Smith bought cheapsite.com for 1.5 million dollars, i.e. he paid a lot for it.
Did he mind?
Adam Jones Jr. thinks he didn't.
In any case, this isn't true...
Well, with a probability of .9 it isn't.

1 Answer

Anirudh Singh · Answer 1 · 2019-07-29T06:51:32+0000

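You can split on the following regex. It matches the whitespace that follows a period or question mark, but not when that whitespace comes after an abbreviation such as "i.e." or a two-letter title such as "Mr." or "Jr.":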

(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s
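
As a minimal sketch of how this pattern could be applied to the text from the question, the snippet below passes it to re.split. The name SENTENCE_END and the call to strip() are my own additions for illustration, not part of the original answer; strip() is there so the trailing newline does not produce an empty last item.

```python
import re

# Pattern from the answer: split on whitespace that follows "." or "?",
# unless it comes right after something like "i.e." or a title like "Mr.".
# SENTENCE_END is an illustrative name, not from the original answer.
SENTENCE_END = re.compile(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s')

text = """\
Mr. Smith bought cheapsite.com for 1.5 million dollars, i.e. he paid a lot for it. Did he mind? Adam Jones Jr. thinks he didn't. In any case, this isn't true... Well, with a probability of .9 it isn't.
"""

# strip() drops the trailing newline, which would otherwise be matched
# as a split point and leave an empty string at the end of the list.
sentences = SENTENCE_END.split(text.strip())

for sentence in sentences:
    print(sentence)
```

On the sample text this prints the five sentences listed in the question. As written, the pattern does not treat "!" as a sentence end, and it only recognizes titles of one capital plus one lowercase letter, so it would still split after something like "Mrs." or "Prof."; those cases would need extra lookbehinds.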
