
I am trying to use the NLTK toolkit to extract place, date, and time from text messages. I just installed the toolkit on my machine and wrote this quick snippet to test it out:

import nltk

sentence = "Let's meet tomorrow at 9 pm"

# Tokenize, POS-tag, then run the default named-entity chunker
tokens = nltk.word_tokenize(sentence)
pos_tags = nltk.pos_tag(tokens)
print(nltk.ne_chunk(pos_tags, binary=True))

I assumed it would identify the date (tomorrow) and the time (9 pm), but it recognized neither. This is the result I get when I run the code above:

(S (GPE Let/NNP) 's/POS meet/NN tomorrow/NN at/IN 9/CD pm/NN)

Can someone help me understand whether I am missing something, or whether NLTK is just not mature enough to tag times and dates properly? Thanks!

1 Answer


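NLTK's default NE chunker is a maximum entropy chunker trained on the ACE corpus. It is geared toward entities such as people, organizations, and locations, so it is generally not reliable for dates and times, which is why "tomorrow" and "9 pm" come back untagged in your output.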

For more detail on the chunker, see: http://mattshomepage.com/articles/2016/May/23/nltk_nec/.

There is also a timex module in nltk_contrib that might help solve your problem.
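To give a rough idea of what rule-based temporal tagging looks like, here is a minimal sketch using only Python's standard `re` module. It is not the timex module's API or rule set: the patterns and the `find_temporal_expressions` helper are hypothetical, written just for this example, and they cover only a few relative-day words and simple clock times.

```python
import re

# Hypothetical, hand-written patterns for illustration only;
# these are NOT the rules used by nltk_contrib's timex module.
DATE_WORDS = r"\b(?:today|tomorrow|yesterday)\b"
CLOCK_TIME = r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b"

# One alternation with named groups, so each match reports its own label.
TEMPORAL = re.compile(
    rf"(?P<DATE>{DATE_WORDS})|(?P<TIME>{CLOCK_TIME})",
    re.IGNORECASE,
)


def find_temporal_expressions(text):
    """Return (label, matched_text) pairs in order of appearance."""
    return [(m.lastgroup, m.group()) for m in TEMPORAL.finditer(text)]


print(find_temporal_expressions("Let's meet tomorrow at 9 pm"))
# [('DATE', 'tomorrow'), ('TIME', '9 pm')]
```

A handful of regular expressions like these will miss many phrasings ("next Friday", "half past nine"), so for real messages you would likely need a much larger rule set or a classifier trained on data with date and time annotations.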

Hope this answer helps.
