0 votes
in AI and Deep Learning by (50.2k points)

The simple question again: is it better to use n-grams (unigrams, bigrams, etc.) as simple binary features, or to use their TF-IDF scores, in ML models such as Support Vector Machines for NLP tasks such as sentiment analysis or text categorization/classification?

1 Answer

0 votes
by (108k points)

Technically, TF-IDF weighting draws on global, corpus-wide information (how common or rare a term is across all of your texts), while binary n-gram features capture only local information (which word sequences occur in a given text). If you test whether one works better than the other on your data, you can conclude whether global or local cues improve sentiment analysis significantly. In the categorization of short chat sentences, it was found that using IDF slightly improves performance over binary features. The improvement becomes smaller as the training set becomes larger.
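To make the difference between the two feature types concrete, here is a minimal sketch in plain Python. The function names (`ngrams`, `binary_features`, `tfidf_features`), the toy documents, and the exact weighting (raw term frequency times `log(N / df)`) are assumptions chosen for illustration; libraries use several smoothed variants of this formula.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def binary_features(docs, n_max=2):
    """Local cue only: 1.0 if the n-gram occurs in the document."""
    return [{g: 1.0 for g in set(ngrams(d.lower().split(), n_max))}
            for d in docs]

def tfidf_features(docs, n_max=2):
    """Term frequency scaled by idf = log(N / df), one common variant."""
    grams = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: in how many documents does each n-gram appear?
    df = Counter(g for doc in grams for g in set(doc))
    n_docs = len(docs)
    return [{g: tf * math.log(n_docs / df[g])
             for g, tf in Counter(doc).items()}
            for doc in grams]

# Hypothetical toy corpus, for illustration only.
docs = ["good movie", "bad movie", "good good plot"]
binary = binary_features(docs)
tfidf = tfidf_features(docs)

for b, t in zip(binary, tfidf):
    print(sorted(b.items()))
    print(sorted(t.items()))
```

In the binary representation every n-gram that is present gets the same value, whereas the TF-IDF representation gives rarer n-grams (here "plot") more weight than n-grams shared by several documents (here "movie"). Either dictionary-per-document output could then be vectorized and fed to an SVM to compare the two on your own data.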

