List the words in a vocabulary according to occurrence in a text corpus, Scikit-Learn

Question

1 Answer

Anurag · Answer 1 · 2019-08-08T13:03:58+0000

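If cv is your fitted CountVectorizer and X is the vectorized corpus (the result of cv.fit_transform), then

import numpy as np
list(zip(cv.get_feature_names_out(),
         np.asarray(X.sum(axis=0)).ravel()))

returns a list of (term, frequency) pairs, one for each distinct term that the CountVectorizer extracted from the corpus. X.sum(axis=0) adds up each column of the document-term matrix, so each frequency is the term's total count across all documents. On scikit-learn versions older than 1.0, call cv.get_feature_names() instead; that method was removed in 1.2.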
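To list the vocabulary in order of occurrence, the pairs can then be sorted by count. The following is a minimal sketch, not a definitive implementation: the toy corpus and the names terms, counts and term_counts are made up for illustration, and it assumes the default CountVectorizer settings (lowercasing, tokens of two or more characters, no stop-word removal).

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]

cv = CountVectorizer()
X = cv.fit_transform(corpus)  # sparse document-term matrix

# get_feature_names_out() exists in scikit-learn >= 1.0; older releases
# only provide get_feature_names().
if hasattr(cv, "get_feature_names_out"):
    terms = cv.get_feature_names_out()
else:
    terms = cv.get_feature_names()

# Column sums give the total count of each term across the corpus.
counts = np.asarray(X.sum(axis=0)).ravel()

# Sort by descending count, breaking ties alphabetically.
term_counts = sorted(
    ((str(t), int(c)) for t, c in zip(terms, counts)),
    key=lambda pair: (-pair[1], pair[0]),
)

for term, count in term_counts:
    print(term, count)
```

Sorting on (-count, term) keeps the order deterministic when several terms share the same frequency. To count the number of documents a term appears in rather than its total occurrences, one option is to pass binary=True to CountVectorizer before summing.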
