A word cloud, also known as a tag cloud, is a visual representation of text data, typically used to depict keyword metadata on websites or to visualize free-form text. Tags are usually single words, and the importance of each tag is shown by its font size and color. This format is useful for quickly perceiving the most prominent terms and for locating a term alphabetically to determine its importance.
Since your data is stored in MongoDB and you are using Python, I assume you have already installed a Python driver and connected to MongoDB.
Most of the work is done, but for better handling of your word cloud you do not have to load all the files into memory at once. You can process the text piece by piece and keep only a running word count:
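    from wordcloud import WordCloud
    from collections import Counter

    wordc = WordCloud()
    counts_all = Counter()

    with open('path/to/file.txt', 'r') as f:
        for line in f:  # Here you can also use the Cursor
            counts_line = wordc.process_text(line)  # word -> count for this line
            counts_all.update(counts_line)  # add this line's counts to the total

    # Build the cloud from the accumulated frequencies
    wordc.generate_from_frequencies(counts_all)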
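To make the "use the Cursor" comment concrete, here is a rough sketch of the same loop reading from MongoDB instead of a file. The connection URI, database name, collection name, the `text` field, and the output filename are all assumptions made up for illustration, so replace them with your own. The helper names `count_words` and `cloud_from_mongo` are hypothetical as well. The default tokenizer is a plain regex stand-in so the counting part works without extra packages; `cloud_from_mongo` passes `WordCloud.process_text` instead, as in the file-based loop.

```python
import re
from collections import Counter


def simple_counts(text):
    """Stand-in tokenizer (an assumption): lowercase words, no stopword removal."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def count_words(docs, field="text", process=simple_counts):
    """Accumulate word counts from any iterable of dict-like documents.

    `docs` can be a pymongo cursor, which fetches documents in batches,
    so the whole collection is never held in memory at once.
    """
    counts_all = Counter()
    for doc in docs:
        text = doc.get(field) or ""  # skip documents without the field
        counts_all.update(process(text))
    return counts_all


def cloud_from_mongo(uri, db_name, coll_name, field="text", out_path="wordcloud.png"):
    """Hypothetical helper: stream one field from a collection into a word cloud."""
    # Imported here so count_words stays usable without these packages
    from pymongo import MongoClient
    from wordcloud import WordCloud

    wordc = WordCloud()
    client = MongoClient(uri)
    # Project only the text field so each document stays small
    cursor = client[db_name][coll_name].find({}, {field: 1, "_id": 0})
    counts_all = count_words(cursor, field=field, process=wordc.process_text)
    wordc.generate_from_frequencies(counts_all)
    wordc.to_file(out_path)
    return counts_all
```

You would call it with something like `cloud_from_mongo("mongodb://localhost:27017", "mydb", "articles")`, where all three values are placeholders. Since `count_words` only needs an iterable of dicts, you can try it on a small list of documents before pointing it at the real collection.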