in Big Data Hadoop & Spark by (6.5k points)
If multiple reducers run in parallel, how can any one reducer compute the total word count for a given word?

1 Answer


This is where you need to understand the difference between the Mapper and the Reducer. The mapper turns each input record into [key, value] pairs. Each mapper works on its own slice of the input independently, so many mappers can process enormous amounts of data in parallel across nodes, all applying the same mapping logic.
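
As a rough illustration of the map step, here is a pure-Python sketch. It is not the Hadoop API: `map_words` and the two sample splits are hypothetical names made up for this example. Each split is processed on its own, so the same word can be emitted by more than one mapper.

```python
def map_words(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]

# Two hypothetical input splits, each handled by its own mapper.
splits = ["the quick brown fox", "the lazy dog jumps over the fox"]
mapped = [map_words(s) for s in splits]

for i, pairs in enumerate(mapped):
    print(f"mapper {i}: {pairs}")
```

Note that neither mapper knows the total for "the" or "fox"; each one only reports the occurrences it saw.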

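So, does the reducer also execute in parallel with other reducers? Yes. But mappers and reducers differ in one important way. Mappers are independent, so several of them can emit pairs for the same key. Reducers are not assigned keys at random: during the shuffle phase, a partitioner (by default, a hash of the key modulo the number of reducers) sends every pair with a given key to the same reducer. So for each unique key exactly one reducer is used, and whatever that reducer outputs is the final value for THAT key.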
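
The routing can be sketched as follows. This is a pure-Python simulation, not Hadoop code: `partition` and `shuffle` are hypothetical helpers modeled on the idea of hash partitioning (hash of the key modulo the number of reducers), and `zlib.crc32` is only a stand-in for Java's `hashCode()`.

```python
import zlib
from collections import defaultdict

def partition(key, num_reducers):
    """Pick a reducer index from the key alone, so equal keys always match."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle(mapper_outputs, num_reducers):
    """Group values by key inside the reducer chosen by partition()."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            reducers[partition(key, num_reducers)][key].append(value)
    return reducers

# Output of two hypothetical mappers; "the" and "fox" appear in both.
mapper_outputs = [
    [("the", 1), ("fox", 1)],
    [("the", 1), ("dog", 1), ("the", 1), ("fox", 1)],
]
reducers = shuffle(mapper_outputs, num_reducers=3)
for i, groups in enumerate(reducers):
    print(f"reducer {i}: {dict(groups)}")
```

Because the reducer index depends only on the key, all three 1s for "the" end up in a single reducer's group, no matter which mapper emitted them.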

In a wordcount program, the reducer receives each key (a word) together with all of its values (a 1 for each occurrence) from the mappers and sums them to give the final count. In Hadoop, this is one of the most basic programs to implement. Once you work through it hands-on and adapt the code to your needs, the mechanism will become clear.
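
A minimal sketch of the reduce step, again in plain Python rather than the Hadoop API. The function `reduce_counts` and the `grouped` input are hypothetical; the input assumes the framework has already collected every value for each key into one list.

```python
def reduce_counts(word, ones):
    """Sum the 1s that the mappers emitted for a single word."""
    return word, sum(ones)

# Hypothetical input to one reducer after the shuffle: every value for
# each of its keys has already been grouped together.
grouped = {"the": [1, 1, 1], "fox": [1, 1]}
counts = dict(reduce_counts(word, ones) for word, ones in grouped.items())
print(counts)  # {'the': 3, 'fox': 2}
```

Since no other reducer ever sees "the" or "fox", these sums are already the final counts and need no further merging.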

