What is Mapreduce in Hadoop?

Question

1 Answer

Praveen_1998 · Answer 1 · 2020-03-06T10:05:50+0000

MapReduce is a framework in Hadoop that is used to process huge data parallelly across different clusters. It happens in two steps Map and Reduce. Map will split the data into smaller parts into key-value pairs and those are processed parallelly. The output from the map is sent to reduce the job. Reduce receives key-value pairs and combines the intermediate data into smaller sets of key-values pairs and gives the final output.

In Layman terms, If we aim to find the top 5 most used words in the book. Map will count the occurrence of words. After that Map will create 26 nodes and each node will have words starting with different alphabets. This output will be sent to reduce. Reduce will keep the top 5 words which have more count.

To grasp concepts like Mapreduce in Hadoop, you can register for Hadoop Online Training.

Also, you can know more about MapReduce from watching this video:

What is Mapreduce in Hadoop?

1 Answer

Related questions

Browse Categories