MapReduce is a framework in Hadoop used to process huge amounts of data in parallel across the nodes of a cluster. It works in two steps, Map and Reduce. Map splits the data into smaller parts, turns them into key-value pairs, and processes those parts in parallel. The output of Map is sent to Reduce. Reduce receives the key-value pairs, combines the intermediate data into a smaller set of key-value pairs, and produces the final output.
In layman's terms, suppose we want to find the 5 most used words in a book. Map splits the book into parts and counts the occurrences of each word in its part. The intermediate output is then grouped, for example into 26 groups, one for each starting letter of the alphabet, so that all counts for the same word end up together. This grouped output is sent to Reduce. Reduce adds up the counts for each word and keeps the 5 words with the highest counts.
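The word-count example can be sketched in a few lines of plain Python. This is only an illustration of the idea on a single machine, not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, top_words) are made up for this sketch, and the grouping step here groups by whole word rather than by first letter.

```python
from collections import defaultdict
import heapq

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word, 1) for word in chunk.lower().split()]

def shuffle(pairs):
    """Group the emitted counts by word (the step between Map and Reduce)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped

def reduce_phase(grouped):
    """Add up the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

def top_words(chunks, n=5):
    """Run map, shuffle and reduce over all chunks, then keep the n most used words."""
    # In a real cluster each chunk would be mapped on a different node.
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    totals = reduce_phase(shuffle(pairs))
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])
```

Calling top_words(["the cat and the dog", "the dog and the bird"], n=1) returns [("the", 4)], because "the" appears twice in each chunk and the two partial counts are added together in the reduce step.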
You can learn more about MapReduce by watching this video: