Normally, each mapper and reducer of a MapReduce job runs in its own container, which the ResourceManager (RM) allocates. When a MapReduce job is submitted, the ResourceManager launches the ApplicationMaster process (for MapReduce, the ApplicationMaster is MRAppMaster) in a container. The ApplicationMaster then retrieves the number of input splits for the job and, based on that, decides how many mappers have to be launched. The number of reducers to launch is taken from the configuration.
At this juncture, ApplicationMaster has to decide whether to negotiate resources with the ResourceManager’s scheduler to run the map and reduce tasks or run the job sequentially within the same JVM where ApplicationMaster is running.
The ApplicationMaster makes this decision only if uber mode is set to true in Hadoop. If uber mode is true and the ApplicationMaster decides to run the MapReduce job within its own JVM, the job is said to be running as an uber task in YARN.
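The decision described above can be sketched in plain Java. This is a simplified illustration, not the MRAppMaster source: the class and method names (UberDecisionSketch, UberLimits, isUber) are hypothetical, the check covers only the task-count and input-size thresholds, and the 128 MB byte limit is an assumed HDFS block size.

```java
// Simplified sketch of the uber decision; not the actual MRAppMaster code.
class UberDecisionSketch {

    // Thresholds named after the mapreduce.job.ubertask.* properties.
    static final class UberLimits {
        final boolean enabled;  // mapreduce.job.ubertask.enable
        final int maxMaps;      // mapreduce.job.ubertask.maxmaps
        final int maxReduces;   // mapreduce.job.ubertask.maxreduces
        final long maxBytes;    // mapreduce.job.ubertask.maxbytes

        UberLimits(boolean enabled, int maxMaps, int maxReduces, long maxBytes) {
            this.enabled = enabled;
            this.maxMaps = maxMaps;
            this.maxReduces = maxReduces;
            this.maxBytes = maxBytes;
        }
    }

    // A job qualifies only if uber mode is enabled AND it is small on every axis.
    static boolean isUber(UberLimits limits, int numMaps, int numReduces, long inputBytes) {
        return limits.enabled
                && numMaps <= limits.maxMaps
                && numReduces <= limits.maxReduces
                && inputBytes <= limits.maxBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed limits: 9 maps, 1 reduce, 128 MB of input.
        UberLimits limits = new UberLimits(true, 9, 1, 128 * mb);

        // 3 splits, 1 reducer, 10 MB of input: small enough to run in the AM's JVM.
        System.out.println("small job uber? " + isUber(limits, 3, 1, 10 * mb));

        // 20 splits exceeds the map limit, so containers would be requested instead.
        System.out.println("large job uber? " + isUber(limits, 20, 1, 10 * mb));
    }
}
```

Note that enabling uber mode does not force a job to run as an uber task; a job that exceeds any one limit still goes through the normal container negotiation.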
By default, uber mode is set to false in Hadoop, so to run a MapReduce job as an uber task in Hadoop 2 you first need to set uber mode to “true”. Otherwise, you will see this message in the job output:
Job job_XXXXX_xxxx running in uber mode : false
Uber mode is intended for MapReduce jobs that work on a small data set.
In uber mode, the ApplicationMaster runs the map and reduce tasks within its own process, which avoids the overhead of launching and communicating with remote nodes.
The configuration parameters required for uber mode are set in etc/hadoop/mapred-site.xml.
The following are the configuration options for uber jobs:
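The uber-related properties in Hadoop 2 are mapreduce.job.ubertask.enable, mapreduce.job.ubertask.maxmaps, mapreduce.job.ubertask.maxreduces and mapreduce.job.ubertask.maxbytes. A minimal sketch of how they might appear in mapred-site.xml is shown here. The maxmaps and maxreduces values are the stock defaults as far as I know, while the maxbytes value is an assumption (128 MB, a typical HDFS block size); if maxbytes is left unset, Hadoop falls back to the HDFS block size.

```xml
<configuration>
  <!-- Let the ApplicationMaster consider running small jobs in its own JVM -->
  <property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
  </property>
  <!-- Upper limit on map tasks for a job to count as "small" -->
  <property>
    <name>mapreduce.job.ubertask.maxmaps</name>
    <value>9</value>
  </property>
  <!-- Upper limit on reduce tasks; values above 1 are not supported -->
  <property>
    <name>mapreduce.job.ubertask.maxreduces</name>
    <value>1</value>
  </property>
  <!-- Upper limit on input size in bytes (assumed value: 128 MB) -->
  <property>
    <name>mapreduce.job.ubertask.maxbytes</name>
    <value>134217728</value>
  </property>
</configuration>
```

With these settings in place, a job that stays within all three limits should report "running in uber mode : true" in its output.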