Apache Hadoop Yarn - Underutilization of cores

Question

asked Jul 17, 2019 in Big Data Hadoop & Spark by Aarav (11.4k points)

No matter how much I tinker with the settings in yarn-site.xml i.e using all of the below options

yarn.scheduler.minimum-allocation-vcores
yarn.nodemanager.resource.memory-mb
yarn.nodemanager.resource.cpu-vcores
yarn.scheduler.maximum-allocation-mb
yarn.scheduler.maximum-allocation-vcores

i just still cannot get my application i.e Spark to utilize all the cores on the cluster. The spark executors seem to be correctly taking up all the available memory, but each executor just keeps taking a single core and no more.

Here are the options configured in spark-defaults.conf

spark.executor.cores                    3
spark.executor.memory                   5100m
spark.yarn.executor.memoryOverhead      800
spark.driver.memory                     2g
spark.yarn.driver.memoryOverhead        400
spark.executor.instances                28
spark.reducer.maxMbInFlight             120
spark.shuffle.file.buffer.kb            200

Notice that spark.executor.cores is set to 3, but it doesn't work. How do i fix this?

1 Answer

Amit Rawat · Answer 1 · 2019-07-17T14:37:58+0000

In your case the problem doesn’t lies with yarn-site.xml and spark-defaults.conf. The actual problem arises with the resource calculator that assigns the cores to the executors or in the case of MapReduce jobs, to the Mappers/Reducers.

The default resource calculator i.e org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator uses only memory information for allocating containers and CPU scheduling is not enabled by default. In order to use both memory as well as the CPU, the resource calculator should be changed to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator in the capacity-scheduler.xml file.

Here's what needs to change.

<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

Apache Hadoop Yarn - Underutilization of cores

Apache Hadoop Yarn - Underutilization of cores

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Related questions

Browse Categories

Popular Courses

Top Tutorials

Top Articles

Top Interview Questions