
Hi, I have a Spark job that runs fine locally on a small data set, but when I run it on YARN I keep getting the following error. The executors gradually disappear from the UI and the job fails.

15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 8 on myhost1.com: remote Rpc client disassociated
15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 6 on myhost2.com: remote Rpc client disassociated

I use the following command to submit the Spark job in yarn-client mode:

 ./spark-submit --class com.xyz.MySpark --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512M" --driver-java-options -XX:MaxPermSize=512m --driver-memory 3g --master yarn-client --executor-memory 2G --executor-cores 8 --num-executors 12  /home/myuser/myspark-1.0.jar

I don't know what the problem is; please guide me. I am new to Spark. Thanks in advance.

1 Answer


I had a very similar problem: many executors were being lost no matter how much memory we allocated to them.

If you are running on YARN, the fix that worked for us was to set --conf spark.yarn.executor.memoryOverhead=600 (the value is in megabytes). If your cluster uses Mesos, try --conf spark.mesos.executor.memoryOverhead=600 instead.

In Spark 2.3.1+ the YARN-specific name is deprecated and the option is

--conf spark.executor.memoryOverhead=600
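As a rough sketch, this is how the command from the question might look with the overhead setting added. It assumes the same Spark 1.x setup as in the question (hence the spark.yarn.* name), and 600 is only the value that worked for us, not a recommended default.

```shell
# Sketch only: the question's submit command plus the memory overhead flag.
# Class name, jar path, and sizes are copied from the question; 600 (MB) is
# the value from this answer and will likely need tuning for other jobs.
./spark-submit \
  --class com.xyz.MySpark \
  --master yarn-client \
  --driver-memory 3g \
  --driver-java-options -XX:MaxPermSize=512m \
  --executor-memory 2G \
  --executor-cores 8 \
  --num-executors 12 \
  --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512M" \
  --conf spark.yarn.executor.memoryOverhead=600 \
  /home/myuser/myspark-1.0.jar
```

If executors are still lost, raising the overhead in steps is a reasonable next thing to try.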

In our case the YARN containers did not have enough memory left over beyond the executor heap, and YARN was killing them because of it. After setting the option above, we no longer hit the lost executor problem.
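To see why the overhead matters, here is a small sketch of the arithmetic. It assumes the documented default of max(384 MB, 10% of executor memory) for the overhead, which may differ in older Spark releases; container_mb is a hypothetical helper name, not part of Spark.

```shell
#!/usr/bin/env bash
# Estimate the memory Spark requests from YARN for one executor container.
# Assumption: when no overhead is given, Spark uses
# max(384 MB, 10% of executor memory). Check the docs for your version.

container_mb() {
  local executor_mb=$1
  local overhead_mb=${2:-}   # optional explicit overhead in MB
  if [ -z "$overhead_mb" ]; then
    overhead_mb=$(( executor_mb / 10 ))
    if [ "$overhead_mb" -lt 384 ]; then
      overhead_mb=384
    fi
  fi
  echo $(( executor_mb + overhead_mb ))
}

container_mb 2048       # default overhead: prints 2432
container_mb 2048 600   # overhead from this answer: prints 2648
```

With --executor-memory 2G the default leaves only 384 MB for everything outside the JVM heap, so a job with heavy off-heap usage can push the container over its limit. YARN may also round the request up to its own allocation increment.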
