0 votes
1 view
in Big Data Hadoop & Spark by (11.5k points)

I have an Spark app which runs with no problem in local mode,but have some problems when submitting to the Spark cluster.

The error msg are as follows:

16/06/24 15:42:06 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2, cluster-node-02): java.lang.ExceptionInInitializerError
    at GroupEvolutionES$$anonfun$6.apply(GroupEvolutionES.scala:579)
    at GroupEvolutionES$$anonfun$6.apply(GroupEvolutionES.scala:579)
    at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
    at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1595)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:401)
    at GroupEvolutionES$.<init>(GroupEvolutionES.scala:37)
    at GroupEvolutionES$.<clinit>(GroupEvolutionES.scala)
    ... 14 more

16/06/24 15:42:06 WARN scheduler.TaskSetManager: Lost task 5.0 in stage 0.0 (TID 5, cluster-node-02): java.lang.NoClassDefFoundError: Could not initialize class GroupEvolutionES$
    at GroupEvolutionES$$anonfun$6.apply(GroupEvolutionES.scala:579)
    at GroupEvolutionES$$anonfun$6.apply(GroupEvolutionES.scala:579)
    at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
    at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1595)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


In the above code, GroupEvolutionES is the main class. The error msg says "A master URL must be set in your configuration", but I have provided the "--master" parameter to spark-submit.

Anyone who knows how to fix this problem?

1 Answer

0 votes
by (32.5k points)
edited by

I think you didn’t define the sparkContext object.

Similar problem was faced by me,when I initiated the sparkContext outside the main function.

Then I realized that I have to initiate it inside the main function.

You should have created the SparkContext in the main functions of GroupEvolutionES. Just do that and your code will work fine.

If you want to know more about Spark, then do check out this awesome video tutorial:

Welcome to Intellipaat Community. Get your technical queries answered by top developers !

Categories

...