How to find the master URL for an existing spark cluster

Question

asked Jul 16, 2019 in Big Data Hadoop & Spark by Aarav (11.4k points)

Currently I am running my program as

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("Test Data Analysis")
  .setMaster("local[*]")
  .set("spark.executor.memory", "32g")
  .set("spark.driver.memory", "32g")
  .set("spark.driver.maxResultSize", "4g")

Even though I am running on a cluster of 5 machines (each with 376 GB of physical RAM), my program fails with java.lang.OutOfMemoryError: Java heap space.

My data is large, but not so large that it exceeds 32 GB of executor memory * 5 nodes.

I suspect it may be because I am using "local" as my master. I have seen documentation say to use spark://machinename:7070.

However, how do I determine this URL and port for my own cluster?

In my case the Spark cluster was set up and is maintained by someone else, so I don't want to change the topology by starting my own master.

1 Answer

Amit Rawat · Answer 1 · 2019-07-16T14:17:21+0000

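From a running Spark shell, the following returns the URL of the application's web UI (sc.master, by contrast, returns the master URL the context is connected to):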

sc.uiWebUrl

Also, if a Spark standalone cluster is already set up on top of your physical cluster, just open http://master:8080 in a browser, where master is the hostname of the Spark master machine (8080 is the default port of the standalone master's web UI; whoever maintains the cluster may have configured a different one). The page shows the Spark master URI, which by default is spark://master:7077, along with quite a bit of other information about the cluster.
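Once the master URI has been read off that page, it can replace local[*] in the configuration from the question. The following is only a minimal sketch: spark://master:7077 is a placeholder for whatever URI the web UI actually reports, and the object and method names (ClusterMasterExample, buildConf) are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ClusterMasterExample {
  // Hypothetical helper: the question's configuration, pointed at a given master URL.
  def buildConf(masterUrl: String): SparkConf =
    new SparkConf()
      .setAppName("Test Data Analysis")
      .setMaster(masterUrl)
      .set("spark.executor.memory", "32g")

  def main(args: Array[String]): Unit = {
    // Assumption: replace this with the URI shown on the master's web UI.
    val sc = new SparkContext(buildConf("spark://master:7077"))
    try {
      // sc.master echoes the master URL the context was created with.
      println(s"Connected to master: ${sc.master}")
      // sc.uiWebUrl is an Option; it is empty when the application UI is disabled.
      println(s"Application UI: ${sc.uiWebUrl.getOrElse("disabled")}")
    } finally {
      sc.stop()
    }
  }
}
```

Running this needs the Spark libraries on the classpath and a reachable master at the given address.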

However, I see a lot of questions on SO reporting that this does not work, for many different reasons. Using the spark-submit utility is a less error-prone process; see its usage.
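With spark-submit, the master is normally given on the command line through --master, so the application code can leave setMaster out entirely. The sketch below assumes a standalone master at spark://master:7077; the class name, jar path, and host are placeholders. Driver memory is passed as --driver-memory because, in client mode, setting spark.driver.memory through SparkConf inside the application comes too late: the driver JVM has already started by then.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Submitted with something like (class name, jar path and master host are placeholders):
//   spark-submit --class SubmitExample --master spark://master:7077 \
//     --driver-memory 32g --executor-memory 32g path/to/app.jar
object SubmitExample {
  // No setMaster here: spark-submit supplies the master as spark.master.
  def appConf(): SparkConf =
    new SparkConf()
      .setAppName("Test Data Analysis")
      .set("spark.driver.maxResultSize", "4g")

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(appConf())
    try println(s"Running against master: ${sc.master}")
    finally sc.stop()
  }
}
```

Keeping the master out of the code means the same jar can be run against a different cluster just by changing the --master argument.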

But if you don't have a Spark cluster yet, I suggest setting up a Spark Standalone cluster first.
