in Big Data Hadoop & Spark by (11.4k points)

I am trying to override the default SparkSession/SparkContext configs, but the application still picks up the resources of the entire node/cluster.

 from pyspark.sql import SparkSession

 spark = (SparkSession.builder
          .master("ip")
          .enableHiveSupport()
          .getOrCreate())

 spark.conf.set("spark.executor.memory", '8g')
 spark.conf.set('spark.executor.cores', '3')
 spark.conf.set('spark.cores.max', '3')
 spark.conf.set("spark.driver.memory",'8g')
 sc = spark.sparkContext

1 Answer

by (32.3k points)

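Looking at your code, I don't think you are overriding anything. Settings such as spark.executor.memory and spark.executor.cores are read when the SparkContext starts, so calling spark.conf.set() after getOrCreate() does not change the resources the application was given.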

You can see this for yourself. Run this command as soon as you start the pyspark shell:

sc.getConf().getAll()

This lists all of the current config settings. Then execute your code and run the command again. You will see that nothing has changed.
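The before/after comparison can also be done mechanically. Here is a minimal sketch, assuming each snapshot is the list of (key, value) pairs that sc.getConf().getAll() returns. The helper name conf_diff is hypothetical, not part of PySpark, and the sample snapshots are hand-written rather than real Spark output.

```python
def conf_diff(before, after):
    """Return {key: (old, new)} for every setting that differs between two snapshots.

    Each snapshot is a list of (key, value) pairs, the shape returned by
    sc.getConf().getAll(). A key missing from one side is reported as None.
    """
    old, new = dict(before), dict(after)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Hand-written sample snapshots (not real Spark output), only to show the shape.
before = [("spark.app.name", "PySparkShell"), ("spark.master", "local[*]")]
after = before + [("spark.executor.memory", "8g")]
print(conf_diff(before, after))
```

In the shell you would take before = sc.getConf().getAll(), run your code, take after = sc.getConf().getAll(), and call conf_diff(before, after). An empty dict means nothing changed.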

Instead of that approach, I suggest creating a new configuration and using it to create a SparkContext:

import pyspark

conf = pyspark.SparkConf().setAll([('spark.executor.memory', '8g'), ('spark.executor.cores', '3'), ('spark.cores.max', '3'), ('spark.driver.memory', '8g')])

sc.stop()  # stop the running context before creating a new one

sc = pyspark.SparkContext(conf=conf)

Then you can check it yourself, as above, with:

sc.getConf().getAll()

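This should now show the configuration you wanted. Note that spark.driver.memory will be listed but, in client mode (such as the pyspark shell), it cannot take effect this way because the driver JVM is already running. Set it with --driver-memory when launching instead.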
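If you would rather keep the SparkSession builder from the question, the same idea applies: pass the settings before getOrCreate() is called. The following is a sketch under assumptions, not a definitive recipe. The names SETTINGS, with_settings and build_session are hypothetical, master_url stands in for the "ip" placeholder in the question, and spark.driver.memory is left out because in client mode the driver JVM is already running by the time this code executes.

```python
# Executor-side settings taken from the question.
SETTINGS = {
    "spark.executor.memory": "8g",
    "spark.executor.cores": "3",
    "spark.cores.max": "3",
}


def with_settings(builder, settings):
    """Apply each key/value pair via builder.config(key, value) and return the builder."""
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder


def build_session(master_url):
    """Create a SparkSession with SETTINGS applied before the context starts."""
    # Imported here so that with_settings itself has no PySpark dependency.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master(master_url).enableHiveSupport()
    return with_settings(builder, SETTINGS).getOrCreate()
```

Keep in mind that getOrCreate() returns the existing session if one is already running, so inside the pyspark shell you would need to call spark.stop() first for the new settings to apply.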

...