
+3 votes
3 views
in Big Data Hadoop & Spark by (11.4k points)

I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error:

Exception: Java gateway process exited before sending the driver its port number

6 Answers

+2 votes
by (32.3k points)

Try adding pyspark-shell to the shell environment variable PYSPARK_SUBMIT_ARGS:

export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"

There was a change in python/pyspark/java_gateway.py that requires PYSPARK_SUBMIT_ARGS to include pyspark-shell whenever the PYSPARK_SUBMIT_ARGS variable is set by a user.
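If you launch PySpark from a script or a notebook rather than from a login shell, you can set the same variable from Python before the first SparkContext is created. A minimal sketch (the local[2] master is just an example value; adjust it to your setup):

import os

# Set before any SparkContext exists; when PYSPARK_SUBMIT_ARGS is defined,
# it must include "pyspark-shell" (typically as the last token).
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

from pyspark import SparkContext

sc = SparkContext()
print(sc.version)
sc.stop()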


 

One possible reason may be that JAVA_HOME is not set because Java is not installed.

In that case, you will encounter something like this:

Exception in thread "main" java.lang.UnsupportedClassVersionError:

........

........

........

    raise Exception("Java gateway process exited before sending the driver its port number")

On Ubuntu, you can resolve it by running the commands below:

sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java8-installer
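After the install finishes, it is worth confirming from Python that a JDK is actually visible before retrying PySpark. This is only a diagnostic sketch, not part of the fix:

import os
import shutil
import subprocess

# Is JAVA_HOME set, and is a java binary on the PATH?
print("JAVA_HOME =", os.environ.get("JAVA_HOME"))
print("java on PATH:", shutil.which("java"))

# Most JDKs print their version banner on stderr.
if shutil.which("java"):
    subprocess.run(["java", "-version"])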


by (19.9k points)
Very well explained. Thank you.
by (47.2k points)
I have set SPARK_HOME already. I downloaded spark-1.4.1-bin-hadoop2.6.tgz from the official website and then untarred it.
+1 vote
by (108k points)

I had the same issue with my IPython notebook (IPython 3.2.1) on Linux (Ubuntu).

What was missing in my case was setting the master URL in the $PYSPARK_SUBMIT_ARGS environment variable, like this (assuming you use bash):

export PYSPARK_SUBMIT_ARGS="--master spark://<host>:<port>"

e.g.

export PYSPARK_SUBMIT_ARGS="--master spark://192.168.2.40:7077"

You can put this into your .bashrc file. Then you will get the correct URL in the log for the Spark master (the location of this log is reported when you start the master with sbin/start-master.sh).
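If the gateway still fails with a standalone master, it is worth checking that the master URL from the log is actually reachable from the machine running PySpark; if it is not, spark-submit cannot connect to it either, which can surface as the same gateway error. A small sketch of such a check (host and port are the example values from above; replace them with yours):

import socket

# Example values from above; use the master URL reported in your log.
host, port = "192.168.2.40", 7077

with socket.create_connection((host, port), timeout=5):
    print("Spark master reachable at spark://%s:%d" % (host, port))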

+2 votes
by (29.3k points)

The Java 10 JDK causes this error. Navigate to /Library/Java/JavaVirtualMachines, then run this command to uninstall JDK 10 completely:

sudo rm -rf jdk-10.jdk/

After that, download and install JDK 8, and the problem will be solved.
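On macOS you can also double-check what is installed and point JAVA_HOME at the JDK 8 install. Below is a minimal sketch using the built-in /usr/libexec/java_home helper (it assumes JDK 8 is already installed):

import os
import subprocess

# List the JDKs installed under /Library/Java/JavaVirtualMachines.
subprocess.run(["/usr/libexec/java_home", "-V"])

# Resolve the home directory of the Java 8 install ...
out = subprocess.run(["/usr/libexec/java_home", "-v", "1.8"],
                     capture_output=True, text=True)
java8_home = out.stdout.strip()
print("Java 8 home:", java8_home)

# ... and point this Python process (and the PySpark it launches) at it.
os.environ["JAVA_HOME"] = java8_home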

by (19.7k points)
Thanks, it helped me!
by (29.5k points)
Hi, this worked like a charm! Thanks!
by (33.1k points)
Thanks! It also worked for me.
0 votes
by (44.4k points)

You actually have to include "pyspark-shell" in PYSPARK_SUBMIT_ARGS if you define that variable at all.

For example:

import os

os.environ['PYSPARK_SUBMIT_ARGS'] = '--master mymaster --total-executor-cores 2 --conf "spark.driver.extraJavaOptions=-Dhttp.proxyHost=proxy.mycorp.com -Dhttp.proxyPort=1234 -Dhttp.nonProxyHosts=localhost|.mycorp.com|127.0.0.1 -Dhttps.proxyHost=proxy.mycorp.com -Dhttps.proxyPort=1234 -Dhttps.nonProxyHosts=localhost|.mycorp.com|127.0.0.1" pyspark-shell'

This will work.
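This works because pyspark's java_gateway.py builds the spark-submit command from PYSPARK_SUBMIT_ARGS, so the variable has to be set before the first SparkContext is created in the process, and the trailing pyspark-shell tells spark-submit what to launch.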

0 votes
by (40.7k points)

In my opinion, one reason can be that JAVA_HOME is not set because Java is not installed.

I faced the same issue. It says:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:296)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:406)

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/spark/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/opt/spark/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/opt/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

This happened at the line sc = pyspark.SparkConf(). I solved it by running the commands given below:

sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java8-installer

For more information, have a look at this: https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-get-on-ubuntu-16-04

0 votes
by (106k points)

You can get rid of these errors by using the code below. I had already set up SPARK_HOME, though. You may follow these simple steps from the eproblems website.

import os

spark_home = os.environ.get('SPARK_HOME', None)
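Continuing from that snippet, a quick sanity check can confirm that SPARK_HOME really points at a Spark installation: pyspark starts the Java gateway through $SPARK_HOME/bin/spark-submit, so if that script is missing you will see this same gateway error. A minimal sketch, assuming the lines above have just run:

print("SPARK_HOME =", spark_home)

if spark_home:
    # launch_gateway() runs this script to start the JVM.
    submit = os.path.join(spark_home, "bin", "spark-submit")
    print("spark-submit found:", os.path.isfile(submit))
else:
    print("SPARK_HOME is not set; point it at your unpacked Spark directory.")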
