
I installed apache-maven-3.3.3 and scala 2.11.6, then ran:

$ git clone https://github.com/apache/spark.git -b branch-1.4
$ cd spark
$ build/mvn -DskipTests clean package


Next, built Zeppelin:

$ git clone https://github.com/apache/incubator-zeppelin
$ cd incubator-zeppelin/
$ mvn install -DskipTests


Then ran the server:

$ bin/zeppelin-daemon.sh start


Running a simple notebook beginning with %pyspark, I got an error about py4j not being found, so I ran pip install py4j.

Now I'm getting this error:

pyspark is not responding
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 22, in <module>
    from pyspark.conf import SparkConf
ImportError: No module named pyspark.conf

1 Answer


You just need to set a couple of environment variables to get rid of this error.

Keep in mind that they have to point at your actual Spark directory and your Python executable:

export SPARK_HOME=~/spark-version-with-your-full-directory
export PYSPARK_PYTHON=~/your-directory/bin/python
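
For example, a minimal sketch assuming the branch-1.4 checkout from the question lives at ~/spark and your Python is /usr/bin/python (both paths are examples; substitute your own). Adding the exports to incubator-zeppelin/conf/zeppelin-env.sh makes them visible to the daemon:

# incubator-zeppelin/conf/zeppelin-env.sh -- example paths, adjust to your setup
export SPARK_HOME=~/spark               # the Spark checkout built above
export PYSPARK_PYTHON=/usr/bin/python   # the Python where py4j was pip-installed

Then restart the daemon so the settings take effect:

$ bin/zeppelin-daemon.sh restart

With SPARK_HOME set, Zeppelin's launcher should be able to put $SPARK_HOME/python on the interpreter's PYTHONPATH, which is exactly what the failing from pyspark.conf import SparkConf line needs. You can sanity-check the paths outside Zeppelin (assuming py4j is already importable, since you pip-installed it):

$ PYTHONPATH=$SPARK_HOME/python python -c "from pyspark.conf import SparkConf; print('ok')"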

