
I installed apache-maven-3.3.3 and scala 2.11.6, then ran:

$ git clone git://github.com/apache/spark.git -b branch-1.4
$ cd spark
$ build/mvn -DskipTests clean package


Next, I built Zeppelin:

$ git clone https://github.com/apache/incubator-zeppelin
$ cd incubator-zeppelin/
$ mvn install -DskipTests


Then ran the server:

$ bin/zeppelin-daemon.sh start


Running a simple notebook beginning with %pyspark, I got an error about py4j not being found, so I ran pip install py4j.
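
To double-check that py4j was importable by the same Python that runs the notebook (a quick sanity check, assuming the plain python on PATH is the one Zeppelin launches):

$ python -c "import py4j; print(py4j.__file__)"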

Now I'm getting this error:

pyspark is not responding
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 22, in <module>
    from pyspark.conf import SparkConf
ImportError: No module named pyspark.conf

1 Answer


You just need to set a couple of environment variables to get rid of this error.

Keep in mind that these variables must point to your actual Spark installation directory and your Python executable:

export SPARK_HOME=~/spark-version-with-your-full-directory

export PYSPARK_PYTHON=~/your-directory/bin/python
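
With a source build like the one above, these variables usually go in Zeppelin's conf/zeppelin-env.sh so the daemon picks them up. Below is a minimal sketch; the Spark checkout path and the py4j zip version are assumptions (Spark 1.4 bundles py4j-0.8.2.1, so check $SPARK_HOME/python/lib for the exact file name). The PYTHONPATH line is what resolves the missing pyspark.conf module, since pyspark is not pip-installed but lives inside the Spark directory:

# conf/zeppelin-env.sh -- a minimal sketch, adjust the paths for your machine
export SPARK_HOME=~/spark                    # the Spark checkout built above
export PYSPARK_PYTHON=/usr/bin/python        # Python executable to use for pyspark
# Make pyspark and its bundled py4j importable (zip name is version-dependent):
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH

After editing, restart the daemon so the new environment takes effect:

$ bin/zeppelin-daemon.sh restart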
