Explore Courses Blog Tutorials Interview Questions
0 votes
1 view
in Big Data Hadoop & Spark by (11.4k points)

I installed Spark, ran the sbt assembly, and can open bin/pyspark with no problem. However, I am running into problems loading the pyspark module into ipython. I'm getting the following error: 

In [1]: import pyspark
ImportError                               Traceback (most recent call last)
<ipython-input-1-c15ae3402d12> in <module>()
----> 1 import pyspark

/usr/local/spark/python/pyspark/ in <module>()
     62 from pyspark.conf import SparkConf
---> 63 from pyspark.context import SparkContext
     64 from pyspark.sql import SQLContext
     65 from pyspark.rdd import RDD

/usr/local/spark/python/pyspark/ in <module>()
     28 from pyspark.conf import SparkConf
     29 from pyspark.files import SparkFiles
---> 30 from pyspark.java_gateway import launch_gateway
     31 from pyspark.serializers import PickleSerializer, BatchedSerializer, UTF8Deserializer, \
     32     PairDeserializer, CompressedSerializer

/usr/local/spark/python/pyspark/ in <module>()
     24 from subprocess import Popen, PIPE
     25 from threading import Thread
---> 26 from py4j.java_gateway import java_import, JavaGateway, GatewayClient

ImportError: No module named py4j.java_gateway

1 Answer

0 votes
by (32.3k points)

In order to resolve this error you need add following paths in .bashrc:

export SPARK_HOME=/home/a141890/apps/spark



Browse Categories