
I have Spark installed properly on my machine and am able to run python programs with the pyspark modules without error when using ./bin/pyspark as my python interpreter.

However, when I try to import pyspark modules from the regular Python shell:

    from pyspark import SparkContext

I get this error:

    ImportError: No module named pyspark

How can I fix this? Is there an environment variable I need to set to point Python to the pyspark headers/libraries/etc.? If my spark installation is /spark/, which pyspark paths do I need to include? Or can pyspark programs only be run from the pyspark interpreter?

1 Answer


Add the export lines below to your ~/.bashrc file, and your modules should be found correctly. Adjust SPARK_HOME if Spark is installed somewhere other than /spark (the path from the question), and substitute the actual py4j version you find under $SPARK_HOME/python/lib:

    # Add the PySpark classes to the Python path:
    export SPARK_HOME=/spark
    export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-<version>-src.zip:$PYTHONPATH

Then run source ~/.bashrc (or open a new terminal) for the change to take effect.

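The bashrc approach works because Python resolves imports by searching the directories on sys.path, which the PYTHONPATH variable extends. Here is a small self-contained demonstration of that mechanism (the module name mymodule is made up for illustration):

```python
import os
import sys
import tempfile

# Create a throwaway directory containing a tiny module.
pkg_dir = tempfile.mkdtemp()
with open(os.path.join(pkg_dir, "mymodule.py"), "w") as f:
    f.write("VALUE = 42\n")

# The import fails while the directory is not on sys.path --
# exactly why "import pyspark" fails in a plain Python shell.
try:
    import mymodule
except ImportError:
    pass

# Adding the directory to sys.path has the same effect as
# putting it on PYTHONPATH before starting Python.
sys.path.insert(0, pkg_dir)
import mymodule
print(mymodule.VALUE)  # -> 42
```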
There is one more method: use findspark.

1. Install findspark from your terminal (not the Python shell):

        pip install findspark

2. In the Python shell, initialize findspark and then import the modules you need:

        import findspark
        findspark.init()

        from pyspark import SparkContext
        from pyspark import SparkConf

The Spark modules should now import without errors.
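What findspark.init() does, in essence, is locate your Spark installation and prepend its Python directories to sys.path. The following is a minimal sketch of that idea, not findspark's actual implementation; the helper name pyspark_paths is made up, and /spark is the hypothetical installation path from the question:

```python
import glob
import os
import sys

def pyspark_paths(spark_home):
    """Return the sys.path entries needed to import pyspark from
    a Spark installation: the python/ directory plus any bundled
    py4j source zip under python/lib/."""
    python_dir = os.path.join(spark_home, "python")
    py4j_zips = glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip"))
    return [python_dir] + py4j_zips

# Prepend the paths so "import pyspark" can succeed afterwards.
for p in pyspark_paths("/spark"):
    if p not in sys.path:
        sys.path.insert(0, p)
```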

