I'm using Spark 1.4.0-rc2 so that I can use Python 3 with Spark. If I add export PYSPARK_PYTHON=python3 to my .bashrc file, I can run Spark interactively with Python 3. However, when I run a standalone program in local mode, I get an error:

Exception: Python in worker has different version 3.4 than that in driver 2.7, PySpark cannot run with different minor versions

How can I specify the Python version for the driver? Setting export PYSPARK_DRIVER_PYTHON=python3 didn't work.

1 Answer


You need to make sure that you launch the standalone program with Python 3. If you submit it through spark-submit, it should work fine. If you launch it with python directly, the interpreter you start it with becomes the driver (2.7 in your error message), so use python3 to start your program.
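The mismatch reported in the error can be checked before Spark starts. The following is a minimal sketch, not part of Spark: the helper names minor_version and check_worker_matches_driver are hypothetical, and the fallback to plain "python" when PYSPARK_PYTHON is unset is an assumption about how this Spark version picks the worker interpreter.

```python
import os
import subprocess
import sys


def minor_version(executable):
    """Return the 'major.minor' version reported by a Python executable."""
    out = subprocess.check_output(
        [executable, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"]
    )
    return out.decode().strip()


def check_worker_matches_driver(environ=os.environ):
    """Compare the running (driver) interpreter with the one PYSPARK_PYTHON names."""
    # The driver is whichever interpreter is executing this script.
    driver = "%d.%d" % sys.version_info[:2]
    # Assumption: workers use plain "python" when PYSPARK_PYTHON is not set.
    worker_exec = environ.get("PYSPARK_PYTHON", "python")
    worker = minor_version(worker_exec)
    return driver, worker, driver == worker
```

Calling check_worker_matches_driver() at the top of the standalone program, before the SparkContext is created, returns the two versions and whether they match. If they don't, the program was probably started with a different interpreter than the one PYSPARK_PYTHON points to.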


Also, keep in mind that you have to set your environment variables in ./conf/spark-env.sh (if the file doesn't exist, you can use spark-env.sh.template as a base).
