You can use Spark with Python through 'pyspark', the Python interface that comes bundled with Apache Spark. The prerequisites are:
- JDK 8
- Python (installed by default on most modern Linux distributions)
- A stable version of Apache Spark
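If you want to confirm these prerequisites before going further, a small script along the following lines may help. It is only a sketch: the function name `check_pyspark_prerequisites` is made up for this answer, it assumes your Spark installation is pointed to by a `SPARK_HOME` environment variable, and it only checks that a `java` launcher is on your PATH, not that it is version 8.

```python
import os
import platform
import shutil


def check_pyspark_prerequisites(env=None):
    """Report which PySpark prerequisites appear to be present.

    `env` defaults to the current environment; passing a dict is handy for testing.
    """
    env = os.environ if env is None else env
    spark_home = env.get("SPARK_HOME", "")  # assumption: Spark is located via SPARK_HOME
    launcher = os.path.join(spark_home, "bin", "pyspark")
    return {
        # True if a `java` executable can be found (the version is not checked).
        "java": shutil.which("java", path=env.get("PATH")) is not None,
        # The version of the Python interpreter running this script.
        "python_version": platform.python_version(),
        # True if SPARK_HOME is set and contains the bin/pyspark launcher.
        "spark": bool(spark_home) and os.path.isfile(launcher),
    }
```

Calling `check_pyspark_prerequisites()` returns a dictionary you can print; a `False` entry tells you which piece still needs installing or configuring.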
To start a PySpark shell, run the 'pyspark' launcher found in the 'bin' directory of your Spark installation; the exact path depends on where Spark is installed.
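Since the path differs from machine to machine, here is a rough sketch of how the launcher location could be worked out. The helper name `pyspark_launcher` and the use of a `SPARK_HOME` environment variable are assumptions for illustration, not something your installation is guaranteed to define.

```python
import os
from pathlib import Path


def pyspark_launcher(spark_home=None):
    """Return the path of the PySpark shell launcher inside a Spark installation."""
    # Assumption: the installation directory is passed in or stored in SPARK_HOME.
    home = spark_home or os.environ.get("SPARK_HOME")
    if not home:
        raise ValueError("Set SPARK_HOME or pass spark_home explicitly")
    # Spark distributions ship the shell launcher as bin/pyspark.
    return Path(home) / "bin" / "pyspark"
```

Running the returned path from a terminal (or passing `str(pyspark_launcher())` to `subprocess.run`) should open the interactive shell, assuming the prerequisites above are in place.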
Alternatively, a Jupyter notebook on Google Colab can be used for PySpark as well, after installing the required Java and Spark packages.
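Once those packages are installed in the notebook, a local Spark session can be created roughly as follows. This is a minimal sketch that assumes the 'pyspark' Python package is importable and a Java runtime is available; the function name `build_local_session` and the app name are placeholders chosen for this answer.

```python
def build_local_session(app_name="colab-demo"):
    """Create (or reuse) a SparkSession that runs locally inside the notebook VM."""
    # Imported lazily so this cell can be defined before pyspark is installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")   # use all cores of the local machine
        .appName(app_name)    # placeholder name shown in the Spark UI
        .getOrCreate()
    )
```

In a later cell, `spark = build_local_session()` followed by `spark.range(5).show()` is a quick way to check that the session works; call `spark.stop()` when you are done.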
If you are looking for an online course to learn Spark, I recommend this Apache Spark Training program by Intellipaat.