0 votes
in Big Data Hadoop & Spark by (11.4k points)

I am trying to set up Apache Spark on Windows.

After searching a bit, I understand that standalone mode is what I want. Which binaries do I need to download in order to run Apache Spark on Windows? I see distributions with Hadoop and CDH on the Spark download page.

I couldn't find any references for this on the web. A step-by-step guide would be highly appreciated.

1 Answer

0 votes
by (32.3k points)

To install Spark locally on Windows, follow the steps below:

  • Install Java 7 or later. To check that the installation succeeded, open a command prompt, type java, and press Enter. If you get the message 'java' is not recognized as an internal or external command, set the JAVA_HOME and PATH environment variables to point to your JDK installation.

  • Download and install Scala.

  • Set SCALA_HOME as an environment variable: go to Control Panel\System and Security\System, open "Advanced system settings", and under Environment Variables add SCALA_HOME, then add %SCALA_HOME%\bin to the PATH variable.

  • Install Python 2.6 or later.

  • Download and install SBT, then set SBT_HOME as an environment variable with the value <<SBT PATH>>.

  • Download winutils.exe. Because there is no local Hadoop installation on Windows, create a Hadoop home directory, place winutils.exe in a bin directory under it, and set the environment variable HADOOP_HOME = <<Hadoop home directory>>.

  • Use a pre-built Spark package: on the Spark download page, choose a package pre-built for Hadoop, then download and extract it.

  • Set SPARK_HOME to the directory where you extracted Spark, and add %SPARK_HOME%\bin to the PATH variable.

  • Open a new command prompt (so the updated environment variables are picked up) and run the command: spark-shell

  • Spark is now ready to use on Windows.
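
If spark-shell does not start, it may help to confirm that the environment variables from the steps above are actually set and point where you expect. Here is a minimal sketch in Python (assuming Python 3.6 or later, which is newer than the minimum listed above). The file names checked under each directory (java.exe, scala.bat, sbt.bat, winutils.exe, spark-shell.cmd) are my assumptions about a typical layout, and check_environment is just a helper name I made up for illustration, so adjust both to match your installs.

```python
import os
from pathlib import Path

# Environment variables named in the steps above, each mapped to a file we
# expect to find under that directory. The relative paths are assumptions
# about a typical layout; change them if your installation differs.
EXPECTED = {
    "JAVA_HOME": Path("bin") / "java.exe",
    "SCALA_HOME": Path("bin") / "scala.bat",
    "SBT_HOME": Path("bin") / "sbt.bat",
    "HADOOP_HOME": Path("bin") / "winutils.exe",
    "SPARK_HOME": Path("bin") / "spark-shell.cmd",
}


def check_environment(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name, relative_file in EXPECTED.items():
        home = env.get(name)
        if not home:
            problems.append(f"{name} is not set")
            continue
        target = Path(home) / relative_file
        if not target.is_file():
            problems.append(f"{name} is set, but {target} was not found")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("All checked variables point at the expected files.")
```

Save it under any name you like and run it from a freshly opened command prompt. It only checks that the variables exist and that the expected files are present; it does not verify that %SCALA_HOME%\bin and %SPARK_HOME%\bin are on PATH, and it does not start Spark.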

