Can Apache Spark run without Hadoop?

Question

3 Answers

Shivangi · Answer 1 · 2019-05-24T06:47:36+0000

Spark does not require a Hadoop cluster, so you can use Spark without Hadoop, but you will not be able to use the features that depend on Hadoop. Spark can run over essentially any distributed file system; it doesn't have to be HDFS.

Spark doesn't have its own storage system, so it depends on external storage such as Cassandra, HDFS, or S3.

Although running Spark with Hadoop is often the better choice, you can run Spark without Hadoop in standalone mode. Refer to the Spark documentation for more details.
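As a rough sketch of what "without Hadoop" can look like in practice, the snippet below starts Spark in local mode and reads a file from the local file system instead of HDFS. It assumes PySpark and a Java runtime are installed; the function names, the application name, and any path you pass in are illustrative, not part of Spark.

```python
def local_file_uri(path: str) -> str:
    """Build a file:// URI so Spark reads from local disk rather than HDFS.

    Simplification: expects an absolute POSIX path and does not
    percent-encode special characters.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    return "file://" + path


def count_lines(path: str) -> int:
    """Count the lines of a local text file with Spark in local mode."""
    # Imported inside the function so the helper above stays usable
    # on machines where PySpark is not installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")            # run in-process on all local cores; no YARN
        .appName("no-hadoop-sketch")   # hypothetical application name
        .getOrCreate()
    )
    try:
        # spark.read.text returns a DataFrame with one row per line.
        return spark.read.text(local_file_uri(path)).count()
    finally:
        spark.stop()
```

Calling `count_lines("/tmp/example.txt")` (a hypothetical path) would run entirely on one machine. Note that the PySpark distribution still bundles Hadoop client libraries, but no Hadoop cluster or HDFS is involved.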

Aarav · Answer 2 · 2019-07-09T07:50:30+0000

Apache Spark is an open-source distributed cluster-computing framework, and it can definitely run without Hadoop.

Hadoop is a framework for distributed storage (HDFS) and distributed processing (YARN).

Spark uses Hadoop only for storage and processing, and both can be replaced by the other storage systems and cluster managers that Spark supports.

Distributed Storage:

Since Spark does not have its own distributed storage system, it has to depend on one of these storage systems for distributed computing.

S3 – Non-urgent batch jobs. S3 fits very specific use cases when data locality isn’t critical.

Cassandra – Well suited to streaming data analysis, but overkill for batch jobs.

HDFS – Great fit for batch jobs without compromising on data locality.

Distributed processing:

You can run Spark on any of the following three cluster managers:

Spark Standalone
Hadoop YARN
Apache Mesos
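The cluster manager is chosen through the master URL passed to Spark (for example via `spark-submit --master`). The helper below is a small sketch of how those URLs are formed for the three managers listed above, plus local mode. The function name and the host names in the usage note are hypothetical; 7077 and 5050 are the default ports for a standalone master and a Mesos master respectively.

```python
def master_url(manager, host=None, port=None):
    """Return the master URL string Spark expects for a cluster manager.

    manager: one of "local", "standalone", "yarn", "mesos".
    """
    if manager == "local":
        # Not a cluster manager: runs Spark in a single JVM on all cores.
        return "local[*]"
    if manager == "yarn":
        # YARN takes no host here; the cluster location comes from the
        # Hadoop configuration available on the client.
        return "yarn"

    # URL scheme and default port for managers addressed by host and port.
    defaults = {"standalone": ("spark", 7077), "mesos": ("mesos", 5050)}
    if manager not in defaults:
        raise ValueError(f"unknown cluster manager: {manager!r}")
    if host is None:
        raise ValueError(f"{manager} requires a master host")

    scheme, default_port = defaults[manager]
    return f"{scheme}://{host}:{port if port is not None else default_port}"
```

For example, `master_url("standalone", "master-host")` gives `spark://master-host:7077`, which needs no Hadoop installation at all, while `master_url("yarn")` gives `yarn` and relies on an existing Hadoop cluster.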

Amit Rawat · Answer 3 · 2019-09-18T09:31:10+0000

Spark can work without Hadoop, but some of its functionality depends on Hadoop's code (e.g. handling of Parquet files). We run Spark on Mesos and S3, which was a little complicated to set up but works well once done.
