in Big Data Hadoop & Spark by (108k points)

What are the key abstractions of Apache Spark?

1 Answer

by (106k points)

Spark has a well-defined layered architecture built on two main abstractions:

· RDD (Resilient Distributed Dataset): An RDD is an immutable (read-only) collection of elements that can be operated on in parallel across many machines. Each RDD is divided into logical partitions, and those partitions can be computed on different nodes of the cluster.

· DAG (Directed Acyclic Graph): The DAG is the scheduling layer of the Apache Spark architecture, which implements stage-oriented scheduling. Spark can build a DAG with many stages for a single job, whereas MapReduce always produces a fixed two-stage plan: Map, then Reduce.
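
To make the two ideas concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: the class `ToyRDD` and its methods are hypothetical names made up for illustration. It only mimics three properties described above: the dataset is read-only, it is split into partitions, and each transformation records its parent so that the chain of operations forms a (here linear) DAG that runs only when a result is requested.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)  # frozen: instances are read-only, like an RDD
class ToyRDD:
    """Hypothetical toy stand-in for an RDD (not part of Spark)."""
    partitions: Tuple[Tuple[int, ...], ...] = ()  # source data, set only on the root
    parent: Optional["ToyRDD"] = None             # lineage edge to the previous node
    op: str = "parallelize"                       # name of the operation at this node
    fn: Optional[Callable] = None                 # function applied by map/filter

    @staticmethod
    def parallelize(data, num_partitions):
        # Split the input round-robin into logical partitions.
        data = list(data)
        parts = tuple(tuple(data[i::num_partitions]) for i in range(num_partitions))
        return ToyRDD(partitions=parts)

    def map(self, fn):
        # Transformations are lazy: they only add a node to the graph.
        return ToyRDD(parent=self, op="map", fn=fn)

    def filter(self, fn):
        return ToyRDD(parent=self, op="filter", fn=fn)

    def lineage(self):
        # Walk back to the root to list the operations in execution order.
        node, ops = self, []
        while node is not None:
            ops.append(node.op)
            node = node.parent
        return list(reversed(ops))

    def _compute(self):
        # Evaluate partition by partition; in a real cluster each
        # partition could be processed on a different node.
        if self.parent is None:
            return [list(p) for p in self.partitions]
        parent_parts = self.parent._compute()
        if self.op == "map":
            return [[self.fn(x) for x in part] for part in parent_parts]
        return [[x for x in part if self.fn(x)] for part in parent_parts]

    def collect(self):
        # An action: triggers execution of the whole recorded graph.
        return [x for part in self._compute() for x in part]


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=3)
evens_squared = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

print(evens_squared.lineage())          # operations recorded before anything runs
print(sorted(evens_squared.collect()))  # execution happens here
```

In this sketch, `map` and `filter` never change `numbers`; each call returns a new object that points back to its parent. Real Spark additionally groups such operations into stages and schedules them across the cluster, which this toy does not attempt to model.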

