Spark has a well-defined, layered architecture built on two main abstractions:
· RDD: Resilient Distributed Datasets are immutable (read-only) collections of elements that can be processed in parallel across many machines. An RDD is divided into logical partitions, and each partition can be computed on a different node of the cluster.
· DAG: The directed acyclic graph is the scheduling layer of the Apache Spark architecture; it implements stage-oriented scheduling. Whereas MapReduce is limited to two fixed stages (Map and Reduce), Spark can build a DAG with any number of stages. A short sketch after this list shows both abstractions in code.
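To make the two abstractions concrete, here is a minimal Scala sketch (the object name `RddDagSketch` and the `local[*]` master are illustrative choices, not from the original): it builds a partitioned RDD, chains transformations, and prints the lineage that the DAG scheduler splits into stages at the shuffle.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddDagSketch {
  def main(args: Array[String]): Unit = {
    // Local master for illustration only; a real job would target a cluster.
    val conf = new SparkConf().setAppName("RddDagSketch").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // RDD: an immutable collection split into logical partitions,
    // each of which can be computed on a different cluster node.
    val words = sc.parallelize(Seq("spark", "dag", "rdd", "spark"), numSlices = 4)
    println(s"partitions = ${words.getNumPartitions}")

    // Transformations only extend the lineage; nothing executes yet.
    // reduceByKey requires a shuffle, so the scheduler splits the DAG
    // into two stages at that point (vs. MapReduce's fixed map/reduce).
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)

    // toDebugString prints the lineage; indentation marks stage boundaries.
    println(counts.toDebugString)

    // An action triggers execution of the whole DAG in parallel.
    println(counts.collect().mkString(", "))

    sc.stop()
  }
}
```

Running this in `spark-shell` or as a compiled application, the `toDebugString` output shows the `ShuffledRDD` on its own indentation level, which is exactly where the DAG scheduler cuts the job into stages.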