Apache Spark is an open-source framework for cluster computing. It provides data parallelism and fault tolerance implicitly, so we do not have to program them ourselves. In other words, Spark offers a simple interface for processing huge amounts of data in parallel across large clusters, while recovering from node failures with minimal data loss. The framework was originally developed at the University of California, Berkeley, and is now maintained by the Apache Software Foundation.
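To make those two properties concrete, here is a minimal sketch in plain Python. It is not Spark's API: the function names are hypothetical, and threads on a single machine stand in for cluster workers. The data is split into partitions that are processed in parallel, and if a task fails, only its partition is recomputed from the source data. This loosely mirrors how Spark rebuilds a lost partition rather than restarting the whole job.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a list into roughly equal, contiguous chunks (hypothetical helper)."""
    size = max(1, -(-len(data) // num_partitions))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(partition):
    """The work applied independently to each partition."""
    return sum(x * x for x in partition)


def parallel_sum_of_squares(data, num_partitions=4, fail_once_on=None):
    """Process partitions in parallel and recompute any partition whose task fails.

    fail_once_on is an illustrative knob: the task for that partition index
    raises an error the first time it runs, simulating a lost worker.
    """
    partitions = split_into_partitions(data, num_partitions)
    already_failed = set()

    def task(index):
        if index == fail_once_on and index not in already_failed:
            already_failed.add(index)
            raise RuntimeError(f"simulated loss of partition {index}")
        return sum_of_squares(partitions[index])

    results = []
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        futures = [pool.submit(task, i) for i in range(len(partitions))]
        for i, future in enumerate(futures):
            try:
                results.append(future.result())
            except RuntimeError:
                # Recompute only the lost partition; the others are untouched.
                results.append(task(i))
    return sum(results)


if __name__ == "__main__":
    # Sum of squares of 1..10, with partition 1 failing once along the way.
    print(parallel_sum_of_squares(list(range(1, 11)), fail_once_on=1))
```

The result is the same whether or not a partition fails, which is the point: the caller writes only the per-partition logic, while splitting, parallel execution, and recovery are handled around it. In Spark itself, the framework takes care of these steps across the machines of a cluster.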
For a detailed explanation, you can go through the Apache Spark Tutorial by Intellipaat.