An RDD (Resilient Distributed Dataset) is the basic data structure in Spark. It is an immutable, distributed collection of records: once created, an RDD cannot be changed. Each RDD is divided into partitions that are spread across the nodes of the cluster, so operations on it can be executed in parallel.
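To make the idea of partitions and parallel execution concrete, here is a toy model in plain Python. It is not Spark code and not how Spark is implemented; `split_into_partitions` and `map_partitions` are hypothetical helper names invented for this sketch, and threads stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a list into contiguous chunks of near-equal size (toy partitioner)."""
    size, extra = divmod(len(data), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions each take one additional element.
        end = start + size + (1 if i < extra else 0)
        # Tuples are immutable, mirroring the fact that an RDD is never modified.
        partitions.append(tuple(data[start:end]))
        start = end
    return partitions


def map_partitions(partitions, func):
    """Apply func to every element, one worker task per partition.

    Returns NEW partitions; the input partitions are left untouched.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda part: tuple(func(x) for x in part), partitions))


numbers = list(range(1, 11))
partitions = split_into_partitions(numbers, 3)
squared = map_partitions(partitions, lambda x: x * x)

print(partitions)  # [(1, 2, 3, 4), (5, 6, 7), (8, 9, 10)]
print(squared)     # [(1, 4, 9, 16), (25, 36, 49), (64, 81, 100)]
```

Each partition is processed independently, which is what lets Spark schedule the work on different nodes at the same time.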
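An RDD can be created either by calling the parallelize() method on an existing collection in the driver program, or by referencing a dataset in an external storage system (for example HDFS, a shared file system, or HBase). We can also create an RDD from an existing RDD by applying a transformation to it. Because RDDs are immutable, the transformation returns a new RDD and leaves the original unchanged.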
If you are interested in learning Spark, I would suggest the Spark Certification by Intellipaat.