What is the difference between an RDD partition and a slice?

The Spark Programming Guide mentions slices as a feature of RDDs (both parallel collections or Hadoop datasets.) ("Spark will run one task for each slice of the cluster.") But under the section on RDD persistence, the concept of partitions is used without introduction. Also, the RDD docs only mention partitions with no mention of slices, while the SparkContext docs mentions slices for creating RDDs, but partitions for running jobs on RDDs. Are these two concepts the same? If not, how do they differ?

1 Answer

What is the difference between an RDD partition and a slice?

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources