Cartesian product and combinations are two different things: the Cartesian product will create an RDD of size rdd.count() ^ 2, while combinations (defined as "combs" in the code below) will create an RDD of size rdd.count() choose 2.
val rdd = spark.sparkContext.parallelize(1 to 5)
// Keep only one of each unordered pair: the (a, b) with a < b.
val combs = rdd.cartesian(rdd).filter { case (a, b) => a < b }
combs.collect()
Note this will only work if an ordering is defined on the elements, since we use <; it also drops duplicate values, because equal elements never satisfy a < b. As written it chooses pairs, but it can be extended to larger combinations by chaining cartesian calls and keeping only tuples whose elements are strictly increasing.
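When the elements have no natural ordering (or contain duplicates you want to keep), one workaround is to compare indices instead of values. This is a minimal sketch using plain Scala collections rather than an RDD; the object and method names are mine, and on Spark the same idea would use rdd.zipWithIndex before the cartesian/filter step.

```scala
// Sketch: choose unordered pairs by index rather than by value,
// so no Ordering on A is required and duplicate values survive.
object CombinationsSketch {
  def pairs[A](xs: Seq[A]): Seq[(A, A)] = {
    val indexed = xs.zipWithIndex            // attach a unique index to each element
    for {
      (a, i) <- indexed
      (b, j) <- indexed
      if i < j                               // compare indices, not values
    } yield (a, b)
  }

  def main(args: Array[String]): Unit = {
    // "x" appears twice; both copies take part in pairs.
    println(pairs(Seq("x", "y", "x")))
  }
}
```

Comparing indices gives exactly n choose 2 pairs for an n-element sequence, which matches the RDD version's count when all values are distinct.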