in Big Data Hadoop & Spark by (11.4k points)

I have a Spark pair RDD (key, count) as below:

Array[(String, Int)] = Array((a,1), (b,2), (c,1), (d,3))

Using the Spark Scala API, how do I get a new pair RDD that is sorted by value?

Required result: Array((d,3), (b,2), (a,1), (c,1))

1 Answer


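This should work. Note that sortBy sorts in ascending order by default, so the descending result you want needs ascending = false:

// Assumes the pair's second type has an Ordering, which is the case for Int

rdd.sortBy(_._2, ascending = false) // same as rdd.sortBy(pair => pair._2, ascending = false)

(You may also want to take the key into account when there are ties. Sorting on the value alone does not by itself fix the relative order of (a,1) and (c,1).)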
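
To make the tie handling concrete, here is a minimal sketch that sorts by count descending and then by key ascending, using a composite sort key. The object name SortByValue, the helpers sortKey and sortByCountDesc, and the local[*] master are assumptions for illustration only and are not part of the question. Negating the count assumes the counts never equal Int.MinValue, which holds for ordinary non-negative counts.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SortByValue {
  // Composite sort key (hypothetical helper): the negated count gives
  // descending order by value; the key string breaks ties in ascending order.
  def sortKey(pair: (String, Int)): (Int, String) = (-pair._2, pair._1)

  // Sorts a (key, count) RDD by count descending, then key ascending.
  def sortByCountDesc(rdd: RDD[(String, Int)]): RDD[(String, Int)] =
    rdd.sortBy(pair => sortKey(pair))

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("SortByValue").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 1), ("d", 3)))
      val sorted = sortByCountDesc(rdd)
      println(sorted.collect().mkString("Array(", ", ", ")"))
    } finally {
      sc.stop()
    }
  }
}
```

With the sample data from the question, this ordering puts (d,3) first, then (b,2), then (a,1) before (c,1), which matches the required result. If tie order does not matter, rdd.sortBy(_._2, ascending = false) alone is simpler.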
