
I have a Spark pair RDD of (key, count) as below:

Array[(String, Int)] = Array((a,1), (b,2), (c,1), (d,3))


Using the Spark Scala API, how do I get a new pair RDD that is sorted by value (largest count first)?

Required result: Array((d,3), (b,2), (a,1), (c,1))

1 Answer


This should work:

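// Let’s assume here that the pair's second type has an Ordering, which is the case for Int.
// sortBy is ascending by default, so pass ascending = false to get the largest counts first.

rdd.sortBy(_._2, ascending = false) // same as rdd.sortBy(pair => pair._2, ascending = false)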

(Though you might want to take the key into account too when there are ties, since the order of pairs with equal counts is otherwise not guaranteed.)
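
As a minimal sketch of the tie-breaking idea, assuming a local SparkContext and the sample data from the question: sorting on the tuple `(-count, key)` orders counts from largest to smallest and, within equal counts, keys in ascending order. The object name `SortByValueExample` and the helper `sortByValueDesc` are made up for illustration, and negating the count assumes no value equals `Int.MinValue` (negation would overflow).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SortByValueExample {

  // Hypothetical helper: counts descending, ties broken by key ascending.
  // Relies on the implicit Ordering for (Int, String) tuples.
  def sortByValueDesc(rdd: RDD[(String, Int)]): RDD[(String, Int)] =
    rdd.sortBy { case (key, count) => (-count, key) }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for this sketch; use your cluster settings instead.
    val conf = new SparkConf().setAppName("sort-by-value").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 1), ("d", 3)))
      val sorted = sortByValueDesc(rdd)
      // prints: (d,3), (b,2), (a,1), (c,1)
      println(sorted.collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

If only the value order matters, `rdd.sortBy(_._2, ascending = false)` is enough; the tuple key is only needed when the order among equal counts must be deterministic.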
