+5 votes
in Big Data Hadoop & Spark by (11.4k points)

I'm using PySpark (Python 2.7.9 / Spark 1.3.1) and have a grouped DataFrame, GroupObject, which I need to filter and then sort in descending order. I'm trying to achieve this with the following piece of code.

group_by_dataframe.count().filter("`count` >= 10").sort('count', ascending=False)

But it throws the following error.

sort() got an unexpected keyword argument 'ascending'

3 Answers

+5 votes
by (32.3k points)

In PySpark 1.3, the sort method does not accept an ascending keyword argument. You can use the desc method on a column instead:

from pyspark.sql.functions import col

group_by_dataframe.count() \
    .filter("`count` >= 10") \
    .sort(col("count").desc())

or the desc function:

from pyspark.sql.functions import desc

group_by_dataframe.count() \
    .filter("`count` >= 10") \
    .sort(desc("count"))


Both of the above methods work in Spark 1.3 and in all later versions, including Spark 2.x.

by (19.7k points)
It worked for me!
by (47.2k points)
Thanks a lot buddy worked like a charm!!
by (106k points)
Understood properly, nicely explained.
by (44.3k points)
This worked for me too
by (32.1k points)
Yes worked for me as well!
+3 votes
by (29.3k points)

You can use orderBy:

group_by_dataframe.count().filter("`count` >= 10").orderBy('count', ascending=False)


by (29.8k points)
Hi, thanks for the answer, solved my issue!!!
by (33.1k points)
Thanks, using the orderBy() method is simple and worked well.
by (19.9k points)
This worked for me.
+1 vote
by (108k points)

As @chandra answered, you can combine groupBy with a descending sort, for example:

from pyspark.sql.functions import desc

dataFrameWay = df.groupBy("firstName").count() \
    .withColumnRenamed("count", "distinct_name") \
    .sort(desc("distinct_name"))
