0 votes
in Big Data Hadoop & Spark by (11.4k points)

Is there any difference in semantics between df.na.drop() and

df.filter(df.col("onlyColumnInOneColumnDataFrame").isNotNull() && !df.col("onlyColumnInOneColumnDataFrame").isNaN())

where df is an Apache Spark DataFrame?

1 Answer

0 votes
by (32.3k points)

With df.na.drop() you drop the rows containing any null or NaN values, in any column.

And with df.filter(df.col("onlyColumnInOneColumnDataFrame").isNotNull()) you drop only those rows that have a null in the column onlyColumnInOneColumnDataFrame.


In order to achieve the same thing with df.na.drop(), restrict it to that column: df.na.drop(["onlyColumnInOneColumnDataFrame"])
