in Big Data Hadoop & Spark by (11.4k points)

I am from a Java background and new to Scala.

I am using Scala with Spark, but I'm not able to understand when to use == and when to use ===.

Could anyone let me know in which scenarios I need to use these two operators, and what the difference is between == and ===?

1 Answer

by (32.3k points)

The "==" is using the equals methods which checks if the two references point to the same object. The definition of "===" depends on the context/object. For Spark , "===" is using the equalTo method

Let's pick a fragment of Spark-Scala code:

 dataFrame.filter($"age" === 21)

There are a few things going on here:

The $"age" creates a Spark Column object referencing the column named age within in a dataframe. The $ operator is defined in an implicit class StringToColumn. Implicit classes are a similar concept to C# extension methods or mixins in other dynamic languages. The $ operator is like a method added on to the StringContext class.

The triple-equals operator === is normally used in Scala libraries as a type-safe equals operator, somewhat analogous to the one in JavaScript. Spark overrides it with a method on Column that creates a new Column object comparing the Column on the left with the object on the right (a comparison that evaluates to a boolean for each row). The double-equals operator (==) is final and cannot be overridden, therefore Spark must use the triple equals.

The dataFrame.filter method takes a Column argument, which defines the comparison to apply to the rows in the DataFrame. Only rows that match the condition will be included in the resulting DataFrame.
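Putting the pieces together, here is a minimal end-to-end sketch; the column names and sample data are invented for illustration:

 import org.apache.spark.sql.SparkSession

 object FilterDemo extends App {
   val spark = SparkSession.builder()
     .appName("FilterDemo")
     .master("local[*]")
     .getOrCreate()

   import spark.implicits._  // brings $"..." and toDF into scope

   val people = Seq(("Alice", 21), ("Bob", 35), ("Cara", 21)).toDF("name", "age")

   // $"age" === 21 builds a Column expression; filter keeps only matching rows.
   people.filter($"age" === 21).show()
   // +-----+---+
   // | name|age|
   // +-----+---+
   // |Alice| 21|
   // | Cara| 21|
   // +-----+---+

   spark.stop()
 }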

Coming back to Spark: an important difference between "==" and "===" is the return value. For Column:

== returns a Boolean

=== returns a Column (which contains the result of the element-wise comparison of the two columns)
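For example, using org.apache.spark.sql.functions.col (no SparkSession is needed just to build these expressions):

 import org.apache.spark.sql.Column
 import org.apache.spark.sql.functions.col

 object ReturnTypes {
   // == cannot be overridden, so it compares the Column objects themselves
   // and yields a plain Boolean.
   val sameColumnObject: Boolean = col("age") == col("age")

   // === is overridden on Column, so it builds a new Column that holds the
   // row-by-row comparison expression (age = 21) for Spark to evaluate.
   val condition: Column = col("age") === 21
 }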

...