in Big Data Hadoop & Spark by (11.5k points)

I am trying to improve the accuracy of a logistic regression algorithm implemented in Spark using Java. To do this, I want to replace the null or invalid values in a column with the most frequent value of that column. For example:

Name|Place
a   |a1
a   |a2
a   |a2
    |d1
b   |a2
c   |a2
c   |
    |
d   |c1


In this case I would replace all the NULL values in column "Name" with 'a' and in column "Place" with 'a2'. So far I am only able to extract the most frequent value of a particular column. Can you please help me with the second step: how to replace the null or invalid values with the most frequent value of that column?
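For context, the two steps described here can be sketched outside Spark with plain Java collections — a minimal illustration of the logic only (the class and method names below are made up for this example, not Spark API), before translating it to DataFrame operations:

```java
import java.util.*;
import java.util.stream.*;

public class ModeFill {

    // Step 1: find the most frequent non-null value of a "column" (here a List).
    static String mostFrequent(List<String> column) {
        return column.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()))
                .entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    // Step 2: replace every null with that most frequent value.
    static List<String> fillNulls(List<String> column) {
        String mode = mostFrequent(column);
        return column.stream()
                .map(v -> v == null ? mode : v)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The "Name" column from the example above.
        List<String> name = Arrays.asList("a", "a", "a", null, "b", "c", "c", null, "d");
        System.out.println(mostFrequent(name)); // a
        System.out.println(fillNulls(name));    // [a, a, a, a, b, c, c, a, d]
    }
}
```

In Spark, step 1 maps to a `groupBy`/`count`/`orderBy` over the column, and step 2 is what the answer below addresses with the `na` fill functions.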

1 Answer

by (25.6k points)

For Java:

I think you need the fill(String value, String[] cols) method of your DataFrame's na() functions (org.apache.spark.sql.DataFrameNaFunctions), which replaces null values in the given list of columns with the value you specify.

So if you already know the value you want to replace the nulls with:

String[] colNames = {"Name"};

dataframe = dataframe.na().fill("a", colNames);

You can do the same for the rest of your columns.
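If you would rather fill several columns in one call, the na functions also accept a map from column name to replacement value (a sketch, assuming `dataframe` is an `org.apache.spark.sql.Dataset<Row>`):

```java
import java.util.HashMap;
import java.util.Map;

// Per-column replacement values, applied in a single fill() call.
Map<String, Object> fillValues = new HashMap<>();
fillValues.put("Name", "a");
fillValues.put("Place", "a2");
dataframe = dataframe.na().fill(fillValues);
```

This is convenient when the replacement value for each column comes from your earlier "most frequent value" computation: build the map from those results instead of hard-coding them.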

And if you want to solve this kind of problem in Scala:

You can use the .na.fill function (see org.apache.spark.sql.DataFrameNaFunctions for reference).

The function that you need here is:

 def fill(value: String, cols: Seq[String]): DataFrame

Now you can freely choose the columns, as well as the value to use in place of null or NaN entries.

For your case, do something like this:

val df2 = df.na.fill("a", Seq("Name"))
            .na.fill("a2", Seq("Place"))

...