I am trying to map over a DataFrame to replace a value in one of its columns, as shown below:

dataframe.map(row => {
  val row1 = row.getAs[String](1)
  val make = if (row1.toLowerCase == "tesla") "S" else row1
  Row(row(0),make,row(2))
})


I tried to fix it based on examples I found elsewhere, but I am getting this encoder error:

Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.

1 Answer


You can simply use the DataFrame API:

import spark.implicits._
import org.apache.spark.sql.functions.{when, lower}

val df = Seq(
  (2012, "Tesla", "S"), (1997, "Ford", "E350"),
  (2015, "Chevy", "Volt")
).toDF("year", "make", "model")

df.withColumn("make", when(lower($"make") === "tesla", "S").otherwise($"make"))
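
For reference, here is one way to check the transformed column (a minimal sketch, reusing the spark session and the df defined above; the val name fixed is just for illustration):

val fixed = df.withColumn(
  "make",
  when(lower($"make") === "tesla", "S").otherwise($"make")
)

// Only the Tesla row is rewritten; the other makes pass through unchanged.
fixed.select("make").as[String].collect()
// Array("S", "Ford", "Chevy")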

If you want to use map, you should switch to a statically typed Dataset:

import spark.implicits._

case class Record(year: Int, make: String, model: String)

df.as[Record].map {
  case tesla if tesla.make.toLowerCase == "tesla" => tesla.copy(make = "S")
  case rec => rec
}
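
If you want to inspect the typed result, something like this should work (a sketch built on the df and Record above; the val name updated is just for illustration):

val updated = df.as[Record].map {
  case tesla if tesla.make.toLowerCase == "tesla" => tesla.copy(make = "S")
  case rec => rec
}

// map on a typed Dataset gives back a Dataset[Record], so you can
// inspect it directly or drop back to a DataFrame when needed.
updated.collect().foreach(println)   // Record(2012,S,S), Record(1997,Ford,E350), ...
updated.toDF().show()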

To map over a Dataset[Row] (i.e. a DataFrame), you need to provide the required encoder explicitly.

Here is how I did it:

import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

// Yup, it would be possible to reuse df.schema here
val schema = StructType(Seq(
  StructField("year", IntegerType),
  StructField("make", StringType),
  StructField("model", StringType)
))

val encoder = RowEncoder(schema)

df.map {
  case Row(year, make: String, model) if make.toLowerCase == "tesla" =>
    Row(year, "S", model)
  case row => row
}(encoder)
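
The same encoder also works if you keep the getAs-style access from your question (a minimal sketch, reusing the df and encoder defined above; fixedMake is just an illustrative name):

df.map { row =>
  // Look columns up by name instead of position to avoid ordering mistakes.
  val make = row.getAs[String]("make")
  val fixedMake = if (make.toLowerCase == "tesla") "S" else make
  Row(row.getAs[Int]("year"), fixedMake, row.getAs[String]("model"))
}(encoder)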
