
I am trying to import spark.implicits._. Apparently, this is an object inside a class in Scala. When I import it in a method like so:

def f() = {
  val spark = SparkSession()....
  import spark.implicits._
}


it works fine. However, I am writing a test class and I want to make this import available for all tests. Here is what I have tried:

class SomeSpec extends FlatSpec with BeforeAndAfter {
  var spark:SparkSession = _

  //This won't compile
  import spark.implicits._

  before {
    spark = SparkSession()....
    //This won't either
    import spark.implicits._
  }

  "a test" should "run" in {
    //Even this won't compile (although it already looks bad here)
    import spark.implicits._

    //This was the only way i could make it work
    val spark = this.spark
    import spark.implicits._
  }
}


Not only does this look bad, I don't want to do it for every test. What is the "correct" way of doing it?

1 Answer


Create a SparkSession object and import spark.implicits._ just before you want to convert an RDD to a Dataset, then proceed.

Like this:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .appName("SparkSQL")
  .master("local[*]")
  .getOrCreate()

// The implicits are members of this specific SparkSession instance,
// so the import has to come after the session is created.
import spark.implicits._

val someDataset = someRdd.toDS
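
For example, once the implicits are in scope, an RDD of a case class converts directly. A minimal sketch (Person and the sample rows are just illustrative stand-ins for someRdd):

case class Person(name: String, age: Int)

// spark.implicits._ provides the Encoder[Person] that toDS needs
val someRdd = spark.sparkContext.parallelize(Seq(Person("Ann", 30), Person("Bob", 25)))
val someDataset = someRdd.toDS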

Another approach is to use something like SharedSQLContext (from Spark's own test sources) directly, which provides a testImplicits: SQLImplicits, i.e.:

class SomeSpec extends FlatSpec with SharedSQLContext {
  import testImplicits._

  // ...
}
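
As for why the class-level import in the question won't compile: Scala only allows imports from a stable identifier, i.e. a val, not a var. If you'd rather not depend on Spark's test jars, a common workaround is to hold the session in a lazy val inside a reusable trait. A minimal sketch, assuming ScalaTest's FlatSpec as in the question (the trait name SparkSessionTestWrapper is just an illustration):

import org.apache.spark.sql.SparkSession
import org.scalatest.FlatSpec

trait SparkSessionTestWrapper {
  // A lazy val is a stable identifier, so spark.implicits._ can be
  // imported once at class level; the session is built on first use.
  lazy val spark: SparkSession = SparkSession
    .builder
    .appName("test")
    .master("local[*]")
    .getOrCreate()
}

class SomeSpec extends FlatSpec with SparkSessionTestWrapper {
  import spark.implicits._

  "a test" should "run" in {
    val ds = Seq(1, 2, 3).toDS()
    assert(ds.count() == 3)
  }
}

This also removes the need for the var and the before block in the original test: the lazy val means the session isn't created until the first test touches it.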
