
I am working on a data science project and trying to convert a sequence of data into a DataFrame. The conversion should be a single line of code, but it fails with an error. I tried to fix it by defining the case classes outside the main function, but I still get the same error. Can anyone help me?

package sparkWCExample.spWCExample

import org.apache.log4j.Level
import org.apache.spark.sql.{Dataset, SparkSession, DataFrame, Row, Encoders}
import org.apache.spark.sql.functions._
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.Row
import org.apache.spark.sql.Dataset

// Create the case classes for our domain
case class Department(id: String, name: String)
case class Person(name: String, age: Long)

object DatasetExample {

  def main(args: Array[String]) {
    println("Start now")

    val conf = new SparkConf().setAppName("Spark Scala WordCount Example").setMaster("local[1]")
    val spark = SparkSession.builder().config(conf).appName("CsvExample").master("local").getOrCreate()
    val sqlContext = new org.apache.spark.sql.SQLContext(spark.sparkContext)

    import sqlContext.implicits._
    import spark.implicits._

    //val df = spark.read.options(Map("inferSchema"->"true","delimiter"->",","header"->"true")).csv("C:\\Sankha\\Study\\data\\salary.csv")

    // Create the Departments
    val department1 = new Department("123456", "Computer Science")
    val department2 = new Department("789012", "Mechanical Engineering")
    val department3 = new Department("345678", "Theater and Drama")
    val department4 = new Department("901234", "Indoor Recreation")

    val caseClassDS = Seq(Person("Andy", 32)).toDS()
    val df = Seq(department1, department2, department3, department4).toDF
  }
}

The code above was written for Scala 2.12, and I am getting the errors shown below:

toDF is not a member of Seq[sparkWCExample.spWCExample.Department]
toDS is not a member of Seq[sparkWCExample.spWCExample.Person]

1 Answer


I have edited your code. You imported many unused packages, so I removed them. I also cleaned up the Spark setup: you were creating a deprecated SQLContext and importing its implicits in addition to spark.implicits._, which most likely made the toDF and toDS conversions ambiguous and therefore unavailable. Importing the implicits once, from the SparkSession, should be enough. After the changes, the code looks like this:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Create the case classes for our domain (defined at the top level, outside main)
case class Department(id: String, name: String)
case class Person(name: String, age: Long)

object DatasetExample {

  def main(args: Array[String]): Unit = {
    println("Start now")

    val conf = new SparkConf().setAppName("Spark Scala WordCount Example").setMaster("local[1]")
    val spark = SparkSession.builder().config(conf).appName("CsvExample").master("local").getOrCreate()

    // Take the SparkContext from the session instead of building a separate SQLContext
    val sc: SparkContext = spark.sparkContext

    // Import the implicits once, from the SparkSession only; this provides toDS and toDF
    import spark.implicits._

    //val df = spark.read.options(Map("inferSchema"->"true","delimiter"->",","header"->"true")).csv("C:\\Sankha\\Study\\data\\salary.csv")

    // Create the Departments
    val department1 = Department("123456", "Computer Science")
    val department2 = Department("789012", "Mechanical Engineering")
    val department3 = Department("345678", "Theater and Drama")
    val department4 = Department("901234", "Indoor Recreation")

    // Typed Dataset built from a Seq of case class instances
    val caseClassDS: Dataset[Person] = Seq(Person("Andy", 32)).toDS()

    // DataFrame whose columns are taken from the case class fields
    val df: DataFrame = Seq(department1, department2, department3, department4).toDF
  }
}
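
A likely reason the original code did not compile is that it imported both sqlContext.implicits._ and spark.implicits._ in the same scope. In Scala 2, when two wildcard imports bring in implicits with the same name, neither one can be used, so the compiler reports that the extension method is not a member of the type. The sketch below reproduces that effect in plain Scala, without Spark. ImplicitsA, ImplicitsB, SeqOps and describe are hypothetical names made up for this illustration; they are not Spark APIs.

```scala
// Hypothetical stand-ins for sqlContext.implicits and spark.implicits:
// both objects define an implicit class with the same name, SeqOps.
object ImplicitsA {
  implicit class SeqOps[T](val xs: Seq[T]) {
    def describe: String = s"A sees ${xs.size} element(s)"
  }
}

object ImplicitsB {
  implicit class SeqOps[T](val xs: Seq[T]) {
    def describe: String = s"B sees ${xs.size} element(s)"
  }
}

object AmbiguousImportDemo {

  // With a single import, the extension method resolves normally.
  def withOneImport(): String = {
    import ImplicitsA._
    Seq(1, 2, 3).describe
  }

  // With both imports, the name SeqOps is ambiguous, so the implicit is
  // not eligible. On Scala 2 this block is expected to fail to compile
  // with an error along the lines of
  // "value describe is not a member of Seq[Int]".
  //
  // def withBothImports(): String = {
  //   import ImplicitsA._
  //   import ImplicitsB._
  //   Seq(1, 2, 3).describe
  // }

  def main(args: Array[String]): Unit =
    println(withOneImport())
}
```

If this is indeed what happened in the question, keeping only import spark.implicits._, as in the edited code above, is the corresponding fix.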

