How can I load an existing CSV file and convert it to a DataFrame in Spark?

I want the exact command to load a CSV file as a DataFrame.

I have tried:

scala> val df = sqlContext.load("hdfs:///csv/file/dir/file.csv")

But I got an error:

java.lang.RuntimeException: hdfs:///csv/file/dir/file.csv is not a Parquet file.

1 Answer

A DataFrame is a distributed collection of data organized into named columns. Conceptually, it is equivalent to a table in a relational database.

CSV is one of the data sources Spark supports out of the box; the reader is built in from Spark 2.0 onward, while earlier versions need the external spark-csv package.

A DataFrame can be created from a variety of input sources, including CSV text files, JSON files, etc.

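To load a CSV file as a DataFrame, run this command in your Spark shell (the start of the command is assumed here to use spark, the SparkSession that the shell predefines in Spark 2.0 and later):

val df = spark.read.format("csv").option("header","true").load("/home/amit/uo.csv")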
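As a fuller illustration of loading a CSV file into a DataFrame, here is a minimal sketch that assumes Spark 2.0 or later. The object name CsvLoading, the helper loadCsv, and the sample data are hypothetical and exist only for this example. The inferSchema option is an optional extra that asks Spark to guess column types instead of reading everything as strings.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object CsvLoading {
  // Hypothetical helper: reads a CSV file whose first line is a header row.
  def loadCsv(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")      // use the first line as column names
      .option("inferSchema", "true") // optional: guess column types
      .csv(path)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for trying this outside the Spark shell.
    val spark = SparkSession.builder()
      .appName("CsvLoading")
      .master("local[*]")
      .getOrCreate()

    // Made-up sample data written to a temporary file so the sketch is self-contained.
    val tmp = Files.createTempFile("people", ".csv")
    Files.write(tmp, "name,age\nAlice,30\nBob,25\n".getBytes("UTF-8"))

    val df = loadCsv(spark, tmp.toString)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

The error in the question comes from calling load without naming a format: Spark then falls back to its default data source, which is Parquet. On Spark 1.x, where only sqlContext is available, the usual route is the external spark-csv package with the format name "com.databricks.spark.csv".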

