I am using Spark 1.3.1 (PySpark) and I have generated a table using a SQL query. I now have an object that is a DataFrame. I want to export this DataFrame object (I have called it "table") to a csv file so I can manipulate it and plot the columns. How do I export the DataFrame "table" to a csv file?

1 Answer


If the DataFrame fits in driver memory and you want to save it to the local file system, you can use the toPandas method to convert the Spark DataFrame to a local pandas DataFrame and then simply call to_csv:

df.toPandas().to_csv('mycsv.csv')
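
As a minimal sketch of the same idea, the one-liner can be wrapped in a small helper. The function name export_small_dataframe is hypothetical, and it assumes df is a Spark DataFrame small enough to collect on the driver. The only change from the line above is index=False, which keeps pandas from writing its row index as an extra first column:

```python
def export_small_dataframe(df, path):
    """Collect a Spark DataFrame on the driver and write it as one local CSV file.

    Hypothetical helper: `df` is assumed to be a pyspark.sql.DataFrame
    that fits in driver memory.
    """
    # toPandas() pulls every row onto the driver as a pandas DataFrame.
    pdf = df.toPandas()
    # index=False omits the pandas row index, which to_csv would
    # otherwise write as an unnamed first column.
    pdf.to_csv(path, index=False)
    return pdf
```

With the DataFrame from the question this would be called as export_small_dataframe(table, 'mycsv.csv'). The returned pandas DataFrame can be used directly for plotting, so re-reading the CSV may not even be necessary.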

Otherwise, let Spark write the file itself. On Spark 1.x this requires the external spark-csv package.

In Spark 2.0+ you can use the built-in csv data source directly:

df.write.csv('mycsv.csv')
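
Note that Spark writes 'mycsv.csv' as a directory containing one part-* file per partition, not as a single file. If a single file is needed afterwards, the parts can be concatenated. The sketch below uses only the Python standard library; the helper name merge_spark_csv_parts and its has_header flag are hypothetical, and it assumes the output directory is on the local file system (not HDFS) and that no field contains an embedded newline in the header row:

```python
import glob
import os
import shutil


def merge_spark_csv_parts(output_dir, merged_path, has_header=False):
    """Concatenate Spark's part-* files from output_dir into one CSV file.

    Hypothetical helper. Set has_header=True if the data was written with
    a header row, so the repeated header in later parts is skipped.
    """
    # Sorting keeps the parts in partition order (part-00000, part-00001, ...).
    part_files = sorted(glob.glob(os.path.join(output_dir, "part-*")))
    if not part_files:
        raise FileNotFoundError("no part-* files found in " + output_dir)

    with open(merged_path, "w", newline="") as merged:
        for i, part in enumerate(part_files):
            with open(part, "r", newline="") as src:
                if has_header and i > 0:
                    src.readline()  # drop the duplicate header line
                shutil.copyfileobj(src, merged)
    return merged_path
```

For example, merge_spark_csv_parts('mycsv.csv', 'mycsv_single.csv') would combine the parts written by the call above. For small results, calling df.coalesce(1) before writing is another option, since it makes Spark produce a single part file.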

Spark 1.4+

df.write.format('com.databricks.spark.csv').save('mycsv.csv')

Spark 1.3

df.save('mycsv.csv', 'com.databricks.spark.csv')
