Spark 2.0.x dump a csv file from a dataframe containing one array of type string

Question

1 Answer

Amit Rawat · Answer 1 · 2019-07-23T05:34:53+0000

As the error says, the CSV file format doesn't support array types.

So, to write a DataFrame that includes the column ArrayOfString to CSV, you first need to convert that column to a string.

Try the following:

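import org.apache.spark.sql.functions._
import spark.implicits._ // for $"..." (assumes a SparkSession named spark; spark-shell imports this already)

// UDF that renders an array column as "[a,b,c]" and keeps null as null
val stringify = udf((vs: Seq[String]) => vs match {
  case null => null
  case _    => s"""[${vs.mkString(",")}]"""
})
df.withColumn("ArrayOfString", stringify($"ArrayOfString")).write.csv(...)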

or, using only built-in column functions instead of a UDF:

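import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{concat, concat_ws, lit}

// Wraps the comma-joined array elements in square brackets
def stringify(c: Column): Column = concat(lit("["), concat_ws(",", c), lit("]"))
df.withColumn("ArrayOfString", stringify($"ArrayOfString")).write.csv(...)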
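To see what ends up in the CSV cell, here is a minimal plain-Scala sketch of the same formatting logic as the UDF body above, with no Spark dependency. The names StringifyDemo and stringifySeq are hypothetical and used only for this illustration.

```scala
// Plain-Scala mirror of the UDF body, for illustration only.
// StringifyDemo and stringifySeq are hypothetical names, not Spark API.
object StringifyDemo {
  def stringifySeq(vs: Seq[String]): String = vs match {
    case null => null                             // keep missing arrays as null
    case _    => s"""[${vs.mkString(",")}]"""     // join elements, wrap in brackets
  }

  def main(args: Array[String]): Unit = {
    println(stringifySeq(Seq("a", "b", "c"))) // prints [a,b,c]
    println(stringifySeq(Seq.empty))          // prints []
    println(stringifySeq(Seq("a,b", "c")))    // prints [a,b,c] as well
  }
}
```

Note that an element which itself contains a comma produces the same string as two separate elements, so this representation cannot always be parsed back into the original array. If you need to read the data back, consider a separator that cannot occur in your values.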
