
in Big Data Hadoop & Spark by (11.4k points)

I'm manually creating a dataframe for some testing. The code to create it is:

case class input(id:Long, var1:Int, var2:Int, var3:Double)
val inputDF = sqlCtx
  .createDataFrame(List(input(1110,0,1001,-10.00),
    input(1111,1,1001,10.00),
    input(1111,0,1002,10.00)))


So the schema looks like this:

root
 |-- id: long (nullable = false)
 |-- var1: integer (nullable = false)
 |-- var2: integer (nullable = false)
 |-- var3: double (nullable = false)


I want to make 'nullable = true' for each one of these variables. How do I declare that from the start, or switch it in a new DataFrame after it's been created?

1 Answer

by (32.2k points)

With the imports

import org.apache.spark.sql.types.{StructField, StructType}
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.{SparkConf, SparkContext}

you can use

/**
 * Set the nullable property of a column.
 * @param df source DataFrame
 * @param cn the name of the column to change
 * @param nullable the flag to set, such that the column is either nullable or not
 */
def setNullableStateOfColumn(df: DataFrame, cn: String, nullable: Boolean): DataFrame = {
  // get the current schema
  val schema = df.schema
  // rebuild the [[StructField]] with name `cn`, keeping type and metadata
  val newSchema = StructType(schema.map {
    case StructField(c, t, _, m) if c.equals(cn) => StructField(c, t, nullable = nullable, m)
    case y: StructField => y
  })
  // apply the new schema to the same rows
  df.sqlContext.createDataFrame(df.rdd, newSchema)
}

Call it directly, passing the DataFrame, the column name, and the desired nullable flag.
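For example, applied to the inputDF from the question (a sketch, assuming the function above is in scope and a live SQLContext):

```scala
// Flip `id` to nullable; repeat per column, or fold over all column names
val withNullableId = setNullableStateOfColumn(inputDF, "id", nullable = true)

// To set every column nullable in one pass:
val allNullable = inputDF.schema.fieldNames
  .foldLeft(inputDF)((df, c) => setNullableStateOfColumn(df, c, nullable = true))

allNullable.printSchema()
```

After this, printSchema() reports nullable = true for each field, since the schema passed to createDataFrame is taken as-is.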

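To declare nullability from the start (the other half of the question), skip case-class reflection and build the DataFrame from Rows with an explicit StructType; a sketch using the same sqlCtx as the question:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{DoubleType, IntegerType, LongType, StructField, StructType}

// Declare every field nullable up front
val schema = StructType(Seq(
  StructField("id",   LongType,    nullable = true),
  StructField("var1", IntegerType, nullable = true),
  StructField("var2", IntegerType, nullable = true),
  StructField("var3", DoubleType,  nullable = true)
))

// Same data as the question, as Rows matching the schema
val rows = sqlCtx.sparkContext.parallelize(Seq(
  Row(1110L, 0, 1001, -10.00),
  Row(1111L, 1, 1001, 10.00),
  Row(1111L, 0, 1002, 10.00)
))

val inputDF = sqlCtx.createDataFrame(rows, schema)
```

This works because createDataFrame(rdd, schema) trusts the schema you hand it, whereas the reflection path marks primitive-typed fields (Long, Int, Double) as non-nullable.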