Spark < 2.0:
You can use Hadoop configuration options:
mapred.min.split.size
mapred.max.split.size
as well as the HDFS block size to control partition size for filesystem-based formats.
// Desired lower and upper bounds for the input split size, in bytes.
val minSplit: Int = ???
val maxSplit: Int = ???

// Applied through the Hadoop configuration used by Hadoop-backed RDDs (e.g. sc.textFile).
sc.hadoopConfiguration.setInt("mapred.min.split.size", minSplit)
sc.hadoopConfiguration.setInt("mapred.max.split.size", maxSplit)
Spark 2.0+:
You can use the spark.sql.files.maxPartitionBytes configuration, which caps the number of bytes packed into a single partition when reading file-based sources with the Dataset/DataFrame API:
spark.conf.set("spark.sql.files.maxPartitionBytes", maxSplit)
I would suggest you always check the documentation / implementation details of the format you use, as these values may not be honored by a specific data source API.