Another way to do this is to use DataFrame.repartition(). The problem with coalesce(1) is that your parallelism drops to 1, so the write is slow at best and errors out at worst. Increasing the number does not help either: with coalesce(5) you get more parallelism, but you can end up with up to 5 files per output partition.
To get one file per partition without using coalesce(), call repartition() on the same columns that you pass to partitionBy():
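// needs: import org.apache.spark.sql.SaveMode and import spark.implicits._ (for the $"col" syntax)
df.repartition($"entity", $"year", $"month", $"day", $"status")
  .write
  .partitionBy("entity", "year", "month", "day", "status")
  .mode(SaveMode.Append)
  .parquet(s"$location")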
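Once you do this, each write produces one Parquet file per output partition instead of several. This works because repartition() on those columns shuffles all rows with the same combination of values to the same task, so only one task writes into each partition directory.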
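If you want to check this yourself, here is a minimal sketch that runs the same pattern on a local SparkSession and counts the part files in each partition directory. The object name, the helper names, the toy columns (entity, year, value) and the temp directory are all made up for illustration, and it uses two partition columns instead of five to keep it short. It assumes a Spark 2.x/3.x dependency on the classpath.

```scala
import java.io.File
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical demo object; none of these names come from the original answer.
object OneFilePerPartitionDemo {

  // Recursively collects the Parquet part files under `dir`
  // (skips _SUCCESS markers and hidden .crc checksum files).
  def partFiles(dir: File): Seq[File] = {
    val children = Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
    children.flatMap { f =>
      if (f.isDirectory) partFiles(f)
      else if (f.getName.startsWith("part-") && f.getName.endsWith(".parquet")) Seq(f)
      else Seq.empty
    }
  }

  // Writes a toy dataset and returns: partition directory -> number of part files.
  def run(): Map[String, Int] = {
    val spark = SparkSession.builder()
      .master("local[4]")
      .appName("one-file-per-partition")
      .getOrCreate()
    import spark.implicits._

    // 2 entities x 3 years = 6 output partitions; repartition(8) first
    // spreads the rows over several tasks, as in a real job.
    val df = (1 to 1000)
      .map(i => (s"e${i % 2}", 2020 + (i % 3), i))
      .toDF("entity", "year", "value")
      .repartition(8)

    // Assumption: a fresh temp directory stands in for the real output location.
    val location = Files.createTempDirectory("repartition-demo").toString

    // Same pattern as above: repartition on the partitionBy columns.
    df.repartition($"entity", $"year")
      .write
      .partitionBy("entity", "year")
      .mode(SaveMode.Append)
      .parquet(location)

    spark.stop()

    partFiles(new File(location))
      .groupBy(_.getParent)
      .map { case (dir, files) => dir -> files.size }
  }

  def main(args: Array[String]): Unit =
    run().toSeq.sortBy(_._1).foreach { case (dir, n) => println(s"$n file(s) in $dir") }
}
```

Every directory should report a single file. Keep in mind that with SaveMode.Append each additional run adds its own file to the directories it touches, so the count is per write, not per table.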