I would suggest using the limit method in your program, like this:
Applying limit() to your df returns a new DataFrame. This is a transformation, so it does not collect the data.
Whereas when you do:
The result is an Array of Rows. This is an action, so it collects the data to the driver (similar to collect).
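To make the difference concrete, here is a minimal sketch in Scala. The action this answer originally showed is not reproduced here, so I am assuming take(n) for illustration (head(n) also returns an Array of Rows). The sample DataFrame, the column name id, and the local master setting are assumptions for the example and not part of your program.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object LimitVsTake {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed setup).
    val spark = SparkSession.builder()
      .appName("limit-vs-take")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: a single column "id" with values 1 to 100.
    val df: DataFrame = (1 to 100).toDF("id")

    // Transformation: returns a new DataFrame; no data is collected here.
    val firstTen: DataFrame = df.limit(10)

    // Action (assumed to be take): runs a job and returns the rows to the driver.
    val rows: Array[Row] = df.take(10)

    // The limited DataFrame is only evaluated once an action such as count() is called.
    println(firstTen.count())
    println(rows.length)

    spark.stop()
  }
}
```

Because limit gives you a DataFrame, you can keep chaining transformations on it (filter, select, join) and let Spark plan everything together, while the Array returned by an action lives on the driver as an ordinary local collection.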