Missing heartbeats and executors being killed are problems that usually occur due to OOMs (out-of-memory errors).
I would suggest you inspect the logs on the individual executors (look for the text "running beyond physical memory"). If you have many executors and find it cumbersome to inspect all of the logs manually, I recommend monitoring your job in the Spark UI while it runs. As soon as a task fails, the UI reports the cause, so it's easy to see. Also keep in mind that some tasks will report failure only because they ran on an executor that had already been killed, so make sure you look at the cause of each individual failing task.
Most OOM problems can be solved quickly by simply repartitioning your data at appropriate places in your code (again, look at the Spark UI for hints as to where a call to repartition might be needed). More partitions mean smaller tasks, so each task holds less data in memory at once.
If repartitioning is not enough, you can also scale up your machines, or raise the executor memory settings, to meet the memory demand.
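For reference, a sketch of the relevant submit-time settings, assuming YARN and Spark 2.3+ (the script name and all values are placeholders to tune for your workload):

```shell
# Raise the per-executor heap and the off-heap headroom; the YARN message
# "running beyond physical memory" usually means the overhead allowance
# (heap + memoryOverhead) was exceeded, not the heap alone.
spark-submit \
  --executor-memory 8g \
  --conf spark.executor.memoryOverhead=2g \
  --conf spark.sql.shuffle.partitions=400 \
  your_job.py
```

Raising `spark.sql.shuffle.partitions` has a similar effect to calling repartition for shuffle stages, without touching the code.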