Spark has a master-slave architecture: one central coordinator, the driver, communicates with many distributed worker nodes (executors).
The driver is the process in which the application's main method runs. It converts the user program into tasks and schedules those tasks on the executors.
Workers (slaves) are running Spark instances where executors live to execute tasks. They are the compute nodes in Spark.
A worker receives serialized tasks that it runs in a thread pool.
It hosts a local Block Manager that serves blocks to other workers in a Spark cluster. Workers communicate among themselves using their Block Manager instances.
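The two mechanics above (tasks arriving serialized, execution on a thread pool) can be sketched in miniature. This is a toy Python analogy, not Spark's actual code path — real Spark ships Java-serialized task descriptions, not pickles:

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

# The "driver" serializes each task (function + argument) before sending it.
serialized_tasks = [pickle.dumps((square, i)) for i in range(4)]

def run_task(blob):
    fn, arg = pickle.loads(blob)   # the "worker" deserializes the task
    return fn(arg)                 # ...and executes it on a pool thread

# The worker-side thread pool runs the deserialized tasks concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_task, serialized_tasks))

print(results)  # [0, 1, 4, 9]
```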
When you create a SparkContext, each worker starts an executor. This is a separate JVM process, and it loads your jars, too. The executors connect back to your driver program, and from then on the driver can send them commands, like flatMap, map and reduceByKey. When the driver quits, the executors shut down.
An executor is a distributed agent responsible for executing tasks. It typically runs for the entire lifetime of a Spark application, a mode known as static allocation of executors.
Executors report to the HeartbeatReceiver RPC endpoint on the driver by sending heartbeats and partial metrics for active tasks. An executor runs multiple tasks over its lifetime (in parallel and sequentially) and tracks the running ones by their task ids in the runningTasks internal registry. Executors also provide in-memory storage for RDDs that user programs cache, through the Block Manager.
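The task-tracking and heartbeat bookkeeping just described can be sketched as a toy class. The class and the metric shape are invented for illustration; the real Executor sends accumulator updates to the driver over RPC:

```python
# Toy executor (illustrative only): track running tasks by id and
# periodically report a heartbeat carrying partial metrics, the way
# Executor pings the driver's HeartbeatReceiver endpoint.
class ToyExecutor:
    def __init__(self):
        self.running_tasks = {}   # task id -> task description
        self.heartbeats = []      # everything we have reported so far

    def launch(self, task_id, desc):
        self.running_tasks[task_id] = desc

    def finish(self, task_id):
        self.running_tasks.pop(task_id, None)

    def heartbeat(self):
        # "Partial metrics" here are just the ids of still-running tasks.
        self.heartbeats.append(sorted(self.running_tasks))

ex = ToyExecutor()
ex.launch(0, "map"); ex.launch(1, "reduce")
ex.heartbeat()        # reports tasks 0 and 1 still running
ex.finish(0)
ex.heartbeat()        # reports only task 1
print(ex.heartbeats)  # [[0, 1], [1]]
```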
Answering your questions
1. As soon as the executors start, they register with the driver, and from that point on they communicate with it directly. The worker's job is to report the availability of its resources to the cluster manager.
2. In a YARN cluster you can do that with --num-executors. In a standalone cluster you get one executor per worker unless you set spark.executor.cores and a worker has enough cores to hold more than one executor.
3. You can set the number of cores per executor with --executor-cores.
4. --total-executor-cores is the maximum number of executor cores per application.
5. There's not a good reason to run more than one worker per machine. You would have many JVMs sitting on one machine, for instance.
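For reference, the flags from points 2 to 4 can be combined in a single spark-submit invocation. The master URL, class name, jar path, and values below are hypothetical placeholders:

```shell
# Hypothetical standalone-mode submission: 4 cores per executor,
# capped at 20 cores across the whole application (i.e. 5 executors).
spark-submit \
  --master spark://master-host:7077 \
  --executor-cores 4 \
  --total-executor-cores 20 \
  --class com.example.MyApp \
  myapp.jar

# On YARN the executor count is requested explicitly instead:
spark-submit --master yarn --num-executors 5 --executor-cores 4 \
  --class com.example.MyApp myapp.jar
```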
EXAMPLE 1: Since the number of cores and executors Spark acquires is directly proportional to the offering made by the scheduler, Spark will acquire cores and executors accordingly. So in the end you will get 5 executors with 8 cores each.
EXAMPLES 2 to 5: No executors will be launched, since Spark won't be able to allocate as many cores as requested within a single worker.
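The core-fitting rule behind these outcomes can be sketched as a small calculation. This assumes a standalone cluster of 5 workers with 8 cores each, as the examples imply; the function names are made up for illustration:

```python
def executors_per_worker(worker_cores, executor_cores):
    # With spark.executor.cores unset, standalone mode launches one
    # executor per worker that grabs all of the worker's cores.
    if executor_cores is None:
        return 1
    # Otherwise, as many executors fit as the worker's cores allow;
    # zero if a single executor needs more cores than the worker has.
    return worker_cores // executor_cores

def allocate(num_workers, worker_cores, executor_cores):
    per_worker = executors_per_worker(worker_cores, executor_cores)
    cores_each = worker_cores if executor_cores is None else executor_cores
    return num_workers * per_worker, cores_each

# EXAMPLE 1: no per-executor core request -> 5 executors, 8 cores each
print(allocate(5, 8, None))   # (5, 8)
# EXAMPLES 2-5: an executor needs more cores than any single worker has
print(allocate(5, 8, 16))     # (0, 16) -> no executors launched
```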