With the advent of docker and scheduling & orchestration services like Amazon's ECS, I'm trying to determine the optimal way to deploy my Node API. With Docker and ECS aside, I've wanted to take advantage of the Node cluster library to gracefully handle crashing the node app in the event of an asynchronous error as suggested in the documentation, by creating a master process and multiple worker processors.
One of the benefits of the cluster approach, besides gracefully handling errors, is creating a worker processor for each available CPU. But does this make sense in the docker world? Would it make sense to have multiple node processes running in a single docker container that was going to be scaled into a cluster of EC2 instances on ECS?
Without the Node cluster approach, I'd lose the ability to gracefully handle errors and so I think that at a minimum, I should run a master and one worker processes per docker container. I'm still confused as to how many CPUs to define in the Task Definition for ECS. The ECS documentation says something about each container instance having 1024 units per CPU, but that isn't the same thing as EC2 compute units, is it? And with that said, I'd need to pick EC2 instance types with the appropriate amount of vCPUs to achieve this right?
I understand that achieving the most optimal configuration may require some level of benchmarking my specific Node API application, but it would be awesome to have a better idea of where to start. Maybe there is some studying/research I need to do? Any pointers to guide me on the path or recommendations would be most appreciated!
- Does it make sense to run a master/worker cluster as described here inside a Docker container to achieve graceful crashing?
- Would it make sense to use the nearly identical code as described in the Cluster docs, to 'scale' to available CPUs via require('os').cpus().length?
- What does Amazon mean in the documentation for ECS Task Definitions, where it says for the cpus setting, that a container instance has 1024 units per CPU? And what would be a good starting point for this setting?
- What would be a good starting point for the instance type to use for an ECS cluster aimed at serving a Node API based on the above? And how do the available vCPUs affect the previous questions?