This idea uses the ELB's ability to detect an unhealthy node and remove it from the pool, but it relies on the ELB behaving as described in the assumptions below. I've been meaning to test this myself but haven't had the time yet. I'll update the answer when I do.
Process Overview
The following logic can be wrapped in a script and run whenever the node must be shut down.
- Block new HTTP connections to nodeX, but continue to allow existing connections.
- Wait for existing connections to drain, either by monitoring the open connections to your application or by allowing a "safe" amount of time.
- Initiate shutdown of the nodeX EC2 instance, using the EC2 API directly or abstracted scripts.
What counts as "safe" depends on your application, and for some applications it may not be possible to determine.
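The three steps above could be sketched roughly as follows. This is untested against a real ELB, and the names are my own assumptions: PORT, TIMEOUT, POLL and INSTANCE_ID are illustrative variables, the helper functions are hypothetical, and I'm assuming `ss` (a recent iproute2 that supports `-H`) and the AWS CLI are installed on the node.

```shell
#!/bin/sh
# Sketch of the drain-then-shutdown sequence. All variable and function
# names here are illustrative assumptions.

PORT="${PORT:-80}"          # assumed application port
TIMEOUT="${TIMEOUT:-300}"   # assumed "safe" upper bound on drain time, seconds
POLL="${POLL:-5}"           # seconds between connection checks

# Count established TCP connections whose local port is PORT.
count_established() {
    ss -Htn state established "( sport = :$PORT )" | wc -l
}

# Poll until no connections remain or TIMEOUT is reached.
# Returns 0 if the node drained, non-zero if connections were still open.
wait_for_drain() {
    waited=0
    while [ "$(count_established)" -gt 0 ] && [ "$waited" -lt "$TIMEOUT" ]; do
        sleep "$POLL"
        waited=$((waited + POLL))
    done
    [ "$(count_established)" -eq 0 ]
}

drain_and_shutdown() {
    # 1. Drop new connection attempts (SYN); established sessions are untouched.
    iptables -A INPUT -p tcp --syn --dport "$PORT" -j DROP || return 1
    # 2. Wait for in-flight connections to finish, or give up after TIMEOUT.
    wait_for_drain || echo "not drained after ${TIMEOUT}s, shutting down anyway" >&2
    # 3. Stop the instance through the EC2 API (here via the AWS CLI).
    aws ec2 stop-instances --instance-ids "${INSTANCE_ID:?set INSTANCE_ID}"
}

# Example (as root on nodeX): INSTANCE_ID=<your-instance-id> drain_and_shutdown
```

Whether to shut down anyway after the timeout, or abort and unblock the port, is a design choice that depends on how much a dropped session costs your application.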
Assumptions that need to be tested
We know that ELB removes unhealthy instances from its pool. I'd expect this to be graceful, so that:
- A new connection to a recently closed port will be gracefully redirected to another node within the pool.
- When a node is marked Bad, the already established connections to that node are unaffected.
Possible test cases:
- Fire HTTP connections at the ELB (e.g. from a curl script), logging the results while a script opens and closes the HTTP port on one of the nodes. You would need to experiment to find an interval long enough for the ELB to reliably detect each state change.
- Maintain a long HTTP session (e.g. a file download) while blocking new HTTP connections. The long session should hopefully continue.
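For the first test case, a curl loop along these lines might be enough. It's only a sketch: the function names and the example URL are placeholders I've made up, and it logs nothing more than a timestamp and the HTTP status for each request (curl reports 000 when it gets no HTTP response at all).

```shell
#!/bin/sh
# Probe a URL repeatedly and print one line per request:
# UTC timestamp followed by the HTTP status code.

probe_once() {   # usage: probe_once URL
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) ${code:-000}"
}

probe_loop() {   # usage: probe_loop URL COUNT INTERVAL_SECONDS
    i=0
    while [ "$i" -lt "$2" ]; do
        probe_once "$1"
        sleep "$3"
        i=$((i + 1))
    done
}

# Example (placeholder URL): probe_loop "http://my-elb.example.com/" 600 1 >> elb-probe.log
```

Run it from a machine outside the pool while the port on nodeX is being opened and closed, then look for any lines in the log that are not 200 around each state change.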
1. How to block HTTP Connections
Use a local firewall on nodeX to block new sessions but continue to allow established sessions.
For example, with iptables (supply your application's port after --destination-port):
iptables -A INPUT -j DROP -p tcp --syn --destination-port
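If the rule is going to be applied from a script, it may help to make it repeatable and reversible. The sketch below is an assumption on my part rather than something I've run in production: PORT and both function names are illustrative, and it relies on `iptables -C` (check), which needs iptables 1.4.11 or later.

```shell
#!/bin/sh
PORT="${PORT:-80}"   # assumed application port

block_new_connections() {
    # -C succeeds only if the rule already exists, so repeated runs
    # do not stack duplicate rules.
    iptables -C INPUT -p tcp --syn --dport "$PORT" -j DROP 2>/dev/null ||
        iptables -A INPUT -p tcp --syn --dport "$PORT" -j DROP
}

unblock_new_connections() {
    # -D deletes the matching rule, e.g. when a shutdown is aborted
    # and the node should go back into service.
    iptables -D INPUT -p tcp --syn --dport "$PORT" -j DROP
}
```

Both functions need root. After unblocking, the node should only receive traffic again once the ELB health check has passed enough times to mark it healthy.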