Hystrix: Custom circuit breaker and recovery logic

Question

asked Jul 5, 2019 in DevOps and Agile by chandra (29.3k points)

I just read the Hystrix guide and am trying to wrap my head around how the default circuit breaker and recovery period operate, and then how to customize their behavior.

Obviously, if the circuit is tripped, Hystrix will automatically call the command's getFallback() method; this much I understand. But what criteria cause the circuit to trip in the first place? Ideally, I'd like to try hitting a backing service several times (say, a max of 3 attempts) before we consider the service to be offline/unhealthy and trip the circuit breaker. How could I implement this, and where?

But I imagine that if I override the default circuit breaker, I must also override whatever mechanism handles the default recovery period. If a backing service goes down, it could be for any one of several reasons:

1. There is a network outage between the client and server

2. The service was deployed with a bug that makes it incapable of returning valid responses to the client

3. The client was deployed with a bug that makes it incapable of sending valid requests to the server

4. Some weird, momentary service hiccup (perhaps the service is doing a major garbage collection, etc.)

etc.

In most of these cases, it is not sufficient to have a recovery period that merely waits N seconds and then tries again. If the service has a bug in it, or if someone pulled some network cables in the data center, we will always get failures from this service. Only in a small number of cases will the client-service automagically heal itself without any human interaction.

So I guess my next question is partly "How do I customize the default recovery period strategy?", but mainly: "How do I use Hystrix to notify devops when a service is down and requires manual intervention?"

1 Answer

yeshwanth.intelli · Answer 1 · 2019-07-05T13:55:10+0000

Hystrix calls the fallback method for any of these reasons:

An exception thrown by run()

A timeout

Too many parallel requests (the thread pool or semaphore rejects the call)

Too many failures in the previous calls (the circuit is open)
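The last item is the one that trips the circuit. In Hystrix this is governed by the command properties circuitBreaker.requestVolumeThreshold (default 20), circuitBreaker.errorThresholdPercentage (default 50) and circuitBreaker.sleepWindowInMilliseconds (default 5000), evaluated over a rolling statistics window. The Java sketch below is only a simplified model of that rule, not Hystrix's implementation: the class ToyCircuitBreaker and its methods are made up for illustration, and it counts calls since the last reset instead of using a rolling window.

```java
/** Simplified model of the trip and recovery rule; NOT Hystrix's actual code. */
final class ToyCircuitBreaker {
    private final int requestVolumeThreshold;   // minimum calls before the error rate is checked
    private final int errorThresholdPercentage; // error rate (in %) at or above which the circuit opens
    private final long sleepWindowMillis;       // wait before one trial call is let through

    private int total;
    private int failures;
    private boolean open;
    private long openedOrLastTestedMillis;

    ToyCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage, long sleepWindowMillis) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    /** Returns true if a call may proceed at the given time. */
    boolean allowRequest(long nowMillis) {
        if (!open) {
            return true;
        }
        if (nowMillis - openedOrLastTestedMillis >= sleepWindowMillis) {
            // Sleep window elapsed: allow a single trial call and restart the window.
            openedOrLastTestedMillis = nowMillis;
            return true;
        }
        return false; // still open: the caller should go straight to the fallback
    }

    void recordSuccess() {
        if (open) {
            // The trial call worked: close the circuit and start counting afresh.
            open = false;
            total = 0;
            failures = 0;
        } else {
            total++;
        }
    }

    void recordFailure(long nowMillis) {
        if (open) {
            return; // the trial call failed: stay open until the next sleep window ends
        }
        total++;
        failures++;
        boolean enoughVolume = total >= requestVolumeThreshold;
        boolean tooManyErrors = failures * 100 / total >= errorThresholdPercentage;
        if (enoughVolume && tooManyErrors) {
            open = true;
            openedOrLastTestedMillis = nowMillis;
        }
    }

    boolean isOpen() {
        return open;
    }
}
```

In this model, a volume threshold of 3 with an error threshold of 100 opens the circuit if the first three recorded calls all fail, which is roughly what the question asks for. Note that recovery here is just a timed trial call; it cannot tell a momentary hiccup from a bug that needs a human.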

You can retry inside your run() method if the exception you receive from your service indicates that a retry is likely to resolve the problem. A retry inside the fallback method makes no sense, though, because it would just call the same failing service again.
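A minimal sketch of such a bounded retry, in plain Java so it does not depend on any Hystrix class. BoundedRetry, its call method and the isRetryable predicate are hypothetical names introduced here; the idea is that run() would delegate to it, so the whole retry sequence still counts as a single command execution.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Hystrix): bounded retry intended to be called from run(). */
final class BoundedRetry {
    private BoundedRetry() {}

    /**
     * Invokes the action up to maxAttempts times. A failure is retried only if
     * isRetryable accepts it; otherwise the exception is rethrown immediately.
     */
    static <T> T call(Supplier<T> action, int maxAttempts, Predicate<RuntimeException> isRetryable) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (!isRetryable.test(e)) {
                    break; // a retry would not help, so fail fast
                }
                // No backoff between attempts in this sketch; a real client may want one.
            }
        }
        // All attempts failed (or the failure was not retryable): let the caller see it.
        throw last;
    }
}
```

Inside run() this could look like `return BoundedRetry.call(() -> client.fetch(), 3, e -> e instanceof java.io.UncheckedIOException);`, where client.fetch() stands in for your own service call. Keep the command's timeout in mind: all attempts together have to fit inside it.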

How to notify devops:

You should connect a monitoring system to Hystrix that polls the status of the circuit breaker and the ratio of successful to unsuccessful calls. You can use one of the existing metrics publishers (for example, one that exposes the metrics over JMX), or write your own adapter using Hystrix's API.
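As a rough illustration of the polling idea, here is a small Java sketch that raises an alert when the circuit changes state. CircuitMonitor and its members are hypothetical names; the BooleanSupplier is a stand-in for whatever status check you wire in (for instance something that wraps HystrixCircuitBreaker's isOpen()), and the Consumer is a stand-in for your paging or chat integration.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical poller: alerts once when a circuit opens and once when it closes again. */
final class CircuitMonitor {
    private final String name;
    private final BooleanSupplier circuitOpen; // stand-in for a real circuit status check
    private final Consumer<String> alert;      // stand-in for a pager, mail or chat hook
    private boolean wasOpen;

    CircuitMonitor(String name, BooleanSupplier circuitOpen, Consumer<String> alert) {
        this.name = name;
        this.circuitOpen = circuitOpen;
        this.alert = alert;
    }

    /** Runs one check; meant to be called on a schedule, e.g. by a ScheduledExecutorService. */
    void poll() {
        boolean isOpen = circuitOpen.getAsBoolean();
        if (isOpen && !wasOpen) {
            alert.accept("Circuit OPEN for " + name);
        } else if (!isOpen && wasOpen) {
            alert.accept("Circuit CLOSED again for " + name);
        }
        // Remember the state so repeated polls of an open circuit do not re-alert.
        wasOpen = isOpen;
    }
}
```

Alerting only on state changes keeps devops from being paged on every poll while the circuit stays open. Whether an open circuit needs manual intervention is still a human call; a circuit that stays open across many sleep windows is a reasonable hint.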
