Hystrix Technical Analysis: The Hystrix Circuit Breaker

1. Introduction to the Circuit Breaker

A circuit breaker has a familiar real-life analogy: the fuse box installed in a household circuit. When the current is too high, the fuse in the box blows automatically, protecting the appliances and wiring in the home. The circuit breaker in Hystrix plays the same role. While running, Hystrix reports the success, failure, timeout, and rejection of each call to the circuit breaker that corresponds to the command's commandKey. The circuit breaker maintains these statistics and uses them to decide whether to open. Once it is open, subsequent requests are short-circuited. After an interval (5 seconds by default), the breaker enters a half-open state and lets a trial request through, which amounts to a health check on the dependent service. If the service has recovered, the circuit breaker closes and calls resume normally, as shown in the following figure:


Figure: circuit breaker state diagram
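
The closed, open, and half-open cycle described above can be sketched as a small state machine. This is a simplified illustration and not Hystrix code: the class SimpleBreaker and all of its method names are hypothetical, the decision to trip is left to the caller, and the current time is passed in explicitly so that the behaviour is deterministic.

```java
/** Simplified three-state circuit breaker (hypothetical; NOT the Hystrix implementation). */
final class SimpleBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long sleepWindowMs;   // e.g. 5000, matching the Hystrix default
    private State state = State.CLOSED;
    private long openedAtMs;

    SimpleBreaker(long sleepWindowMs) {
        this.sleepWindowMs = sleepWindowMs;
    }

    /** Called when the failure statistics say the dependency is unhealthy. */
    void trip(long nowMs) {
        state = State.OPEN;
        openedAtMs = nowMs;
    }

    /** Returns true if a request may be sent to the dependency at time nowMs. */
    boolean allowRequest(long nowMs) {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && nowMs - openedAtMs > sleepWindowMs) {
            state = State.HALF_OPEN;    // sleep window elapsed: let one trial request through
            return true;
        }
        return false;                   // still sleeping, or a trial request is already in flight
    }

    /** The outcome of the trial request decides whether to close or to re-open. */
    void onResult(boolean success, long nowMs) {
        if (state != State.HALF_OPEN) {
            return;
        }
        if (success) {
            state = State.CLOSED;       // dependency recovered: resume normal traffic
        } else {
            trip(nowMs);                // still failing: stay open for another sleep window
        }
    }

    State state() {
        return state;
    }
}
```

With a 5000 ms sleep window, a breaker tripped at t = 1000 rejects a request at t = 3000, admits one trial request at t = 6001, and closes only if that trial succeeds.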

Note that the commandKey mentioned above is the one set at initialization time with andCommandKey(HystrixCommandKey.Factory.asKey("testCommandKey")).

The circuit breaker's place in the overall Hystrix flow chart begins at step 4, as follows:


Figure: Hystrix flow chart

Hystrix checks the state of the circuit breaker. If the circuit breaker is open, Hystrix does not execute the command and goes straight to the fallback (step 8, Fallback, in the flow chart). If the circuit breaker is closed, Hystrix goes on to check the thread pool, task queue, and semaphore (step 5).
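
The check at step 4 can be sketched in plain Java as a guard in front of the real call. This is only an illustration of the control flow under stated assumptions: GuardedCall and its parameters are hypothetical names, and the thread pool, queue, and semaphore checks of step 5 are omitted entirely.

```java
import java.util.concurrent.Callable;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical helper showing "breaker open -> fallback, breaker closed -> run". */
final class GuardedCall {
    static <T> T execute(BooleanSupplier breakerAllows, Callable<T> run, Supplier<T> fallback) {
        if (!breakerAllows.getAsBoolean()) {
            return fallback.get();      // short-circuited: run is never invoked
        }
        try {
            return run.call();          // normal path
        } catch (Exception e) {
            return fallback.get();      // a failed execution also ends in the fallback
        }
    }
}
```

When the guard returns false the run callable is never invoked, which is what protects a struggling dependency from further load.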

2. How to Use the Circuit Breaker

Because Hystrix is a fault-tolerance framework, we only need to configure a few parameters to get circuit breaking. To get good results, however, we need to understand what these parameters do. The circuit breaker has six parameters.
1. circuitBreaker.enabled
Whether the circuit breaker is enabled. The default is true.
2. circuitBreaker.forceOpen
Forces the circuit breaker open so that it stays open permanently and all requests are rejected. The default is false.
3. circuitBreaker.forceClosed
Forces the circuit breaker closed so that it stays closed permanently and all requests are allowed, regardless of the error percentage. The default is false.
4. circuitBreaker.errorThresholdPercentage
The error percentage threshold. The default is 50 (%). For example, if 100 requests arrive within the statistical window (10 s) and 55 of them time out or throw an exception, the error percentage for the window is 55%, which exceeds the default of 50%, so the circuit breaker opens.
5. circuitBreaker.requestVolumeThreshold
The default is 20. The error percentage is compared against errorThresholdPercentage only once at least 20 requests have arrived within the window. For example, if only 19 requests arrive within the window (10 s) and all of them fail, the error percentage is 100%, but the circuit breaker does not open, because the request volume is below the requestVolumeThreshold of 20. This parameter is important: it is the first condition checked when deciding whether to open the circuit breaker. The source code is as follows.

// check if we are past the statisticalWindowVolumeThreshold
if (health.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
    // we are not past the minimum volume threshold for the statisticalWindow so we'll return false immediately and not calculate anything
    return false;
}

if (health.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
    return false;
}

6. circuitBreaker.sleepWindowInMilliseconds
The sleep window before the half-open trial. The default is 5000 ms. Once the circuit breaker has been open for this long, it lets a trial request through to test whether the dependent service has recovered.
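
The two thresholds from parameters 4 and 5 combine into a single decision, which can be sketched as a standalone function using the two worked examples above. TripRule and shouldOpen are hypothetical names, and computing the percentage as a truncated integer is an assumption of this sketch; the order of the two checks follows the source excerpt shown earlier.

```java
/** Hypothetical restatement of the trip decision; not part of Hystrix. */
final class TripRule {
    static boolean shouldOpen(long totalRequests, long failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        // Condition 1: not enough traffic in the window, so no percentage is calculated.
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false;
        }
        // Condition 2: the error percentage must reach the threshold.
        int errorPercentage = (int) (failedRequests * 100 / totalRequests);
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        System.out.println(shouldOpen(100, 55, 20, 50)); // true: 55% >= 50%
        System.out.println(shouldOpen(19, 19, 20, 50));  // false: only 19 requests, below 20
    }
}
```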

Test code (the circuit breaker opens once 10 calls have been made and the error percentage reaches 5%):

package myHystrix.threadpool;

import com.netflix.hystrix.*;
import org.junit.Test;

import java.util.Random;

/**
 * Created by wangxindong on 2017/8/15.
 */
public class GetOrderCircuitBreakerCommand extends HystrixCommand<String> {

    public GetOrderCircuitBreakerCommand(String name){
        super(Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("ThreadPoolTestGroup"))
                .andCommandKey(HystrixCommandKey.Factory.asKey("testCommandKey"))
                .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey(name))
                .andCommandPropertiesDefaults(
                        HystrixCommandProperties.Setter()
                                .withCircuitBreakerEnabled(true)//Default is true; set explicitly here for illustration
                                .withCircuitBreakerForceOpen(false)//Default is false; set explicitly here for illustration
                                .withCircuitBreakerForceClosed(false)//Default is false; set explicitly here for illustration
                                .withCircuitBreakerErrorThresholdPercentage(5)//(1) Error percentage of at least 5%
                                .withCircuitBreakerRequestVolumeThreshold(10)//(2) At least 10 calls within 10 seconds; the breaker opens when both (1) and (2) hold
                                .withCircuitBreakerSleepWindowInMilliseconds(5000)//After 5 seconds the breaker goes half-open and lets a request through again
//                                .withExecutionTimeoutInMilliseconds(1000)
                )
                .andThreadPoolPropertiesDefaults(
                        HystrixThreadPoolProperties.Setter()
                                .withMaxQueueSize(10)   //Configure queue size
                                .withCoreSize(2)    // Configure the number of threads in the thread pool
                )
        );
    }

    @Override
    protected String run() throws Exception {
        Random rand = new Random();
        //Simulate an error rate of roughly 50% (crude but effective)
        if(1==rand.nextInt(2)){
//            System.out.println("make exception");
            throw new Exception("make exception");
        }
        return "running:  ";
    }

    @Override
    protected String getFallback() {
//        System.out.println("FAILBACK");
        return "fallback: ";
    }

    public static class UnitTest{

        @Test
        public void testCircuitBreaker() throws Exception{
            for(int i=0;i<25;i++){
                Thread.sleep(500);
                HystrixCommand<String> command = new GetOrderCircuitBreakerCommand("testCircuitBreaker");
                String result = command.execute();
                //In this example the circuit breaker is open from the 11th call onward.
                System.out.println("call times:"+(i+1)+"   result:"+result +" isCircuitBreakerOpen: "+command.isCircuitBreakerOpen());
                //In this example the breaker goes half-open after 5 seconds and lets new requests through.
            }
        }
    }
}

Test results:

call times:1 result:fallback: isCircuitBreakerOpen: false
call times:2 result:running: isCircuitBreakerOpen: false
call times:3 result:running: isCircuitBreakerOpen: false
call times:4 result:fallback: isCircuitBreakerOpen: false
call times:5 result:running: isCircuitBreakerOpen: false
call times:6 result:fallback: isCircuitBreakerOpen: false
call times:7 result:fallback: isCircuitBreakerOpen: false
call times:8 result:fallback: isCircuitBreakerOpen: false
call times:9 result:fallback: isCircuitBreakerOpen: false
call times:10 result:fallback: isCircuitBreakerOpen: false
Circuit breaker opens
call times:11 result:fallback: isCircuitBreakerOpen: true
call times:12 result:fallback: isCircuitBreakerOpen: true
call times:13 result:fallback: isCircuitBreakerOpen: true
call times:14 result:fallback: isCircuitBreakerOpen: true
call times:15 result:fallback: isCircuitBreakerOpen: true
call times:16 result:fallback: isCircuitBreakerOpen: true
call times:17 result:fallback: isCircuitBreakerOpen: true
call times:18 result:fallback: isCircuitBreakerOpen: true
call times:19 result:fallback: isCircuitBreakerOpen: true
call times:20 result:fallback: isCircuitBreakerOpen: true
After 5 s the circuit breaker closes
call times:21 result:running: isCircuitBreakerOpen: false
call times:22 result:running: isCircuitBreakerOpen: false
call times:23 result:fallback: isCircuitBreakerOpen: false
call times:24 result:running: isCircuitBreakerOpen: false
call times:25 result:running: isCircuitBreakerOpen: false

3. Circuit Breaker Source Code: HystrixCircuitBreaker.java Analysis

HystrixCircuitBreaker.java.png

Factory is a factory class that provides HystrixCircuitBreaker instances.

public static class Factory {
        //The HystrixCircuitBreaker objects are cached in a ConcurrentHashMap
        private static ConcurrentHashMap<String, HystrixCircuitBreaker> circuitBreakersByCommand = new ConcurrentHashMap<String, HystrixCircuitBreaker>();
        
        //Hystrix first checks the ConcurrentHashMap for a cached circuit breaker for this command key and returns it if present. Otherwise it creates a new HystrixCircuitBreaker instance, adds it to the cache, and returns it.
        public static HystrixCircuitBreaker getInstance(HystrixCommandKey key, HystrixCommandGroupKey group, HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
            
            HystrixCircuitBreaker previouslyCached = circuitBreakersByCommand.get(key.name());
            if (previouslyCached != null) {
                return previouslyCached;
            }

            
            HystrixCircuitBreaker cbForCommand = circuitBreakersByCommand.putIfAbsent(key.name(), new HystrixCircuitBreakerImpl(key, group, properties, metrics));
            if (cbForCommand == null) {
                return circuitBreakersByCommand.get(key.name());
            } else {
                return cbForCommand;
            }
        }

        
        public static HystrixCircuitBreaker getInstance(HystrixCommandKey key) {
            return circuitBreakersByCommand.get(key.name());
        }

        static void reset() {
            circuitBreakersByCommand.clear();
        }
}
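
The get-then-putIfAbsent sequence used by Factory.getInstance can be isolated in a small generic cache. The sketch below is an illustration only: PerKeyCache, getOrCreate, and peek are hypothetical names, and the value type is generic rather than HystrixCircuitBreaker so that the example runs without Hystrix on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical per-key cache that mirrors the lookup order of Factory.getInstance. */
final class PerKeyCache<V> {
    private final ConcurrentMap<String, V> byKey = new ConcurrentHashMap<>();

    V getOrCreate(String key, Function<String, V> creator) {
        V cached = byKey.get(key);
        if (cached != null) {
            return cached;                  // fast path: an instance already exists
        }
        V created = creator.apply(key);
        // putIfAbsent returns null if our value was stored, or the value another thread stored first.
        V raced = byKey.putIfAbsent(key, created);
        return raced == null ? created : raced;
    }

    /** Lookup without creation, like the single-argument getInstance(key). */
    V peek(String key) {
        return byKey.get(key);
    }

    void reset() {
        byKey.clear();
    }
}
```

Under a race two threads may each construct a candidate, but putIfAbsent guarantees that both end up returning the same stored instance, so every command with the same key shares one circuit breaker. On Java 8 and later, ConcurrentHashMap.computeIfAbsent is an alternative that avoids constructing the losing candidate.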

HystrixCircuitBreakerImpl is the implementation of HystrixCircuitBreaker. It provides the default implementations of allowRequest(), isOpen(), and markSuccess().

static class HystrixCircuitBreakerImpl implements HystrixCircuitBreaker {
        private final HystrixCommandProperties properties;
        private final HystrixCommandMetrics metrics;

        /* circuitOpen holds the state of the circuit breaker; it is closed (false) by default. */
        private AtomicBoolean circuitOpen = new AtomicBoolean(false);

        /* circuitOpenedOrLastTestedTime records when the circuit was opened or last tested; it is the start time of the sleep-window timer used on the way from the open state back to closed. */
        private AtomicLong circuitOpenedOrLastTestedTime = new AtomicLong();

        protected HystrixCircuitBreakerImpl(HystrixCommandKey key, HystrixCommandGroupKey commandGroup, HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
            this.properties = properties;
            this.metrics = metrics;
        }

        /* Closes the circuit breaker and resets the statistics */
        public void markSuccess() {
            if (circuitOpen.get()) {
                if (circuitOpen.compareAndSet(true, false)) {
                    //win the thread race to reset metrics
                    //Unsubscribe from the current stream to reset the health counts stream.  This only affects the health counts view,
                    //and all other metric consumers are unaffected by the reset
                    metrics.resetStream();
                }
            }
        }

        @Override
        public boolean allowRequest() {
            //Is the breaker forced open?
            if (properties.circuitBreakerForceOpen().get()) {
                return false;
            }
            if (properties.circuitBreakerForceClosed().get()) {//Is the breaker forced closed?
                isOpen();
                // properties have asked us to ignore errors so we will ignore the results of isOpen and just allow all traffic through
                return true;
            }
            return !isOpen() || allowSingleTest();
        }

        public boolean allowSingleTest() {
            long timeCircuitOpenedOrWasLastTested = circuitOpenedOrLastTestedTime.get();
            //Read circuitOpenedOrLastTestedTime, the start time of the sleep-window timer, then check whether both of the following hold:
            // 1) The circuit breaker is open (circuitOpen.get() == true)
            // 2) The current time is later than that start time plus circuitBreakerSleepWindowInMilliseconds (default 5 seconds)
            //If both hold, a trial request may be let through (half-open): Hystrix sets circuitOpenedOrLastTestedTime to the current time with a CAS operation and returns true. Otherwise it returns false, meaning the breaker is closed, the sleep window has not yet elapsed, or another thread won the CAS.
            if (circuitOpen.get() && System.currentTimeMillis() > timeCircuitOpenedOrWasLastTested + properties.circuitBreakerSleepWindowInMilliseconds().get()) {
                // We push the 'circuitOpenedTime' ahead by 'sleepWindow' since we have allowed one request to try.
                // If it succeeds the circuit will be closed, otherwise another singleTest will be allowed at the end of the 'sleepWindow'.
                if (circuitOpenedOrLastTestedTime.compareAndSet(timeCircuitOpenedOrWasLastTested, System.currentTimeMillis())) {
                    // if this returns true that means we set the time so we'll return true to allow the singleTest
                    // if it returned false it means another thread raced us and allowed the singleTest before we did
                    return true;
                }
            }
            return false;
        }

        @Override
        public boolean isOpen() {
            if (circuitOpen.get()) {//Read the current state of the circuit breaker
                // if we're open we immediately return true and don't bother attempting to 'close' ourself as that is left to allowSingleTest and a subsequent successful test to close
                return true;
            }

            // Get the HealthCounts object from the metrics
            HealthCounts health = metrics.getHealthCounts();

            // If the total request count is below the request volume threshold (circuitBreakerRequestVolumeThreshold, default 20), the breaker stays closed: return false
            if (health.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
                
                return false;
            }

            //Otherwise (the volume threshold has been reached), check whether the error percentage is below the error threshold (circuitBreakerErrorThresholdPercentage, default 50). If so, the breaker stays closed: return false.
            if (health.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
                return false;
            } else {
                // The threshold has been reached, so Hystrix concludes that the service is unhealthy. It sets the breaker to open with a CAS operation, records the current system time as the start of the sleep window, and returns true.
                if (circuitOpen.compareAndSet(false, true)) {
                    circuitOpenedOrLastTestedTime.set(System.currentTimeMillis());
                    return true;
                } else {
                    return true;
                }
            }
        }

    }
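
The CAS in allowSingleTest is what guarantees that only one caller is granted the trial request per sleep window. The sketch below reproduces just that mechanism with the clock passed in as a parameter so it can be exercised deterministically. SingleTestGate and its method names are hypothetical, and the metrics handling of the real class is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical reduction of the open / single-test / close logic; not the Hystrix class. */
final class SingleTestGate {
    private final AtomicBoolean open = new AtomicBoolean(false);
    private final AtomicLong openedOrLastTestedMs = new AtomicLong();
    private final long sleepWindowMs;

    SingleTestGate(long sleepWindowMs) {
        this.sleepWindowMs = sleepWindowMs;
    }

    /** Trips the gate; only the caller that flips the flag records the time. */
    void open(long nowMs) {
        if (open.compareAndSet(false, true)) {
            openedOrLastTestedMs.set(nowMs);
        }
    }

    boolean allowSingleTest(long nowMs) {
        long last = openedOrLastTestedMs.get();
        if (open.get() && nowMs > last + sleepWindowMs) {
            // Only the caller whose CAS succeeds gets the trial; it also restarts the sleep window.
            return openedOrLastTestedMs.compareAndSet(last, nowMs);
        }
        return false;
    }

    /** A successful trial closes the gate again. */
    void markSuccess() {
        open.set(false);
    }

    boolean isOpen() {
        return open.get();
    }
}
```

Because a successful CAS moves the timestamp forward, a second caller arriving a moment later sees an unexpired sleep window and is rejected until the trial either succeeds (markSuccess) or another window elapses.
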
4. Summary

Each circuit breaker maintains 10 buckets by default, one per second. Each bucket records the counts of successes, failures, timeouts, and rejections. By default, the circuit breaker opens and intercepts requests when, within the 10-second window, at least 20 requests have arrived and the error percentage has reached 50%. The following figure shows how a HystrixCommand or HystrixObservableCommand interacts with the HystrixCircuitBreaker, along with its logic and decision flow, including how the counters in the circuit breaker behave.


Figure: circuit breaker flow interaction diagram
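
The bucketed statistics can be sketched as a fixed ring of counters in which each slot is recycled once its bucket has aged out of the window. This is a simplified, single-threaded illustration with hypothetical names (RollingHealth, record, totalRequests, errorCount). It collapses failure, timeout, and rejection into a single error counter, and it is not how Hystrix implements its metrics stream.

```java
import java.util.Arrays;

/** Hypothetical rolling window of per-second buckets; single-threaded for clarity. */
final class RollingHealth {
    private final int bucketCount;      // 10 in the default Hystrix configuration
    private final long bucketSizeMs;    // 1000 ms in the default Hystrix configuration
    private final long[] bucketStart;   // start time of the bucket held in each slot
    private final long[] total;
    private final long[] errors;

    RollingHealth(int bucketCount, long bucketSizeMs) {
        this.bucketCount = bucketCount;
        this.bucketSizeMs = bucketSizeMs;
        this.bucketStart = new long[bucketCount];
        this.total = new long[bucketCount];
        this.errors = new long[bucketCount];
        Arrays.fill(bucketStart, Long.MIN_VALUE);   // marks every slot as unused
    }

    /** Records one call outcome at time nowMs (assumed non-negative). */
    void record(boolean success, long nowMs) {
        long start = nowMs - (nowMs % bucketSizeMs);
        int slot = (int) ((nowMs / bucketSizeMs) % bucketCount);
        if (bucketStart[slot] != start) {           // slot holds an expired bucket: recycle it
            bucketStart[slot] = start;
            total[slot] = 0;
            errors[slot] = 0;
        }
        total[slot]++;
        if (!success) {
            errors[slot]++;
        }
    }

    long totalRequests(long nowMs) {
        return sum(total, nowMs);
    }

    long errorCount(long nowMs) {
        return sum(errors, nowMs);
    }

    /** Adds up only the buckets that still fall inside the window ending at nowMs. */
    private long sum(long[] counts, long nowMs) {
        long currentStart = nowMs - (nowMs % bucketSizeMs);
        long oldestStart = currentStart - (bucketCount - 1) * bucketSizeMs;
        long sum = 0;
        for (int i = 0; i < bucketCount; i++) {
            if (bucketStart[i] >= oldestStart && bucketStart[i] <= currentStart) {
                sum += counts[i];
            }
        }
        return sum;
    }
}
```

With 10 buckets of 1000 ms each, a failure recorded at t = 0 still counts at t = 9999 but has dropped out of the window by t = 10000, which is why a burst of errors stops influencing the breaker about ten seconds later.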

When reprinting, please credit the source and include a link: http://www.jianshu.com/p/14958039fd15
Reference material: https://github.com/Netflix/Hystrix/wiki


Added by bseven on Mon, 03 Jun 2019 02:43:21 +0300